Function noLZSS::count_factors_fasta_dna_w_rc_per_sequence
Defined in File fasta_processor.cpp
Function Documentation
FastaPerSequenceCountResult noLZSS::count_factors_fasta_dna_w_rc_per_sequence(const std::string &fasta_path, FastaDnaSanitizationMode sanitization_mode)
Counts per-sequence factors from DNA factorization with reverse complement.
Reads a FASTA file and factorizes each sequence independently with reverse complement awareness, returning per-sequence counts along with sequence metadata and the aggregate total.
Note
Memory-efficient: factors are counted but never stored.
Note
Only the nucleotides A, C, G, and T are allowed (case-insensitive).
- Parameters:
fasta_path – Path to the FASTA file containing DNA sequences
- Throws:
std::runtime_error – If the FASTA file cannot be opened or contains no valid sequences
std::invalid_argument – If invalid nucleotides are found
- Returns:
Result containing sequence IDs, per-sequence counts, and the total count
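The nucleotide rule and the reverse-complement notion described above can be illustrated with a small standalone sketch. The helper `reverse_complement` below is hypothetical and is not part of noLZSS; it only mirrors the documented behavior (A, C, G, T accepted in either case, `std::invalid_argument` on anything else). How the library actually validates input, and how `sanitization_mode` changes that, may differ.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical helper (NOT part of noLZSS): returns the reverse complement
// of a DNA string. Accepts A, C, G, T in either case and emits uppercase.
std::string reverse_complement(const std::string &seq) {
    std::string out;
    out.reserve(seq.size());
    // Walk the input backwards, complementing each base as we go.
    for (auto it = seq.rbegin(); it != seq.rend(); ++it) {
        switch (std::toupper(static_cast<unsigned char>(*it))) {
            case 'A': out.push_back('T'); break;
            case 'C': out.push_back('G'); break;
            case 'G': out.push_back('C'); break;
            case 'T': out.push_back('A'); break;
            default:
                // Assumption: reject any other character, mirroring the
                // documented std::invalid_argument for invalid nucleotides.
                throw std::invalid_argument(
                    std::string("invalid nucleotide: ") + *it);
        }
    }
    return out;
}
```

For example, `reverse_complement("aacg")` yields `"CGTT"`, while an input containing `N` throws `std::invalid_argument`.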