Function noLZSS::factorize_fasta_multiple_dna_w_rc
Defined in File fasta_processor.cpp
Function Documentation
FastaFactorizationResult noLZSS::factorize_fasta_multiple_dna_w_rc(const std::string &fasta_path)
Factorizes multiple DNA sequences from a FASTA file with reverse complement awareness.
Reads a FASTA file containing DNA sequences, parses them into individual sequences, prepares them for factorization using prepare_multiple_dna_sequences_w_rc(), and then performs noLZSS factorization with reverse complement awareness.
Note
Only A, C, T, G nucleotides are allowed (case insensitive)
Note
Sequences are converted to uppercase before factorization
Note
Reverse complement matches are supported during factorization
Note
Nucleotide validation is performed by prepare_multiple_dna_sequences_w_rc()
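The notes above can be made concrete with a short sketch of uppercase normalization, nucleotide validation, and reverse complementation. The helper names `normalize_dna` and `reverse_complement` are hypothetical and are not the internals of prepare_multiple_dna_sequences_w_rc(). The sketch only shows the standard definitions: A pairs with T, C pairs with G, and the complemented sequence is read in reverse.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Uppercases a sequence and rejects anything other than A, C, G, T.
// Hypothetical helper for illustration only.
std::string normalize_dna(const std::string& seq) {
    std::string out;
    out.reserve(seq.size());
    for (unsigned char ch : seq) {
        const char up = static_cast<char>(std::toupper(ch));
        if (up != 'A' && up != 'C' && up != 'G' && up != 'T') {
            throw std::invalid_argument(
                std::string("invalid nucleotide: ") + static_cast<char>(ch));
        }
        out.push_back(up);
    }
    return out;
}

// Returns the reverse complement of an uppercase A/C/G/T sequence.
std::string reverse_complement(const std::string& seq) {
    std::string rc(seq.rbegin(), seq.rend());  // reverse first
    for (char& c : rc) {                       // then complement each base
        switch (c) {
            case 'A': c = 'T'; break;
            case 'T': c = 'A'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            default:
                throw std::invalid_argument("expected uppercase A, C, G or T");
        }
    }
    return rc;
}
```

With these helpers, `reverse_complement(normalize_dna("aacg"))` yields `"CGTT"`. A reverse-complement-aware factorization can then match a substring against the reverse complement of earlier text as well as against the earlier text itself.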
- Parameters:
fasta_path – Path to the FASTA file containing DNA sequences
- Throws:
std::runtime_error – If the FASTA file cannot be opened or contains no valid sequences
std::invalid_argument – If the FASTA file contains too many sequences (>125) or invalid nucleotides are found
- Returns:
FastaFactorizationResult containing factors and sentinel factor indices
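One way to handle the two documented exception types is sketched below. `try_factorize` is a hypothetical wrapper, not part of noLZSS. It accepts any callable that takes a path, so the sketch compiles without the library. It assumes C++17 for `std::optional`. Because `std::invalid_argument` derives from `std::logic_error` rather than `std::runtime_error`, both types need their own handler.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Calls `factorize(fasta_path)` and converts the documented exceptions
// into an error message. Returns std::nullopt on failure.
// Hypothetical wrapper for illustration only.
template <typename Fn>
auto try_factorize(Fn&& factorize, const std::string& fasta_path,
                   std::string& error)
    -> std::optional<decltype(factorize(fasta_path))> {
    error.clear();
    try {
        return factorize(fasta_path);
    } catch (const std::invalid_argument& e) {
        // Documented causes: more than 125 sequences or invalid nucleotides.
        error = std::string("invalid input: ") + e.what();
    } catch (const std::runtime_error& e) {
        // Documented causes: file cannot be opened or has no valid sequences.
        error = std::string("runtime failure: ") + e.what();
    }
    return std::nullopt;
}
```

With the library's header included (its name is not given on this page), a call might look like `try_factorize([](const std::string& p) { return noLZSS::factorize_fasta_multiple_dna_w_rc(p); }, "genomes.fasta", error)`, where the file name is a placeholder.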