Function noLZSS::factorize_file_multiple_dna_w_rc
Defined in File factorizer.cpp
Function Documentation
std::vector<Factor> noLZSS::factorize_file_multiple_dna_w_rc(const std::string &path, size_t reserve_hint)
Factorizes DNA text from a file containing multiple sequences into noLZSS factors, with reverse complement awareness, and returns the factors as a vector.
This function reads DNA text from a file containing multiple sequences, performs noLZSS factorization considering both forward and reverse complement matches, and returns all factors in a vector.
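As a reminder of what a reverse complement match refers to, here is a minimal sketch of computing the reverse complement of a DNA strand. It is not taken from noLZSS: the helper names complement_base and reverse_complement are hypothetical, and the sketch assumes an uppercase A/C/G/T alphabet, since this page does not state how the library treats other characters.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of noLZSS): complement of a single base.
// Assumes an uppercase A/C/G/T alphabet and rejects anything else.
inline char complement_base(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  throw std::invalid_argument("expected one of A, C, G, T");
    }
}

// Hypothetical helper (not part of noLZSS): reverse complement of a strand.
inline std::string reverse_complement(const std::string &seq) {
    // Read the strand backwards...
    std::string out(seq.rbegin(), seq.rend());
    // ...then replace every base with its complement.
    std::transform(out.begin(), out.end(), out.begin(), complement_base);
    assert(out.size() == seq.size());  // the length is always preserved
    return out;
}
```

For example, reverse_complement("AACG") yields "CGTT". A factorization with reverse complement awareness can therefore treat an occurrence of "CGTT" as a match against an earlier "AACG", in addition to ordinary forward matches.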
See also
factorize_multiple_dna_w_rc() for in-memory factorization
Note
Use reserve_hint for better performance when you know the approximate factor count
Note
This is more memory-efficient than factorize_multiple_dna_w_rc() for large files
- Parameters:
path – Path to the input file containing DNA text with multiple sequences
reserve_hint – Optional hint for reserving space in the output vector (0 = no hint)
- Returns:
Vector of Factor objects containing all factors from the factorization
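
To illustrate why a good reserve_hint can help, the sketch below counts how often a std::vector reallocates while it is being filled, with and without an up-front reservation. It does not call noLZSS: DemoFactor is a stand-in type whose three 64-bit fields are an assumption about the shape of Factor, and count_reallocations is a hypothetical helper written only for this demonstration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the library's Factor type. The field names and layout are
// assumptions made for this sketch, not taken from the noLZSS sources.
struct DemoFactor {
    std::uint64_t start;
    std::uint64_t length;
    std::uint64_t ref;
};

// Hypothetical helper: appends `count` placeholder factors and returns how
// many times the vector's capacity changed, i.e. how often it reallocated.
inline std::size_t count_reallocations(std::size_t count,
                                       std::size_t reserve_hint) {
    std::vector<DemoFactor> factors;
    if (reserve_hint != 0) {
        factors.reserve(reserve_hint);  // 0 means "no hint", as documented
    }
    std::size_t reallocations = 0;
    std::size_t capacity = factors.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        factors.push_back(DemoFactor{static_cast<std::uint64_t>(i), 1, 0});
        if (factors.capacity() != capacity) {
            ++reallocations;
            capacity = factors.capacity();
        }
    }
    return reallocations;
}
```

With a hint at least as large as the final factor count, count_reallocations(1000, 1000) reports no reallocations, whereas count_reallocations(1000, 0) reports several. With the real library the estimate would be passed as the second argument, as in noLZSS::factorize_file_multiple_dna_w_rc(path, estimated_factors), where estimated_factors is whatever approximation the caller has available.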