Function noLZSS::factorize_file_multiple_dna_w_rc

Function Documentation

std::vector<Factor> noLZSS::factorize_file_multiple_dna_w_rc(const std::string &path, size_t reserve_hint)

Factorizes DNA text from a file containing multiple sequences into noLZSS factors, with reverse complement awareness, and returns the factors as a vector.

This function reads DNA text from a file containing multiple sequences, performs noLZSS factorization considering both forward and reverse complement matches, and returns all factors in a vector. Reading from a file is more memory-efficient than factorize_multiple_dna_w_rc() for large genomic files. The reserve_hint parameter can improve performance when you have an estimate of the number of factors.
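To make the term "reverse complement match" concrete, the following is a minimal sketch of how the reverse complement of a DNA string is computed. It is an illustration only, not the library's implementation: the helper names complement_base and reverse_complement are hypothetical, and the sketch assumes uppercase input restricted to A, C, G and T.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of noLZSS): Watson-Crick complement of one base.
// Assumes uppercase input limited to A, C, G, T.
inline char complement_base(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  throw std::invalid_argument("expected one of A, C, G, T");
    }
}

// Hypothetical helper (not part of noLZSS): reverse the sequence, then
// complement every base.
inline std::string reverse_complement(const std::string& dna) {
    std::string rc(dna.rbegin(), dna.rend());
    std::transform(rc.begin(), rc.end(), rc.begin(), complement_base);
    return rc;
}
```

For example, reverse_complement("AAGC") yields "GCTT". A reverse-complement-aware factorization can therefore treat an occurrence of "GCTT" as a match for an earlier "AAGC", in addition to ordinary forward matches.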

See also

factorize_multiple_dna_w_rc() for in-memory factorization

Note

Use reserve_hint for better performance when you know approximate factor count
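One possible way to obtain a reserve_hint is to derive it from the input file size. The sketch below assumes roughly one factor per bytes_per_factor input bytes; that ratio is a hypothetical tuning value, not something the library specifies, and both helper names are invented for illustration. The factorizer is passed in as a callable taking (path, reserve_hint) so the sketch does not depend on a particular header.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper (not part of noLZSS): estimate the factor count from
// the file size, assuming about one factor per `bytes_per_factor` bytes.
inline std::size_t estimate_reserve_hint(std::uintmax_t file_bytes,
                                         std::size_t bytes_per_factor) {
    if (bytes_per_factor == 0) return 0;  // 0 is documented as "no hint"
    return static_cast<std::size_t>(file_bytes / bytes_per_factor);
}

// Hypothetical wrapper: measures the file, then calls any factorizer that
// accepts (const std::string& path, size_t reserve_hint).
template <typename Factorizer>
auto factorize_with_estimated_hint(const std::string& path,
                                   Factorizer&& factorize,
                                   std::size_t bytes_per_factor) {
    // Open at the end of the file so tellg() reports its size in bytes.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);
    const auto bytes = static_cast<std::uintmax_t>(in.tellg());
    return std::forward<Factorizer>(factorize)(
        path, estimate_reserve_hint(bytes, bytes_per_factor));
}
```

A call site might pass a lambda that forwards to noLZSS::factorize_file_multiple_dna_w_rc(path, hint). An underestimate only costs extra reallocations and an overestimate only costs memory, so a rough ratio measured on representative data is usually enough.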

Note

This is more memory-efficient than factorize_multiple_dna_w_rc() for large files

Parameters:
  • path – Path to the input file containing DNA text with multiple sequences

  • reserve_hint – Optional hint for reserving space in the output vector (0 = no hint)

Returns:

Vector of Factor objects containing all factors from the factorization