Function noLZSS::prepare_multiple_dna_sequences_no_rc

Function Documentation

PreparedSequenceResult noLZSS::prepare_multiple_dna_sequences_no_rc(const std::vector<std::string> &sequences)

Prepares multiple DNA sequences for factorization without reverse complement and tracks sentinel positions.

Takes multiple DNA sequences, concatenates them with unique sentinels, and tracks sentinel positions. Unlike prepare_multiple_dna_sequences_w_rc(), this function does not append reverse complements. The output format is S = T1!T2@T3$, where T1, T2, and T3 are the input sequences and !, @, and $ stand for distinct sentinel characters.

Note

Sentinel byte values range from 1 to 251; 0 and the ASCII codes of A (65), C (67), G (71), and T (84) are never used as sentinels.

Note

The function validates that all sequences contain only valid DNA nucleotides

Note

Input sequences may be lowercase or uppercase; the output is always uppercase.
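The description and notes above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `SketchResult` and `sketch_prepare_no_rc` are hypothetical names, the field types are guesses, and the sketch assumes that sentinels are assigned in ascending byte order starting at 1 and that sentinel positions are zero-based indices into the prepared string.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for PreparedSequenceResult. Field names follow the
// Returns section; the field types are assumptions.
struct SketchResult {
    std::string prepared_string;
    std::size_t original_length;
    std::vector<std::size_t> sentinel_positions;
};

// True for the byte values the notes say are never used as sentinels
// (besides 0): A (65), C (67), G (71), T (84).
inline bool is_nucleotide_code(unsigned char c) {
    return c == 'A' || c == 'C' || c == 'G' || c == 'T';
}

// Sketch only: uppercases and validates each sequence, then appends one
// unique sentinel after it. Assumption: sentinels are handed out in
// ascending order starting at byte value 1.
SketchResult sketch_prepare_no_rc(const std::vector<std::string>& sequences) {
    SketchResult out{};
    unsigned int next_sentinel = 1;
    for (const std::string& seq : sequences) {
        for (char ch : seq) {
            const char up = static_cast<char>(
                std::toupper(static_cast<unsigned char>(ch)));
            if (!is_nucleotide_code(static_cast<unsigned char>(up))) {
                throw std::invalid_argument("invalid nucleotide in input");
            }
            out.prepared_string.push_back(up);
        }
        // Skip byte values reserved for nucleotides.
        while (is_nucleotide_code(static_cast<unsigned char>(next_sentinel))) {
            ++next_sentinel;
        }
        // 251 is the upper bound on sentinel values stated in the notes.
        if (next_sentinel > 251) {
            throw std::invalid_argument("too many sequences");
        }
        out.sentinel_positions.push_back(out.prepared_string.size());
        out.prepared_string.push_back(static_cast<char>(next_sentinel));
        ++next_sentinel;
    }
    out.original_length = out.prepared_string.size();
    return out;
}
```

Under these assumptions, the inputs "acg" and "TT" would produce the seven-byte string ACG, byte 1, TT, byte 2, with sentinel positions 3 and 6.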

Parameters:
  • sequences – Vector of DNA sequence strings (should contain only A, C, T, G)

Throws:
  • std::invalid_argument – If too many sequences (>250) are given or invalid nucleotides are found

  • std::runtime_error – If sequences contain invalid characters

Returns:

PreparedSequenceResult containing:

  • prepared_string: The formatted string with sequences and sentinels

  • original_length: Total length of the concatenated string (same as prepared_string.length())

  • sentinel_positions: Positions of all sentinels in the prepared string
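As an illustration of how the returned fields might be used together, the following sketch recovers the individual sequences from a prepared string and its sentinel positions. `split_at_sentinels` is a hypothetical helper, not part of noLZSS, and it assumes the positions are zero-based indices listed in ascending order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: cuts the prepared string at each sentinel position and
// returns the pieces between them, with the sentinels themselves dropped.
// Assumption: positions are zero-based and sorted in ascending order.
std::vector<std::string> split_at_sentinels(
        const std::string& prepared,
        const std::vector<std::size_t>& sentinel_positions) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (std::size_t pos : sentinel_positions) {
        parts.push_back(prepared.substr(start, pos - start));
        start = pos + 1;  // resume just past the sentinel
    }
    return parts;
}
```

Since every sequence is followed by its own sentinel in the documented format, nothing remains after the last position and the helper returns one piece per sentinel.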