maskSequences
Mask codons split by insertions in V gene
maskSequences(
  data,
  sequence_id = "sequence_id",
  sequence = "sequence",
  sequence_alignment = "sequence_alignment",
  v_sequence_start = "v_sequence_start",
  v_sequence_end = "v_sequence_end",
  v_germline_start = "v_germline_start",
  v_germline_end = "v_germline_end",
  junction_length = "junction_length",
  keep_alignment = FALSE,
  keep_insertions = FALSE,
  mask_codons = TRUE,
  mask_cdr3 = TRUE,
  nproc = 1
)
A tibble with the masked sequence in the sequence_masked column, along with the additional columns described below.
data                BCR data table
sequence_id         sequence id column
sequence            input sequence column (query)
sequence_alignment  aligned (IMGT-gapped) sequence column (subject)
v_sequence_start    V gene start position in sequence
v_sequence_end      V gene end position in sequence
v_germline_start    V gene start position in sequence_alignment
v_germline_end      V gene end position in sequence_alignment
junction_length     name of junction_length column
keep_alignment      store alignment of query and subject sequences?
keep_insertions     return removed insertion sequences?
mask_codons         mask split codons?
mask_cdr3           mask CDR3 sequences?
nproc               number of cores to use
Performs a global alignment of sequence and sequence_alignment, masking codons in sequence_alignment that are split by insertions (see examples).
masking_note notes the codon positions in the subject_alignment sequence that were masked, if any were found.
subject_alignment contains the subject sequence aligned to the query sequence (only if keep_alignment=TRUE).
query_alignment contains the query sequence aligned to the subject sequence (only if keep_alignment=TRUE).
sequence_masked will be NA if a frameshift or alignment error is detected. This will be noted in masking_note.
The insertions column is returned only if keep_insertions=TRUE. It contains a comma-separated list of <position in query alignment>-<sequence> entries. See example.
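To make the idea of a codon "split by an insertion" concrete, here is a minimal base R sketch. It is not the package's implementation: the helper name mask_split_codons is hypothetical, it assumes the subject has already been aligned to the query with "-" marking columns where the query carries an insertion, and it ignores IMGT "." gaps and the V gene coordinate arguments entirely.

```r
# Hypothetical helper, for illustration only (not part of the package).
# subject_aln: subject sequence aligned to the query, where "-" marks
# alignment columns in which the query has inserted bases.
mask_split_codons <- function(subject_aln) {
  chars <- strsplit(subject_aln, "")[[1]]
  is_gap <- chars == "-"

  # Number of subject bases seen up to and including each column
  base_idx <- cumsum(!is_gap)

  # Number of subject bases that precede each insertion
  k <- unique(base_idx[is_gap])

  # An insertion after k bases falls inside a codon when k is not a
  # multiple of 3; that codon's index is ceiling(k / 3)
  split_codons <- unique(ceiling(k[k %% 3 != 0] / 3))

  # Drop the gap columns, then overwrite each split codon with N
  bases <- chars[!is_gap]
  for (codon in split_codons) {
    idx <- (3 * codon - 2):(3 * codon)
    idx <- idx[idx <= length(bases)]
    bases[idx] <- "N"
  }
  paste(bases, collapse = "")
}

# Insertion after base 4 splits the second codon (bases 4-6)
mask_split_codons("ATGC--CGAAA")

# Insertion on a codon boundary splits nothing
mask_split_codons("ATG---CCG")
```

In this toy version an insertion that lands exactly on a codon boundary leaves the subject unchanged, while one that lands inside a codon replaces that whole codon with N.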
maskCodons, Biostrings::pairwiseAlignment.
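When keep_insertions=TRUE, the insertions column holds a comma-separated list of <position in query alignment>-<sequence> entries. The sketch below shows one way such a string could be split into a small table using base R. The helper name parse_insertions and the example string are invented for illustration, and details such as whitespace around commas are assumptions rather than documented behaviour.

```r
# Hypothetical helper, for illustration only (not part of the package).
# x: a single insertions string such as "45-ACG,102-TT" (made-up values).
parse_insertions <- function(x) {
  # No insertions recorded: return an empty table
  if (is.na(x) || !nzchar(x)) {
    return(data.frame(position = integer(0), sequence = character(0),
                      stringsAsFactors = FALSE))
  }

  # One entry per insertion; trim whitespace in case entries are padded
  entries <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])

  data.frame(
    # Text before the first "-" is the position in the query alignment
    position = as.integer(sub("-.*$", "", entries)),
    # Text after the first "-" is the inserted sequence
    sequence = sub("^[^-]*-", "", entries),
    stringsAsFactors = FALSE
  )
}

ins <- parse_insertions("45-ACG,102-TT")
print(ins)
```

Splitting on the first "-" only keeps the sequence part intact even if it were to contain further "-" characters.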