
alakazam (version 1.2.1)

maskPositionsByQuality: Mask sequence positions with low quality

Description

maskPositionsByQuality replaces each position whose sequencing quality score is lower than min_quality with an "N" character.

Usage

maskPositionsByQuality(
  data,
  min_quality = 70,
  sequence = "sequence_alignment",
  quality_num = "quality_alignment_num"
)

Value

The input data.frame, modified to include an additional field containing the quality-masked sequences. The name of this field is created by concatenating the name of the sequence column and "_masked".
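
As a small illustration of the naming rule described above, the following base R sketch derives the expected output field name, assuming the default value of the sequence argument. The variable names are hypothetical and are not part of alakazam.

```r
# Illustrative only: derive the name of the masked field from the
# name of the sequence column (here, the documented default).
sequence <- "sequence_alignment"
masked_field <- paste0(sequence, "_masked")
masked_field
```

With the default arguments, the masked sequences would therefore be expected in a column named after this pattern.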

Arguments

data

data.frame containing sequence data.

min_quality

minimum quality score. Positions with sequencing quality less than min_quality will be masked.

sequence

column in data with sequence data to be masked.

quality_num

column in data with quality scores (a string of numeric values, comma separated) that can be used to mask sequence.
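
To make the roles of min_quality and quality_num concrete, here is a minimal base R sketch of the masking rule applied to a single sequence. It is not alakazam's implementation: mask_by_quality is a hypothetical helper, and leaving positions with missing scores unmasked is an assumption made for this example.

```r
# Hypothetical helper, not part of alakazam: mask one sequence in base R.
mask_by_quality <- function(sequence, quality_num, min_quality = 70) {
  # Split the sequence into single characters
  bases <- strsplit(sequence, "")[[1]]
  # Quality scores arrive as one comma-separated string, e.g. "90,40,85"
  scores <- as.numeric(strsplit(quality_num, ",")[[1]])
  stopifnot(length(bases) == length(scores))
  # Mask positions whose score is strictly below the threshold.
  # Assumption: positions with a missing score are left unchanged.
  low <- !is.na(scores) & scores < min_quality
  bases[low] <- "N"
  paste(bases, collapse = "")
}

masked <- mask_by_quality("ACGTAC", "90,40,85,70,12,99", min_quality = 70)
masked
# "ANGTNC"
```

The position with a score of exactly 70 is kept, since only scores lower than min_quality are masked.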

See Also

readFastqDb and getPositionQuality

Examples

db <- airr::read_rearrangement(system.file("extdata", "example_quality.tsv", package="alakazam"))
fastq_file <- system.file("extdata", "example_quality.fastq", package="alakazam")
db <- readFastqDb(db, fastq_file, quality_offset=-33)
maskPositionsByQuality(db, min_quality=90, quality_num="quality_alignment_num")
