dada2 (version 1.0.3)

isPhiX: Determine if input sequence(s) match the phiX genome.

Description

This function compares the word-profile of the input sequences to the phiX genome, and the reverse complement of the phiX genome. If enough exactly matching words are found, the sequence is flagged.

Usage

isPhiX(seqs, wordSize = 16, minMatches = 2, nonOverlapping = TRUE)

Arguments

seqs
(Required). A character vector of A/C/G/T sequences.
wordSize
(Optional). Default 16. The size of the words to use for comparison.
minMatches
(Optional). Default 2. The minimum number of words in the input sequences that must match the phiX genome (or its reverse complement) for the sequence to be flagged.
nonOverlapping
(Optional). Default TRUE. If TRUE, only non-overlapping matching words are counted.
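
To make the roles of wordSize, minMatches and nonOverlapping concrete, here is a minimal base-R sketch of exact word matching against a reference and its reverse complement. It is an illustration only and not the dada2 implementation: the helper names (rev_comp, count_word_matches, flag_matches), the toy reference string, the greedy left-to-right rule for non-overlapping words, and the choice to take the better of the two strands are all assumptions made for this example.

```r
# Illustrative sketch only; NOT how dada2 implements isPhiX.
# All helper names below are hypothetical.

# Reverse complement of an A/C/G/T string.
rev_comp <- function(s) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", s), "")[[1]]), collapse = "")
}

# Count words of length word_size in `query` that occur exactly in `ref`.
count_word_matches <- function(query, ref, word_size = 16, non_overlapping = TRUE) {
  n <- nchar(query)
  if (n < word_size) return(0L)
  starts <- seq_len(n - word_size + 1L)
  words <- substring(query, starts, starts + word_size - 1L)
  hit <- vapply(words, function(w) grepl(w, ref, fixed = TRUE),
                logical(1), USE.NAMES = FALSE)
  if (!non_overlapping) return(sum(hit))
  # Assumed rule: scan left to right and, after counting a hit,
  # jump past that word so counted words never overlap.
  count <- 0L
  i <- 1L
  while (i <= length(hit)) {
    if (hit[i]) {
      count <- count + 1L
      i <- i + word_size
    } else {
      i <- i + 1L
    }
  }
  count
}

# Flag each sequence whose best strand reaches min_matches matching words.
flag_matches <- function(seqs, ref, word_size = 16, min_matches = 2,
                         non_overlapping = TRUE) {
  refs <- c(ref, rev_comp(ref))
  vapply(seqs, function(s) {
    counts <- vapply(refs, function(r) {
      count_word_matches(s, r, word_size, non_overlapping)
    }, integer(1))
    max(counts) >= min_matches
  }, logical(1), USE.NAMES = FALSE)
}

# Toy reference (made up for this example, not the phiX genome).
ref <- "ACGTTGCAAGGCTTAACCGGTA"
q1 <- "TTGCAAGGCTTA"   # exact substring of ref
q2 <- rev_comp(q1)     # matches only the reverse complement of ref
q3 <- "GGGGGGGGGGGG"   # shares no 4-mer with either strand

count_word_matches(q1, ref, word_size = 4, non_overlapping = FALSE)  # 9
count_word_matches(q1, ref, word_size = 4, non_overlapping = TRUE)   # 3
flag_matches(c(q1, q2, q3), ref, word_size = 4, min_matches = 2)
```

In this sketch, counting only non-overlapping words means one short matching stretch contributes far fewer hits (3 instead of 9 for q1), so a given minMatches threshold is harder to reach by chance.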

Value

logical(1). TRUE if the sequence was found to match the phiX genome.

See Also

fastqFilter, fastqPairedFilter

Examples

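library(dada2)
# Dereplicate the example FASTQ file shipped with dada2 and extract its unique sequences.
derep1 <- derepFastq(system.file("extdata", "sam1F.fastq.gz", package="dada2"))
sqs1 <- getSequences(derep1)
# Default settings: 16-letter words, at least 2 matching words required.
isPhiX(sqs1)
# Longer words, but a single matching word is enough to flag a sequence.
isPhiX(sqs1, wordSize=20, minMatches=1)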