R453Plus1Toolbox (version 1.22.0)

removeLinker: Remove linker sequences located at the start of short reads

Description

If linkers are attached during sample preparation, it may be useful to remove the linkers' sequences after sequencing. This method finds and removes linker sequences that are located at the start of the given reads.

Usage

removeLinker(reads, linker, removeReadsWithoutLinker, minOverlap, penalty)

Arguments

reads
A DNAStringSet instance that contains reads possibly having linkers at their start site
linker
A DNAString instance with the linker's sequence
removeReadsWithoutLinker
Whether reads without linkers should be removed. Default is FALSE
minOverlap
The minimal score that must be achieved when aligning the linker. Default is length(linker)/2
penalty
The penalty for substitutions or indels. Default is 2

Value

A DNAStringSet containing the trimmed reads.

Details

The best alignment of the linker within the start (length of linker + 5) of each given sequence is computed. The following scoring scheme is used: each matching base scores +1, and each substitution or indel incurs the given penalty (default: penalty=2). Gaps at the end of the linker (overlap) are not penalized. An alignment is considered a match if its score is greater than or equal to minOverlap (default: minOverlap=round(length(linker)/2)). For a successful match, the subsequence from position 1 to the end of the linker's alignment is removed.
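The scoring rule above can be illustrated with a small base R sketch. This is not the package's implementation: it is a simplified, ungapped version that only slides the linker along the start of the read, so it handles matches, substitutions and free overhangs but not indels. The function name trim_linker_sketch, its argument names, and the linker string are assumptions made for illustration.

```r
# Simplified sketch (assumption: ungapped placements only, no indels).
# Each match scores +1, each mismatch scores -penalty, and linker bases
# that overhang the read are ignored (free overlap).
trim_linker_sketch <- function(read, linker,
                               min_overlap = round(nchar(linker) / 2),
                               penalty = 2) {
  l <- strsplit(linker, "")[[1]]
  r <- strsplit(read, "")[[1]]
  best_score <- -Inf
  best_end <- 0
  # offset = read position of the first linker base; values below 1 mean
  # the linker is truncated at its start, values up to 6 keep the linker
  # within the first length(linker) + 5 positions of the read.
  for (offset in seq(2 - length(l), 6)) {
    pos <- offset + seq_along(l) - 1        # read position of each linker base
    keep <- pos >= 1 & pos <= length(r)     # linker bases that overlap the read
    if (!any(keep)) next
    matches <- sum(l[keep] == r[pos[keep]])
    score <- matches - penalty * (sum(keep) - matches)
    if (score > best_score) {
      best_score <- score
      best_end <- max(pos[keep])            # last read position covered
    }
  }
  # Trim from position 1 to the end of the linker only if the score suffices.
  if (best_score >= min_overlap) substring(read, best_end + 1) else read
}

# Hypothetical linker string, chosen to fit the example reads of this page.
linker <- "CTCGAGAATTCTGGATCCTC"
reads <- c("CTCGAGAATTCTGGATCCTCAAA",
           "GAATTCTGGATCCTCAAA",
           "CTCGAGAAAAAAAAATCCTCAAA")
trimmed <- vapply(reads, trim_linker_sketch, character(1),
                  linker = linker, USE.NAMES = FALSE)
print(trimmed)
# [1] "AAA"  "AAA"  "CTCGAGAAAAAAAAATCCTCAAA"
```

In this sketch the first read scores 20 (full linker), the second scores 15 (linker missing its first five bases), and the third scores only 2 at its best placement, which is below the default threshold of 10, so it is left untrimmed. The real method may behave differently on reads that contain insertions or deletions.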

See Also

sequenceCaptureLinkers, DNAStringSet, pairwiseAlignment

Examples

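    library(R453Plus1Toolbox)  # provides removeLinker and sequenceCaptureLinkers

    # Use the first of the sequence capture linkers shipped with the package
    linker = sequenceCaptureLinkers()[[1]]
    reads = DNAStringSet(c(
        "CTCGAGAATTCTGGATCCTCAAA",    # first read
             "GAATTCTGGATCCTCAAA",    # same read without its first 5 bases
        "CTCGAGAAAAAAAAATCCTCAAA"))   # same read with positions 9-14 replaced by A
    # Trim linker sequences found at the start of the reads
    removeLinker(reads, linker)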