demultiplexReads: Performs MID/Multiplex filtering

Description

Roche's Genome Sequencer allows two or more samples to be loaded on one region. To assign sequences to samples, each sample is tagged with a unique multiplex sequence (MID), which should be the prefix of every sequence from that sample. This method demultiplexes a given set of sequences according to the given MIDs.

Usage

demultiplexReads(reads, mids, numMismatches = 2, trim = TRUE)

Arguments

reads

A DNAStringSet instance that contains reads starting with MIDs

mids

A DNAStringSet instance that contains the MIDs

numMismatches

The maximal number of mismatches allowed, default 2.

trim

Whether the MIDs should be cut off the reads, default TRUE.

Value

Returns a list with one DNAStringSet instance for each MID.

Details

All given MIDs must have the same length. For each read, the algorithm computes the number of mismatches between the read's prefix and each MID. The read is assigned to the MID with the lowest number of mismatches. If two or more MIDs tie for the lowest number of mismatches, or if that number is greater than the argument numMismatches, the read is not assigned to any MID. The default number of allowed mismatches is 2.
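
The assignment rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: assignToMid() is a hypothetical helper, it works on plain character strings rather than DNAStringSet objects, and the two MID sequences are made-up toy values.

```r
# Illustrative sketch only; assignToMid() is a hypothetical helper and
# not part of any package. Reads and MIDs are plain character strings.
assignToMid <- function(read, mids, numMismatches = 2) {
  midLength <- unique(nchar(mids))
  stopifnot(length(midLength) == 1)      # all MIDs must have the same length
  if (nchar(read) < midLength) return(NA_character_)

  prefix <- strsplit(substr(read, 1, midLength), "")[[1]]

  # Count mismatching positions between the read prefix and each MID
  mismatches <- vapply(mids, function(mid) {
    sum(prefix != strsplit(mid, "")[[1]])
  }, integer(1))

  best <- min(mismatches)
  # Leave the read unassigned on a tie or when the threshold is exceeded
  if (best > numMismatches || sum(mismatches == best) > 1) {
    return(NA_character_)
  }
  names(mids)[which.min(mismatches)]
}

# Toy MIDs (made-up sequences, not Roche's)
mids <- c(A = "ACGTACGT", B = "TGCATGCA")

assignToMid("ACGTACGTGG", mids)   # exact match with MID "A"
assignToMid("ACGTACGAGG", mids)   # one mismatch against MID "A"
assignToMid("GGGGGGGGGG", mids)   # too many mismatches: unassigned (NA)
```

Returning NA for unassigned reads is a choice made for this sketch; how demultiplexReads itself reports unassigned reads is not stated here.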

Examples

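    library(Biostrings)
    # genomeSequencerMIDs() and demultiplexReads() come from R453Plus1Toolbox
    library(R453Plus1Toolbox)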
    mids = genomeSequencerMIDs(c("MID1", "MID2", "MID3"))
    reads = DNAStringSet(c(
        paste(as.character(mids[["MID1"]]), "A", sep=""),
        paste(as.character(mids[["MID1"]]), "AA", sep=""),
        paste(as.character(mids[["MID2"]]), "AAA", sep="")))
    demultiplexReads(reads, mids)