qMeth(proj, query=NULL, reportLevel=c("C","alignment"), mode=c("CpGcomb","CpG","allC","var"), collapseBySample=TRUE, collapseByQueryRegion=FALSE, asGRanges=TRUE, mask=NULL, reference="genome", keepZero=TRUE, mapqMin=0L, mapqMax=255L, clObj=NULL)
qProject
object from a bisulfite sequencing experimentGRanges
object with the regions to be
quantified. If NULL
, all available target sequences (e.g. the
whole genome) will be analyzed. Available target sequences are
extracted from the header of the first bam file.reportLevel
=CCpGcomb
: only C's in CpG context (strands combined)
CpG
: only C's in CpG context (strands separate)
allC
: all C's (strands separate)
var
: variant detection (all C's, strands separate)
CpGcomb
is the default.
TRUE
, combine (sum) counts from
bamfiles with the same sample name.TRUE
, combine (sum) counts for
all cytosines contained in the same query region.TRUE
, return results as a GRanges
object;
if FALSE
, the results are returned as a data.frame
.GRanges
object with genomic regions to
be masked, i.e. excluded from the analysis (e.g. unmappable regions).auxiliaryFile
argument of qAlign
.FALSE
, only cytosines covered by at least
one alignment will be returned; keepZero
must be TRUE
if multiple samples have the same sample name and
collapseBySample
is TRUE
.mapqMin
). Valid values are between 0 and 255. The default (0)
will include all alignments.mapqMax
).
Valid values are between 0 and 255. The default (255) will include
all alignments.reportLevel
=C, a GRanges
object if
asGRanges
=TRUE
, otherwise a data.frame
.Each row contains the coordinates of individual cytosines for
collapseByQueryRegion
=FALSE
or query regions
for collapseByQueryRegion
=TRUE
.In addition to the coordinates columns (or seqnames
,
ranges
and strand
slots for GRanges
objects),
each row contains per bam file:Two values (total and methylated events, with suffixes _T and _M), or
if the qProject
object was created including a SNP table,
six values (total and methylated events for Reference, Unknown and
Alternative genotypes, with suffixed _TR, _TU, _TA, _MR, _MU and _MA).
In the latter case, C's or CpG's that overlap with SNPs in the table
are removed.If collapseBySample
=TRUE
, groups of bam files with
identical sample name are combined (summed) and will be represented by
a single set of total and methylated count columns.If mode
=var, the _T and _M columns correspond to total
and matching alignments overlapping the guanine paired to the
cytosine.For reportLevel
=alignment, a list
with one
element per bam file or sample (depending on
collapseBySample
). Each list element is another list with the
elements:
aid
: character vector with unique alignment identifiers
Cid
: integer vector with genomic coordinate of C base
strand
: character vector with the strand of the C base
meth
: integer vector with methylation state for
alignment and C defined by aid
and Cid
. The values are
1 for methylated or 0 for unmethylated states.
qMeth
can be used on a qProject
object from a bisulfite
sequencing experiment (sequencing of bisulfite-converted DNA), such as
the one returned by qAlign
when its parameter bisulfite
is
set to a different value than no. qMeth
quantifies DNA methylation by counting total and
methylated events for individual cytosines, using the alignments that
have been generated in converted (three-letter) sequence space for
example by qAlign
. A methylated event corresponds to a C/C
match in the alignment, an unmethylated event to a T/C mismatch (or G/G
matches and A/G mismatches on the opposite strand). For paired-end
samples, the part of the left fragment alignment that overlaps
with the right fragment alignment is ignored, preventing the
use of redundant information coming from the same molecule.
Both directed (bisulfite
=dir) and undirected
(bisulfite
=undir) experimental protocols are supported
by qAlign
and qMeth
.
By default, results are returned per C nucleotide. If
reportLevel
=alignment, results are reported separately
for individual alignments. In that case, query
has to be a
GRanges
object with exactly one region, mode
has to be
either CpG or allC, the arguments
collapseByQueryRegion
, asGRanges
, mask
and
keepZero
have no effect and allele-specific projects are
treated in the same way as normal (non-allele specific) projects.
Using the parameter mode
, quantification can be limited to
cytosines in CpG context, and counts obtained for the two cytosines on
opposite strands within a single CpG can be combined (summed).
The quantification of methylation for all cytosines in the query region(s)
(mode
=allC) should be done with care, especially for
large query regions, as the return value may require a large amount of
memory.
If mode
is set to var, qMeth
only counts reads
from the strand opposite of the cytosine and reports total and
matching alignments. For a position identical to the reference
sequence, only matches (and very few sequencing errors) are
expected, independent on the methylation state of the cytosine. A
reduced fraction of alignments matching the reference are indicative
of sequence variations in the sequenced sample.
mapqMin
and mapqMax
allow to select alignments
based on their mapping qualities. mapqMin
and mapqMax
can
take integer values between 0 and 255 and equal to
$-10
log10 Pr(mapping position is wrong)$, rounded to the nearest
integer. A value 255 indicates that the mapping quality is not available.
If an object that inherits from class cluster
is provided to
the clObj
argument, for example an object returned by
makeCluster
from package parallel,
the quantification task is split into multiple chunks and processed in
parallel using clusterApplyLB
from package
parallel. Not all tasks will be efficiently parallelized: For
example, a single query region and a single (group of) bam files will
not be split into multiple chunks.
qAlign
,
makeCluster
from package parallel
# copy example data to current working directory
file.copy(system.file(package="QuasR", "extdata"), ".", recursive=TRUE)
# create alignments
sampleFile <- "extdata/samples_bis_single.txt"
genomeFile <- "extdata/hg19sub.fa"
proj <- qAlign(sampleFile, genomeFile, bisulfite="dir")
proj
# calculate methylation states
meth <- qMeth(proj, mode="CpGcomb")
meth
Run the code above in your browser using DataLab