enrichedRegions(sample1, sample2, regions, minReads=10, mappedreads,
pvalFilter=0.05, exact=FALSE, p.adjust.method='none', twoTailed=FALSE,
mc.cores=1)
sample1: Either the start and end of the sequences in sample 1 (an IRangesList, RangedData or IRanges object), or a RangedDataList with the sequences for all samples (sample2 must be left missing in this case).
regions: Genomic regions to be tested for enrichment. If not specified, the regions are defined automatically using the argument minReads.
minReads: Used only when regions is not specified. The regions to be tested for enrichment are those with coverage greater than or equal to minReads. If sample1 is a RangedDataList, the overall coverage adding all samples is used. Otherwise, if twoTailed is FALSE, only the reads in sample 1 are counted. If twoTailed is TRUE, the sum of the reads in samples 1 and 2 is counted.
pvalFilter: Only regions with a P-value below pvalFilter are reported as being enriched.
exact: If TRUE, an exact test is used (a permutation-based chi-square test if sample1 is a RangedDataList object, Fisher's exact test otherwise) for regions where the expected counts are below 5 for some group, i.e. when the asymptotic chi-square/likelihood-ratio test calculations break down. Ignored if sample2 is missing, as in this case calculations are always exact.
p.adjust.method: P-value adjustment method, passed on to p.adjust.
twoTailed: If FALSE, only regions with a higher concentration of reads in sample 1 than in sample 2 are reported. If TRUE, regions with a higher concentration of reads in sample 2 are also reported. Ignored if sample2 is missing.
mc.cores: If mc.cores is greater than 1, computations are performed in parallel for each element in the IRangesList objects. Whenever possible the mclapply function is used, so exactly mc.cores cores are used. For some signatures mclapply cannot be used, in which case the parallel function from package parallel is used. Note: the latter option launches as many parallel processes as there are elements in x, which can place strong demands on the processor and memory.
The function returns a RangedData object indicating the significantly enriched regions, the number of reads in each sample for those regions, the fold changes (adjusted for the overall number of sequences in each sample) and the chi-square test P-values.
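As a rough illustration of how a P-value threshold such as pvalFilter interacts with the adjustment method handed to p.adjust, the base R sketch below filters a handful of made-up P-values. The region names and P-values are hypothetical, and the filtering step is an assumption about the general approach, not the package's internal code.

```r
# Hypothetical raw P-values for five candidate regions (made-up numbers)
pvals <- c(regionA = 0.001, regionB = 0.015, regionC = 0.04,
           regionD = 0.30, regionE = 0.80)

pvalFilter <- 0.05

# method = "none" leaves the P-values unchanged
adj_none <- p.adjust(pvals, method = "none")
# Benjamini-Hochberg adjustment, shown for comparison
adj_bh <- p.adjust(pvals, method = "BH")

# Regions whose (adjusted) P-value falls below the threshold
enriched_none <- names(pvals)[adj_none < pvalFilter]
enriched_bh <- names(pvals)[adj_bh < pvalFilter]

print(enriched_none)
print(enriched_bh)
```

With no adjustment three of the five regions pass the filter, while the Benjamini-Hochberg adjustment keeps only two, which shows why the choice of adjustment method matters when many regions are tested.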
signature(sample1 = "missing", sample2 = "missing", regions = "RangedData")
ranges(regions) indicates the chromosome, start and end of the genomic regions, while values(regions) should indicate the observed number of reads for each group in each region. enrichedRegions tests the null hypothesis that the proportion of reads in the region is equal across all groups via a likelihood-ratio test (or a permutation-based chi-square test for regions where the expected counts are below 5 for some group).
signature(sample1 = "RangedDataList", sample2 = "missing", regions = "missing")
Each element in sample1 contains the read start/end of an individual sample. enrichedRegions identifies regions with a high concentration of reads (across all samples) and then compares the counts across groups using a likelihood-ratio test (or a permutation-based chi-square test for regions where the expected counts are below 5 for some group).
signature(sample1 = "RangedData", sample2 = "RangedData", regions = "missing")
space(sample1) indicates the chromosome, and start(sample1) and end(sample1) the start/end position of the reads. Similarly for sample2. enrichedRegions identifies regions with a high concentration of reads (across all samples) and then compares the counts across groups using a likelihood-ratio test (or a permutation-based chi-square test for regions where the expected counts are below 5 for some group).
signature(sample1 = "RangedData", sample2 = "missing", regions = "missing")
space(sample1) indicates the chromosome, and start(sample1) and end(sample1) the start/end position of the reads. enrichedRegions tests whether an unusually high proportion of reads has been observed in the region using an exact binomial test.
The calculations depend on whether sample2 is missing or not.
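The likelihood-ratio comparison across groups mentioned above can be sketched in base R for a single region. This is a minimal sketch with made-up counts: the 2 x G layout (reads inside versus outside the region), the group totals, and the use of chisq.test with simulate.p.value = TRUE as a stand-in for the permutation-based chi-square test are all assumptions, not the package's actual implementation.

```r
# Hypothetical read counts for one region in three groups, and each
# group's total number of reads (made-up numbers)
in_region <- c(g1 = 60, g2 = 25, g3 = 30)
total <- c(g1 = 10000, g2 = 9000, g3 = 11000)

# 2 x G table: reads inside vs outside the region for each group
tab <- rbind(inside = in_region, outside = total - in_region)

# Expected counts under the null of equal proportions across groups
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

if (all(expected >= 5)) {
  # Likelihood-ratio (G) statistic; cells with zero counts contribute 0
  terms <- ifelse(tab > 0, tab * log(tab / expected), 0)
  G <- 2 * sum(terms)
  df <- (nrow(tab) - 1) * (ncol(tab) - 1)
  pval <- pchisq(G, df = df, lower.tail = FALSE)
} else {
  # Small expected counts: simulated chi-square test (assumed stand-in)
  pval <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)$p.value
}

print(pval)
```

Here all expected counts are well above 5, so the asymptotic branch is used; shrinking in_region to a few reads per group would trigger the simulated test instead.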
Non-missing sample2 case. First, regions with coverage of at least minReads are selected. Second, the number of reads falling in the selected regions is computed for sample 1 and sample 2. Third, the counts are compared via a chi-square test (with Yates continuity correction), which takes into account the total number of sequences in each sample. Finally, statistically significant regions are selected and returned in RangedData or RangedDataList objects.
Missing sample2 case. First, regions with coverage of at least minReads are selected. Second, the number of reads in sample 1 falling in the selected regions is computed. Third, the proportion of reads in each region is tested for enrichment via a one-tailed exact binomial test.
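The two test steps described above can be tried on a single candidate region with base R alone. The counts below are made up, and the null proportion p0 used for the binomial test is a hypothetical placeholder: the text does not say how the package derives it, so treat this as a sketch of the statistical idea only.

```r
# Made-up counts for one candidate region
reads1 <- 40; total1 <- 1000   # sample 1: reads in region, total reads
reads2 <- 10; total2 <- 1000   # sample 2: reads in region, total reads

# Non-missing sample2: 2 x 2 table of reads inside/outside the region,
# so the total number of sequences in each sample is taken into account
tab <- matrix(c(reads1, total1 - reads1,
                reads2, total2 - reads2),
              nrow = 2,
              dimnames = list(c("inside", "outside"),
                              c("sample1", "sample2")))
# correct = TRUE applies the Yates continuity correction
p_two <- chisq.test(tab, correct = TRUE)$p.value

# Fold change adjusted for the total number of reads in each sample
fold <- (reads1 / total1) / (reads2 / total2)

# Missing sample2: one-tailed exact binomial test of the proportion of
# sample 1 reads in the region against an assumed null proportion
p0 <- 0.02   # hypothetical value, not taken from the package
p_one <- binom.test(reads1, total1, p = p0,
                    alternative = "greater")$p.value

print(c(two_sample = p_two, one_sample = p_one, fold_change = fold))
```

Both P-values are small for these counts, so the region would pass the default threshold of 0.05 in either setting.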
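library(htSeqTools)  # provides enrichedRegions
set.seed(1)
# Sample 1: 1000 reads of width 39 on chr1, start positions centred at 500
st <- round(rnorm(1000,500,100))
strand <- rep(c('+','-'),each=500)
space <- rep('chr1',length(st))
sample1 <- RangedData(IRanges(st,st+38),strand=strand,space=space)
# Sample 2: same layout, start positions centred at 1000
st <- round(rnorm(1000,1000,100))
sample2 <- RangedData(IRanges(st,st+38),strand=strand,space=space)
# twoTailed=TRUE: reads from both samples are counted when defining regions
enrichedRegions(sample1,sample2,twoTailed=TRUE)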