MSER generally decreases with increasing sequencing depth. This
function interpolates the dependency of MSER on tag counts as a
log-log linear function. The log-log fit is used to estimate the depth
of sequencing required to reach the desired target.fold.enrichment.
get.mser.interpolation(signal.data,
control.data,
target.fold.enrichment = 5,
n.chains = 10,
n.steps = 6,
step.size = 1e+05,
chains = NULL,
test.agreement = 0.99,
return.chains = F,
enrichment.background.scales = c(1),
excluded.steps = c(seq(2, n.steps - 2)), ...)
signal.data: signal chromosome tag vector list
control.data: control chromosome tag vector list
target.fold.enrichment: target MSER for which the depth should be estimated
n.steps: number of steps in each subset chain
step.size: either a number of tags or a fraction of the dataset size; see the step.size parameter for get.mser
test.agreement: fraction of the detected peaks that should agree between the full and subsampled datasets; see the test.agreement parameter for get.mser
n.chains: number of random subset chains
chains: optional structure of pre-calculated chains (e.g. generated by an earlier call with return.chains=T)
return.chains: whether to return peak predictions calculated on random chains. These can be passed back using the chains argument to skip the subsampling/prediction steps and just recalculate the depth estimate for a different MSER.
enrichment.background.scales: see the enrichment.background.scales parameter for get.mser
excluded.steps: intermediate subsampling steps that should be excluded from the chains to speed up the calculation. By default, all intermediate steps except for the first two and last two are skipped. Adding intermediate steps improves the interpolation at the expense of computational time.
...: additional parameters are passed to get.mser
Normally returns a list, specifying for each background scale:
estimated sequencing depth required to reach the specified target MSER
linear fit model, a result of an lm() call
If return.chains=T, the above structure is returned under the interpolation field, along with a chains field containing the results of find.binding.positions calls on subsampled chains.
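The log-log interpolation behind the depth estimate can be illustrated with a minimal base R sketch. The tag counts and MSER values below are made up for illustration, and the exact form of the model fitted inside get.mser.interpolation is an assumption here; only the general idea (fit a line in log-log space with lm(), then invert it at the target MSER) is taken from the description above.

```r
# Synthetic (made-up) MSER values observed at increasing tag counts.
tag.counts <- c(1e5, 2e5, 4e5, 8e5, 1.6e6)
mser <- 200 * tag.counts^(-0.25)  # assumed power-law decay, illustration only

# Fit log10(MSER) as a linear function of log10(tag count).
fit <- lm(log10(mser) ~ log10(tag.counts))
intercept <- unname(coef(fit)[1])
slope <- unname(coef(fit)[2])

# Invert the fitted line to find the depth at which MSER reaches the target.
target.fold.enrichment <- 5
predicted.depth <- 10^((log10(target.fold.enrichment) - intercept) / slope)
print(predicted.depth)
```

Because MSER decreases with depth, the fitted slope is negative, and a target MSER below the values seen so far maps to a tag count larger than the current dataset.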
To simulate sequencing growth, the method calculates peak predictions on random chains. Each chain is produced by sequential random subsampling of the original data. The number of steps in the chain indicates how many times the random subsampling will be performed.
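The sequential subsampling described above can be sketched in base R as follows. make.chain is a hypothetical helper, not a function of the package, and it works on a single toy vector of tag positions rather than a per-chromosome list. It also assumes that each step removes step.size tags from the previous subset, which may differ from how get.mser builds its chains.

```r
# Hypothetical helper (not part of the package): build one subsampling chain.
make.chain <- function(tags, n.steps, step.size) {
  chain <- vector("list", n.steps)
  current <- tags
  for (i in seq_len(n.steps)) {
    # Each step randomly drops step.size tags from the previous subset,
    # so every element of the chain is nested within the one before it.
    keep <- sample.int(length(current), length(current) - step.size)
    current <- current[sort(keep)]
    chain[[i]] <- current
  }
  chain
}

set.seed(1)
tags <- sort(sample.int(1e6, 1000))  # toy tag positions, all distinct
chain <- make.chain(tags, n.steps = 6, step.size = 100)
chain.sizes <- sapply(chain, length)
print(chain.sizes)
```

With these toy settings the chain shrinks from 900 to 400 tags in steps of 100. Running peak prediction on each subset, and repeating over several independent chains (n.chains), would give the MSER-versus-depth points used for the interpolation.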