vsn2 fits the vsn model to the data in x and returns a vsn object with the fit parameters and the transformed data matrix.
The data are, typically, feature intensity readings from a
microarray, but this function may also be useful for other kinds of
intensity data that obey an additive-multiplicative error model.
To obtain an object of the same class as x, containing the normalised data and the same metadata as x, use
fit = vsn2(x, ...)
nx = predict(fit, newdata=x)
or the wrapper justvsn.
Please see the vignette Introduction to vsn.
vsnMatrix(x, reference, strata, lts.quantile = 0.9, subsample = 0L, verbose = interactive(), returnData = TRUE, calib = "affine", pstart, minDataPointsPerStratum = 42L, optimpar = list(), defaultpar = list(factr=5e7, pgtol=2e-4, maxit=60000L, trace=0L, cvg.niter=7L, cvg.eps=0))
"vsn2"(x, reference, strata, ...)
"vsn2"(x, reference, strata, subsample, ...)
"vsn2"(x, reference, strata, backgroundsubtract=FALSE, foreground=c("R","G"), background=c("Rb", "Gb"), ...)
"vsn2"(x, reference, strata, ...)
reference
A vsn object from a previous fit. If this argument is specified, the data in x are normalized "towards" an existing set of reference arrays whose parameters are stored in the object reference. If this argument is not specified, then the data in x are normalized "among themselves". See Details for a more precise explanation.
strata
A factor
or integer whose length is nrow(x). It can be used for stratified normalization (i.e. separate offsets $a$ and factors $b$ for each level of strata). If missing, all rows of x are assumed to come from one stratum. If strata is an integer, its values must cover the range $1,\ldots,n$, where $n$ is the number of strata.
subsample
If greater than 0, the model parameters are estimated from a subsample of the data of size subsample
only, yet the fitted transformation is then applied to all data. For large datasets, this can substantially reduce the CPU time and memory consumption at a negligible loss of precision. Note that the AffyBatch method of vsn2 sets a value of 30000 for this parameter if it is missing from the function call, which is different from the behaviour of the other methods.
foreground, background
The names of the channels of x that should be used as foreground and background values.
returnData
If TRUE, the transformed data are returned in the resulting vsn object. Setting this option to FALSE allows saving memory if the data are not needed.
calib
Allowed values are affine
and none. The default, affine, corresponds to the behaviour in package versions <= 3.9, and to what is described in references [1] and [2]. The option none is an experimental new feature, in which no affine calibration is performed and only two global variance stabilisation transformation parameters a and b are fitted. This functionality might be useful in conjunction with other calibration methods, such as quantile normalisation - see the vignette Introduction to vsn.
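pstart
A three-dimensional array of start parameters. The first dimension corresponds to the levels of strata, the second dimension to the columns of x and the third dimension must be 2, corresponding to offsets and factors.
optimpar
A list of parameters for the optimisation algorithm. Default values are taken from defaultpar. See Details.
defaultpar
Values in optimpar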
take precedence over those in defaultpar. The purpose of this argument is to expose the default values in this manual page. It is not intended to be changed; please use optimpar for that.
...
Arguments that get passed on to vsnMatrix.
vsn2 returns an object of class vsn.
If the element b[s,i]
of the array b is the scaling parameter for the s-th stratum and the i-th array, then c[s] is computed as log2(2*f(mean(b[,i]))). The offset c is inconsequential for all differential expression calculations, but many users like to see the data in a range that they are familiar with.
vsn2
methods exist for ExpressionSet, NChannelSet, AffyBatch (from the affy package), RGList (from the limma package), matrix and numeric.
If x is an NChannelSet, then vsn2 is applied to the matrix that is obtained by horizontally concatenating the color channels. Optionally, available background estimates can be subtracted beforehand.
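As a rough base R illustration of "horizontally concatenating the color channels", with the optional background subtraction applied first: the matrices R, G, Rb and Gb below are made-up stand-ins for the channels named by the foreground and background defaults, and the actual NChannelSet method may differ in detail.

```r
# Hypothetical two-colour intensities: 4 features (rows) on 2 arrays (columns)
R  <- matrix(c(120, 340, 560,  90, 150, 310, 600,  80), nrow = 4)
G  <- matrix(c(200, 300, 410, 100, 220, 280, 450, 110), nrow = 4)
Rb <- matrix(20, nrow = 4, ncol = 2)   # hypothetical local background, red
Gb <- matrix(30, nrow = 4, ncol = 2)   # hypothetical local background, green

backgroundsubtract <- TRUE

# Subtract the background estimates first, if requested
if (backgroundsubtract) {
  R <- R - Rb
  G <- G - Gb
}

# Horizontal concatenation: one column per array and channel
y <- cbind(R, G)
dim(y)
```

The resulting matrix has as many rows as there are features and one column per combination of array and channel, which is the shape a matrix-based fit would then operate on.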
If x is an RGList, it is converted into an NChannelSet using a copy of Martin Morgan's code for RGList to NChannelSet coercion, then the NChannelSet method is called.
If the reference
argument is not specified, then the model parameters $\mu_k$ and $\sigma$ are fit from the data in x. This is the mode of operation described in [1] and that was the only option in versions 1.X of this package. If reference is specified, the model parameters $\mu_k$ and $\sigma$ are taken from it. This allows for 'incremental' normalization [4].
L-BFGS-B
uses three termination criteria:
1. $(f_k - f_{k+1}) / \max(|f_k|, |f_{k+1}|, 1) \le \text{factr} \cdot \text{epsmch}$, where epsmch is the machine precision.
2. $|\text{gradient}| < \text{pgtol}$
3. $\text{iterations} > \text{maxit}$
These are set by the elements factr, pgtol and maxit of optimpar. The remaining elements are
trace
The verbosity level of L-BFGS-B; higher values create more output.
cvg.niter
cvg.eps
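The three termination criteria, and the precedence of optimpar over defaultpar, can be tried out with base R alone. The sketch below is not the vsn implementation: it merges the two lists with modifyList, which is only one plausible way to express the precedence, and it passes the result to optim, whose L-BFGS-B method accepts control elements named factr, pgtol, maxit and trace. The Rosenbrock objective and the override value in optimpar are illustrative assumptions.

```r
# Default optimiser settings as listed in the usage of vsnMatrix
defaultpar <- list(factr = 5e7, pgtol = 2e-4, maxit = 60000L,
                   trace = 0L, cvg.niter = 7L, cvg.eps = 0)

# Hypothetical user override: a stricter relative-reduction criterion
optimpar <- list(factr = 1e7)

# Entries of optimpar replace the matching entries of defaultpar
par_used <- modifyList(defaultpar, optimpar)

# Illustrative objective (Rosenbrock function), not the vsn likelihood
fr <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2
gr <- function(p) c(-400 * p[1] * (p[2] - p[1]^2) - 2 * (1 - p[1]),
                    200 * (p[2] - p[1]^2))

# optim() knows factr, pgtol, maxit and trace, so pass only those elements
ctrl <- par_used[c("factr", "pgtol", "maxit", "trace")]
res <- optim(par = c(-1.2, 1), fn = fr, gr = gr,
             method = "L-BFGS-B", control = ctrl)

res$convergence   # 0 means successful completion, 1 means maxit was reached
res$message       # text reported by L-BFGS-B about why it stopped

# Threshold of criterion 1, assuming epsmch equals .Machine$double.eps
par_used$factr * .Machine$double.eps
```

Lowering factr or pgtol makes the first two criteria harder to satisfy, so the optimiser runs longer before stopping; maxit remains the hard upper limit on the number of iterations.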
[2] Parameter estimation for the calibration and variance stabilization of microarray data, Wolfgang Huber, Anja von Heydebreck, Holger Sueltmann, Annemarie Poustka, and Martin Vingron; Statistical Applications in Genetics and Molecular Biology (2003) Vol. 2 No. 1, Article 3. http://www.bepress.com/sagmb/vol2/iss1/art3.
[3] L-BFGS-B: Fortran Subroutines for Large-Scale Bound Constrained Optimization, C. Zhu, R.H. Byrd, P. Lu and J. Nocedal, Technical Report, Northwestern University (1996).
[4] Package vignette: Likelihood Calculations for vsn
justvsn, predict
library("vsn")
data("kidney")
fit = vsn2(kidney) ## fit
nkid = predict(fit, newdata=kidney) ## apply fit
plot(exprs(nkid), pch=".")
abline(a=0, b=1, col="red")
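The requirements on the strata and pstart arguments can be checked in base R before a fit is attempted. The following is a sketch under stated assumptions: the matrix x and the grouping vector group are invented, and the data are far too small for an actual fit (minDataPointsPerStratum defaults to 42L), so vsn2 itself is not called here. The zeros in pstart are placeholders that only show the expected shape, not sensible start values.

```r
# Hypothetical data: 6 features (rows) on 3 arrays (columns)
x <- matrix(seq_len(18) * 10, nrow = 6, ncol = 3)

# Hypothetical grouping of the rows, e.g. by print-tip or plate
group <- c("tipA", "tipB", "tipA", "tipC", "tipB", "tipC")

strata     <- factor(group)        # one entry per row of x
strata_int <- as.integer(strata)   # integer form of the same strata
n          <- nlevels(strata)      # number of strata

# Checks corresponding to the documented requirements
stopifnot(length(strata) == nrow(x))          # length must be nrow(x)
stopifnot(setequal(strata_int, seq_len(n)))   # integer values cover 1, ..., n

# Shape of a start-parameter array: strata by columns of x by 2
# (third dimension: offsets and factors); zeros are placeholders only
pstart <- array(0, dim = c(n, ncol(x), 2))
dim(pstart)
```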