snm (version 1.20.0)

snm: Perform a supervised normalization of microarray data

Description

Implements supervised normalization of microarrays (SNM) on a gene expression matrix. Requires a set of biological covariates of interest and at least one probe-specific or intensity-dependent adjustment variable.

Usage

snm(raw.dat, bio.var=NULL, adj.var=NULL, int.var=NULL, weights=NULL, spline.dim = 4, num.iter = 10, lmer.max.iter=1000, nbins=20, rm.adj=FALSE, verbose=TRUE, diagnose=TRUE)

Arguments

raw.dat
An $m$ probes by $n$ arrays matrix of expression data. To remove intensity-dependent effects, the matrix should contain the raw, log-transformed data.
bio.var
A model matrix (see model.matrix) or data frame with $n$ rows of the biological variables. If NULL, then all probes are treated as "null" in the algorithm.
adj.var
A model matrix (see model.matrix) or data frame with $n$ rows of the probe-specific adjustment variables. If NULL, a model with an intercept term is used.
int.var
A data frame with $n$ rows whose columns are factors giving the levels of the intensity-dependent effects. Each column parametrizes a distinct source of intensity-dependent effect (e.g., array effects in column 1 and dye effects in column 2).
weights
A vector of length $m$ used to control the influence of each probe on the intensity-dependent array effects. The values are not changed by the algorithm.
spline.dim
Dimension of basis spline used for array effects.
num.iter
Number of snm model fit iterations to run.
lmer.max.iter
Number of lmer iterations that are permitted. Set lmer.max.iter=NULL if no maximum is desired.
nbins
Number of bins used by the binning strategy. Array effects are calculated from an nbins x $n$ data matrix, where the $(i,j)$ value is the average intensity of bin $i$ on array $j$.
rm.adj
If FALSE, only the intensity-dependent effects are removed from the normalized data, so the effects of the adjustment variables are still present. If TRUE, both the adjustment variable effects and the intensity-dependent effects are removed from the returned normalized data.
verbose
A flag telling the software whether or not to display a report after each iteration. TRUE produces the output.
diagnose
A flag telling the software whether or not to produce diagnostic output in the form of consecutive plots. TRUE produces the plots.
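The design inputs described above each need one row per array. A minimal sketch of building them in base R follows, assuming a hypothetical design of 8 arrays with a two-level treatment, a continuous age covariate, and two batches. The object pheno, its column names, and the simulated raw.dat are illustrative and not part of the package; how the intercept is split between bio.var and adj.var is a choice of this sketch.

```r
# Hypothetical design: n = 8 arrays, m = 500 probes (illustrative values only).
set.seed(42)
n <- 8
m <- 500
pheno <- data.frame(
  treatment = factor(rep(c("control", "treated"), each = 4)),
  age       = c(31, 45, 52, 38, 29, 47, 55, 41),
  batch     = factor(rep(c("A", "B"), times = 4))
)

# raw.dat: m probes by n arrays, on the log scale (simulated here).
raw.dat <- matrix(rnorm(m * n, mean = 8, sd = 1.5), nrow = m, ncol = n)

# bio.var: the biological variable of interest, as a model matrix.
bio.var <- model.matrix(~ treatment, data = pheno)

# adj.var: probe-specific adjustment variables, as a model matrix.
adj.var <- model.matrix(~ age + batch, data = pheno)

# int.var: one factor column per source of intensity-dependent effect;
# here a single column with one level per array.
int.var <- data.frame(array = factor(seq_len(n)))

# Every design input needs one row per array (one per column of raw.dat).
stopifnot(nrow(bio.var) == ncol(raw.dat),
          nrow(adj.var) == ncol(raw.dat),
          nrow(int.var) == ncol(raw.dat))

# With the snm package loaded, the call would then be:
# snm.obj <- snm(raw.dat, bio.var, adj.var, int.var)
```

Adding a second factor column to int.var (for example, a dye factor in a two-color design) would describe a second source of intensity-dependent effect.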

Value

norm.dat
The matrix of normalized data. The default setting is rm.adj=FALSE, which means that only the intensity-dependent effects have been subtracted from the data. If the user wants the adjustment variable effects removed as well, then set rm.adj=TRUE when calling the snm function.
pvalues
A vector of p-values testing the association of the biological variables with each probe. These p-values are obtained from an ANOVA comparing models where the full model contains both the probe-specific biological and adjustment variables versus a reduced model that just contains the probe-specific adjustment variables. The data used for this comparison has the intensity-dependent variables removed. These returned p-values are calculated after the final iteration of the algorithm.
pi0
The estimated proportion of true null probes $\pi_0$, calculated after the final iteration of the algorithm.
iter.pi0s
A vector of length equal to num.iter containing the estimated $\pi_0$ values at each iteration of the snm algorithm. These values should converge, and any non-convergence suggests a problem with the data, the assumed model, or both.
nulls
A vector indexing the probes utilized in estimating the intensity-dependent effects on the final iteration.
M
A matrix containing the estimated probe intensities for each array utilized in estimating the intensity-dependent effects on the final iteration. For memory parsimony, only a subset of values spanning the range is returned, currently nbins*100 values.
array.fx
A matrix of the final estimated intensity-dependent array effects. For memory parsimony, only a subset of values spanning the range is returned, currently nbins*100 values.
bio.var
The processed version of the same input variable.
adj.var
The processed version of the same input variable.
int.var
The processed version of the same input variable.
df0
Degrees of freedom of the adjustment variables.
df1
Degrees of freedom of the full model matrix, which includes the biological variables and the adjustment variables.
raw.dat
The input data.
rm.var
Same as the input (useful for later analyses).
call
Function call.
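The model comparison behind pvalues can be illustrated for a single probe with base R. This is a sketch on simulated data using lm and anova, not the package's internal code. The variables y, age, and treatment are hypothetical, and df0 and df1 are taken here to be the ranks of the reduced and full model matrices, which is an assumption about how the returned values are defined.

```r
# Simulated intensities for one probe on n = 12 arrays (illustrative only).
set.seed(1)
n <- 12
treatment <- factor(rep(c("control", "treated"), each = 6))
age <- c(31, 45, 52, 38, 29, 47, 55, 41, 36, 50, 43, 33)
y <- 8 + 0.02 * age + 1.5 * (treatment == "treated") + rnorm(n, sd = 0.3)

# Reduced model: adjustment variable only.
reduced <- lm(y ~ age)
# Full model: adjustment variable plus the biological variable.
full <- lm(y ~ age + treatment)

# p-value from the nested-model ANOVA (F test).
p.anova <- anova(reduced, full)[["Pr(>F)"]][2]

# The same F test written out from residual sums of squares.
df0 <- reduced$rank   # rank of the reduced model matrix
df1 <- full$rank      # rank of the full model matrix
rss0 <- sum(residuals(reduced)^2)
rss1 <- sum(residuals(full)^2)
f.stat <- ((rss0 - rss1) / (df1 - df0)) / (rss1 / (n - df1))
p.manual <- pf(f.stat, df1 - df0, n - df1, lower.tail = FALSE)

# Both routes give the same p-value.
stopifnot(isTRUE(all.equal(p.anova, p.manual)))
```

In snm the comparison is made for every probe, on data with the intensity-dependent effects already removed.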

Details

This function implements the supervised normalization of microarrays algorithm described in Mecham, Nelson, and Storey (2010).
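The nbins x $n$ matrix of bin averages described under nbins can be sketched in base R. How probes are assigned to bins is not stated on this page, so the sketch assumes probes are ranked by their mean intensity across arrays and split into nbins groups of roughly equal size. The names probe.mean, bin, and bin.means are hypothetical, and the package's own binning may differ.

```r
# Simulated log-scale data: m = 1000 probes by n = 6 arrays (illustrative only).
set.seed(2)
m <- 1000
n <- 6
nbins <- 20
raw.dat <- matrix(rnorm(m * n, mean = 8, sd = 1.5), nrow = m, ncol = n)

# Assumption: bins are formed from each probe's mean intensity across arrays.
probe.mean <- rowMeans(raw.dat)

# Assign probes to nbins groups of roughly equal size by rank.
bin <- cut(rank(probe.mean, ties.method = "first"),
           breaks = nbins, labels = FALSE)

# Entry (i, j) is the average intensity of the probes in bin i on array j:
# per-bin column sums divided by the number of probes in each bin.
bin.means <- rowsum(raw.dat, group = bin) / as.vector(table(bin))

dim(bin.means)
```

Working with the much smaller bin-by-array matrix instead of all $m$ probes keeps the estimation of intensity-dependent effects cheap.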

References

Mecham BH, Nelson PS, Storey JD (2010) Supervised normalization of microarrays. Bioinformatics, 26: 1308-1315.

See Also

model.matrix, plot.snm, fitted.snm, summary.snm, sim.singleChannel, sim.doubleChannel, sim.preProcessed, sim.refDesign

Examples

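library(snm)

# Simulate a single-channel data set and normalize it
singleChannel <- sim.singleChannel(12345)
snm.obj <- snm(singleChannel$raw.data,
               singleChannel$bio.var,
               singleChannel$adj.var,
               singleChannel$int.var)

# p-values of the true null probes should look uniform
ks.test(snm.obj$pvalues[singleChannel$true.nulls], "punif")

# Diagnostic plots, summary, and fitted values
plot(snm.obj)
summary(snm.obj)
snm.fit <- fitted(snm.obj)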
