Principal Variance Component Analysis (PVCA) (Li et al, 2009) allows the estimation of the contribution of several sources of variability. pvcam
uses it to estimate the proportion of variance in the data explained by the class signal. See below for a more detailed explanation of what the function does.
pvcam(xba, batch, y, threshold = 0.6)
matrix. The covariate matrix, raw or after batch effect adjustment. observations in rows, variables in columns.
factor. Batch variable. Each factor level (or 'category') corresponds to one of the batches. For example, if there are four batches, this variable would have four factor levels and observations with the same factor level would belong to the same batch.
factor. Binary target variable. Has to have two factor levels, where each of them correponds to one of the two classes of the target variable.
numeric. Minimal proportion of variance explained by the principal components used.
Value of the metric
In PVCA, first principal component analysis is performed on the n x n covariance matrix between the
observations. Then, using a random effects model
the principal components are regressed on arbitrary factors of variability, such
as "batch" and "(phenotype) class". Ultimately, estimated proportions of variance induced by each factor and that of the residual variance are obtained.
In pvcam
the factors included into the model are: "batch", "class" and the interaction of these two into.
The metric calculated by pvcam
is the proportion of variance
explained by "class".
pvcam
uses a slightly altered version of the function pvcaBatchAssess()
from the Bioconductor package pvca
.
The latter was altered to take the covariate data as a matrix
instead of as an object of class ExpressionSet
.
Li, J., Bushel, P., Chu, T.-M., Wolfinger, R.D. (2009). Principal variance components analysis: Estimating batch effects in microarray gene expression data. In: Scherer, A. (ed) Batch Effects and Noise in Microarray Experiments: Sources and Solutions, John Wiley & Sons, Chichester, UK, <10.1002/9780470685983.ch12>.
# NOT RUN {
data(autism)
Xadj <- ba(x=X, y=y, batch=batch, method = "combat")$xadj
pvcam(xba = X, batch = batch, y = y)
pvcam(xba = Xadj, batch = batch, y = y)
# }
Run the code above in your browser using DataLab