mahalanobisQC: Using Mahalanobis Distance and PCA for Quality Control
Description
Compute the Mahalanobis distance of each sample from the center of an
N-dimensional principal component space.
Usage
mahalanobisQC(spca, N)
Arguments
spca
object of class SamplePCA representing the
results of a principal components analysis.
N
integer scalar specifying the number of components to use when
assessing QC.
Value
Returns a data frame containing two columns, with one row for each
column (sample) of the original data set on which PCA was performed.
The first column is the chi-squared statistic, with N degrees of
freedom. The second column is the associated p-value.
Details
Under the null hypothesis that all samples arise from the same
multivariate normal distribution, the squared Mahalanobis distance of
a sample from the center of a D-dimensional principal component space
should follow a chi-squared distribution with D degrees of freedom
(here, D = N). This result lets us compute a p-value for each sample
from its Mahalanobis distance. Samples with small p-values are
candidate outliers, so the method can be used for quality control or
outlier identification.
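The calculation described above can be sketched in base R. This is an
illustration under stated assumptions, not the package's implementation:
it uses prcomp in place of SamplePCA, the data are simulated, and all
variable names (x, scores, d2, qc) are hypothetical. The function itself
may estimate the component variances differently.

```r
# Simulated data in the layout this page describes:
# rows are features, columns are samples (hypothetical names throughout).
set.seed(1)
n_features <- 50
n_samples <- 30
x <- matrix(rnorm(n_features * n_samples), nrow = n_features,
            dimnames = list(NULL, paste0("S", seq_len(n_samples))))
x[, 1] <- x[, 1] + 3   # shift one sample so it is a plausible outlier

# prcomp expects observations in rows, so transpose to put samples in rows.
pca <- prcomp(t(x), center = TRUE, scale. = FALSE)

D <- 2                                   # number of components to use
scores <- pca$x[, seq_len(D), drop = FALSE]

# Squared Mahalanobis distance from the center of the first D components.
# PC scores are centered and uncorrelated, so the distance reduces to the
# sum of squared scores, each divided by its component variance.
d2 <- rowSums(sweep(scores^2, 2, pca$sdev[seq_len(D)]^2, "/"))

# Upper-tail chi-squared probability with D degrees of freedom.
p <- pchisq(d2, df = D, lower.tail = FALSE)

qc <- data.frame(statistic = d2, p.value = p)
head(qc[order(qc$p.value), ])            # smallest p-values first
```

Because the scores are uncorrelated, the same statistic can be obtained
from stats::mahalanobis(scores, rep(0, D), cov(scores)), which is a
convenient cross-check.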
References
Coombes KR, et al.
Quality control and peak finding for proteomics data collected from
nipple aspirate fluid by surface-enhanced laser desorption and ionization.
Clin Chem 2003; 49:1615-23.
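Examples
A minimal usage sketch, assuming the ClassDiscovery package (which
provides the SamplePCA class) is installed and that SamplePCA accepts a
numeric matrix with features in rows and samples in columns. The
simulated matrix dat and the 0.01 cutoff are hypothetical choices made
for illustration only.

```r
# Guarded so the snippet does nothing when ClassDiscovery is unavailable.
if (requireNamespace("ClassDiscovery", quietly = TRUE)) {
  library(ClassDiscovery)
  set.seed(42)
  # 100 features (rows) by 20 samples (columns) of simulated noise
  dat <- matrix(rnorm(100 * 20), nrow = 100,
                dimnames = list(NULL, paste0("S", 1:20)))
  spca <- SamplePCA(dat)
  qc <- mahalanobisQC(spca, 2)   # use the first two components
  print(dim(qc))                 # one row per sample, two columns
  # The second column holds the p-values; flag samples below the cutoff.
  print(qc[qc[, 2] < 0.01, ])
}
```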