Learn R Programming

rrcov (version 1.7-2)

PcaGrid: Robust Principal Components based on Projection Pursuit (PP): GRID search Algorithm

Description

Computes an approximation of the PP-estimators for PCA using the grid search algorithm in the plane.

Usage

PcaGrid(x, ...)
    # S3 method for default
PcaGrid(x, k = 0, kmax = ncol(x), 
        scale=FALSE, na.action = na.fail, crit.pca.distances = 0.975, trace=FALSE, ...)
    # S3 method for formula
PcaGrid(formula, data = NULL, subset, na.action, ...)

Value

An S4 object of class PcaGrid-class which is a subclass of the virtual class PcaRobust-class.

Arguments

formula

a formula with no response variable, referring only to numeric variables.

data

an optional data frame (or similar: see model.frame) containing the variables in the formula formula.

subset

an optional vector used to select rows (observations) of the data matrix x.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The default is na.omit.

...

arguments passed to or from other methods.

x

a numeric matrix (or data frame) which provides the data for the principal components analysis.

k

number of principal components to compute. If k is missing, or k = 0, it is set to the number of columns of the data. It is preferable to investigate the scree plot in order to choose the number of components and then run again. Default is k=0.

kmax

maximal number of principal components to compute. Default is kmax=10. If k is provided, kmax does not need to be specified, unless k is larger than 10.

scale

a value indicating whether and how the variables should be scaled. If scale = FALSE (default) or scale = NULL no scaling is performed (a vector of 1s is returned in the scale slot). If scale = TRUE the data are scaled to have unit variance. Alternatively it can be a function like sd or mad or a vector of length equal the number of columns of x. The value is passed to the underlying function and the result returned is stored in the scale slot. Default is scale = FALSE

crit.pca.distances

criterion to use for computing the cutoff values for the orthogonal and score distances. Default is 0.975.

trace

whether to print intermediate results. Default is trace = FALSE

Author

Valentin Todorov valentin.todorov@chello.at

Details

PcaGrid, serving as a constructor for objects of class PcaGrid-class is a generic function with "formula" and "default" methods. For details see PCAgrid and the relevant references.

References

C. Croux, P. Filzmoser, M. Oliveira, (2007). Algorithms for Projection-Pursuit Robust Principal Component Analysis, Chemometrics and Intelligent Laboratory Systems, 87, 225.

Todorov V & Filzmoser P (2009), An Object Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software, 32(3), 1--47. tools:::Rd_expr_doi("10.18637/jss.v032.i03").

Examples

Run this code
    # multivariate data with outliers
    library(mvtnorm)
    x <- rbind(rmvnorm(200, rep(0, 6), diag(c(5, rep(1,5)))),
                rmvnorm( 15, c(0, rep(20, 5)), diag(rep(1, 6))))
    # Here we calculate the principal components with PCAgrid
    pc <- PcaGrid(x, 6)
    # we could draw a biplot too:
    biplot(pc)
    
    # we could use another objective function, and 
    # maybe only calculate the first three principal components:
    pc <- PcaGrid(x, 3, method="qn")
    biplot(pc)
    
    # now we want to compare the results with the non-robust principal components
    pc <- PcaClassic(x, k=3)
    # again, a biplot for comparision:
    biplot(pc)

Run the code above in your browser using DataLab