caret (version 4.47)

preProcess: Pre-Processing of Predictors

Description

Pre-processing transformations (centering, scaling, etc.) can be estimated from the training data and applied to any data set with the same variables.

Usage

preProcess(x, ...)

## S3 method for class 'default':
preProcess(x, method = c("center", "scale"), thresh = 0.95, na.remove = TRUE, ...)

## S3 method for class 'preProcess':
predict(object, newdata, ...)

Arguments

x
a matrix or data frame. All variables must be numeric.
method
a character vector specifying the type of processing. Possible values are "center", "scale", "pca", "ica" and "spatialSign" (see Details below)
thresh
a cutoff for the cumulative percent of variance to be retained by PCA
na.remove
a logical; should missing values be removed from the calculations?
object
an object of class preProcess
newdata
a matrix or data frame of new data to be pre-processed
...
additional arguments to pass to fastICA, such as n.comp

Value

preProcess results in a list with elements:

  • call: the function call
  • dim: the dimensions of x
  • mean: a vector of means (if centering was requested)
  • std: a vector of standard deviations (if scaling or PCA was requested)
  • rotation: a matrix of eigenvectors (if PCA was requested)
  • method: the value of method
  • thresh: the value of thresh
  • numComp: the number of principal components required to capture the specified amount of variance
  • ica: values for the W and K matrices of the decomposition
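
To make the relationship between thresh and numComp concrete, here is a minimal base R sketch of how a cumulative-variance cutoff can be turned into a component count. It uses prcomp on hypothetical simulated data. It is not caret's internal code, and the choice of ">=" at the cutoff is an assumption.

```r
# Hypothetical toy data; prcomp comes from the stats package shipped with R.
set.seed(1)
x <- matrix(rnorm(200 * 5), ncol = 5)
x[, 2] <- x[, 1] + rnorm(200, sd = 0.1)  # make two columns strongly correlated

# PCA on centered and scaled data
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Cumulative proportion of variance explained by successive components
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

thresh <- 0.95
# Smallest number of components whose cumulative variance reaches thresh
# (assumption: the cutoff is inclusive)
num_comp <- which(cum_var >= thresh)[1]
print(num_comp)
```

Lowering thresh in this sketch can only keep the same number of components or fewer, which is the trade-off the thresh argument controls.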

Details

The operations are applied in this order: centering, scaling, PCA, ICA then spatial sign.
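
The ordering can be illustrated with a small base R sketch that estimates the centering and scaling statistics on training data, applies them to new data, and then applies a spatial sign step last. The helper names (apply_center_scale, spatial_sign) and the simulated data are hypothetical, and the spatial sign is assumed here to mean dividing each row by its Euclidean norm. This is an illustration of the idea, not caret's implementation.

```r
set.seed(42)
train <- matrix(rnorm(50 * 3, mean = 10, sd = 2), ncol = 3)
test  <- matrix(rnorm(20 * 3, mean = 10, sd = 2), ncol = 3)

# Estimate the statistics on the training data only
mu  <- colMeans(train)
sdv <- apply(train, 2, sd)

# Hypothetical helper: center first, then scale, using stored statistics
apply_center_scale <- function(newdata, mu, sdv) {
  centered <- sweep(newdata, 2, mu, "-")
  sweep(centered, 2, sdv, "/")
}

train_cs <- apply_center_scale(train, mu, sdv)
test_cs  <- apply_center_scale(test, mu, sdv)

# Hypothetical helper: spatial sign, applied last; each row is divided
# by its Euclidean norm so that it lies on the unit sphere
spatial_sign <- function(m) m / sqrt(rowSums(m^2))

test_ss <- spatial_sign(test_cs)
```

Note that the test rows are transformed with the training means and standard deviations, so their column means will generally be close to, but not exactly, zero.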

If PCA is requested but scaling is not, the values will still be scaled. Similarly, when ICA is requested, the data are automatically centered.
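
One plausible motivation for forcing scaling before PCA is that unscaled variables with large variances dominate the leading components. The base R sketch below shows this effect with prcomp on hypothetical simulated data. It does not call caret, and the column names are made up for the illustration.

```r
set.seed(7)
# Two independent variables on very different scales (hypothetical data)
x <- cbind(small = rnorm(100, sd = 1), large = rnorm(100, sd = 1000))

pca_raw    <- prcomp(x, center = TRUE, scale. = FALSE)
pca_scaled <- prcomp(x, center = TRUE, scale. = TRUE)

# Share of total variance captured by the first component
var_share_raw    <- pca_raw$sdev[1]^2 / sum(pca_raw$sdev^2)
var_share_scaled <- pca_scaled$sdev[1]^2 / sum(pca_scaled$sdev^2)

# Without scaling, the first component is almost entirely the "large" column
loading_large_raw <- abs(pca_raw$rotation["large", 1])

print(c(raw = var_share_raw, scaled = var_share_scaled))
```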

A warning is thrown if both PCA and ICA are requested. ICA, as implemented by the fastICA package, automatically does a PCA decomposition prior to finding the ICA scores.

The function will throw an error if any variable in x has fewer than two unique values.
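
Such variables can be found and dropped before calling preProcess. The following is a minimal base R sketch on a hypothetical toy data frame. The column names and the filtering approach are illustrative and are not part of caret.

```r
# Hypothetical toy data frame; "const" has a single unique value
x <- data.frame(a = c(1.2, 3.4, 5.6),
                const = c(7, 7, 7),
                b = c(0, 1, 0))

# Count the unique values in each column
n_unique <- vapply(x, function(col) length(unique(col)), integer(1))

# Names of the columns with fewer than two unique values
degenerate <- names(n_unique)[n_unique < 2]

# Keep only the columns that have at least two unique values
x_ok <- x[, n_unique >= 2, drop = FALSE]
print(degenerate)
```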

References

Kuhn (2008), "Building Predictive Models in R Using the caret Package" (http://www.jstatsoft.org/v28/i05/)

See Also

prcomp, fastICA, spatialSign

Examples

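library(caret)
data(BloodBrain)
# one variable has only one unique value in these rows, so this call is
# expected to throw an error (see Details); try() lets the script continue
try(preProc <- preProcess(bbbDescr[1:100,]))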

preProc <- preProcess(bbbDescr[1:100,-3])
training <- predict(preProc, bbbDescr[1:100,-3])
test <- predict(preProc, bbbDescr[101:208,-3])
