Rdimtools (version 1.0.4)

do.procrustes: Feature Selection using PCA and Procrustes Analysis

Description

do.procrustes selects a set of features whose PCA coordinates best align with those of the full data in the embedded low-dimensional space. At each iteration it selects the variable that minimizes the Procrustes distance between the two configurations.
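
The criterion described above can be illustrated in base R. This is a minimal sketch, not the package's internal code: the helpers `procrustes_ss` and `pca_scores` are hypothetical names introduced here, and the distance is the rotation-only Procrustes sum of squares, which may differ in detail from what do.procrustes computes. It scores each variable by how much the 2-dimensional PCA configuration changes when that variable is left out.

```r
# Procrustes sum of squares between two n x k configurations A and B,
# minimised over orthogonal transformations of B (no scaling or shift):
#   tr(A'A) + tr(B'B) - 2 * sum(singular values of B'A)
procrustes_ss <- function(A, B) {
  stopifnot(all(dim(A) == dim(B)))
  sv <- svd(crossprod(B, A))$d
  sum(A^2) + sum(B^2) - 2 * sum(sv)
}

# PCA scores on the first k components of a column-centred data matrix
pca_scores <- function(X, k) {
  Xc <- scale(X, center = TRUE, scale = FALSE)
  s <- svd(Xc)
  s$u[, seq_len(k), drop = FALSE] %*% diag(s$d[seq_len(k)], nrow = k)
}

X <- as.matrix(iris[, 1:4])
k <- 2
full <- pca_scores(X, k)

# Distance between the full configuration and the one obtained
# after dropping each variable in turn (smaller = less structure lost)
loss <- sapply(seq_len(ncol(X)), function(j) {
  procrustes_ss(full, pca_scores(X[, -j, drop = FALSE], k))
})
names(loss) <- colnames(X)
print(loss)
```

A variable whose removal gives a small value contributes little to the low-dimensional structure, so a backward procedure would discard it first.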

Usage

do.procrustes(
  X,
  ndim = 2,
  intdim = (ndim - 1),
  cor = TRUE,
  preprocess = c("center", "scale", "cscale", "whiten", "decorrelate")
)

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations and columns represent independent variables.

ndim

an integer-valued target dimension.

intdim

intrinsic dimension of PCA to be applied. It should be smaller than ndim.

cor

mode of eigendecomposition in PCA. FALSE decomposes the covariance matrix, and TRUE decomposes the correlation matrix.

preprocess

an additional option for preprocessing the data. Default is "center". See also aux.preprocess for more details.
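
To make the effect of the cor argument concrete, the following base R sketch contrasts the two eigendecompositions on the iris measurements. It only uses `cov`, `cor`, and `eigen` from base R and is an illustration of the two modes, not a reproduction of what do.procrustes does internally.

```r
X <- as.matrix(iris[, 1:4])

# cor = FALSE analogue: eigendecomposition of the covariance matrix
eig_cov <- eigen(cov(X), symmetric = TRUE)

# cor = TRUE analogue: eigendecomposition of the correlation matrix,
# i.e. the covariance matrix of the standardised columns
eig_cor <- eigen(cor(X), symmetric = TRUE)

# Proportion of variance carried by each component under both modes
prop_cov <- eig_cov$values / sum(eig_cov$values)
prop_cor <- eig_cor$values / sum(eig_cor$values)
print(round(rbind(covariance = prop_cov, correlation = prop_cor), 3))
```

With the covariance matrix, variables measured on larger scales dominate the leading components; the correlation matrix puts all variables on an equal footing.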

Value

a named list containing

Y

an \((n\times ndim)\) matrix whose rows are embedded observations.

featidx

a length-\(ndim\) vector of indices with highest scores.

trfinfo

a list containing information for out-of-sample prediction.

projection

a \((p\times ndim)\) matrix whose columns form a basis for projection.
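
The sketch below shows how the returned pieces could relate to one another for a feature selection method. It rests on assumptions not stated on this page: that projection is a column-selection matrix built from featidx, and that out-of-sample prediction reuses the training mean under "center" preprocessing. The indices in `featidx` are made up for illustration and are not output from do.procrustes.

```r
X <- as.matrix(iris[, 1:4])
p <- ncol(X)

# Hypothetical selected indices, standing in for a returned `featidx`
featidx <- c(3, 4)

# Assumed form of `projection`: column j is the unit vector
# that picks out feature featidx[j]
projection <- diag(p)[, featidx, drop = FALSE]

# "center" preprocessing: keep the training mean for later reuse
mu <- colMeans(X)
Xc <- sweep(X, 2, mu)

# Embedding: identical to keeping the selected centred columns
Y <- Xc %*% projection

# Out-of-sample points receive the same centering and projection
Xnew <- X[1:5, , drop = FALSE] + 0.1
Ynew <- sweep(Xnew, 2, mu) %*% projection
```

Under these assumptions, multiplying by projection is equivalent to subsetting columns, which is why Y has exactly ndim columns.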

References

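Krzanowski WJ (1987). "Selection of Variables to Preserve Multivariate Data Structure, Using Principal Components." Applied Statistics, 36(1), 22.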

Examples

# NOT RUN {
## use iris data
## it is known that features 3 and 4 are more important.
data(iris)
iris.dat = as.matrix(iris[,1:4])
iris.lab = as.factor(iris[,5])

## try different strategies
out1 = do.procrustes(iris.dat, cor=TRUE)
out2 = do.procrustes(iris.dat, cor=FALSE)
out3 = do.mifs(iris.dat, iris.lab, beta=0)

## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1, 3))
plot(out1$Y, pch=19, col=iris.lab, main="PCA with Correlation")
plot(out2$Y, pch=19, col=iris.lab, main="PCA with Covariance")
plot(out3$Y, pch=19, col=iris.lab, main="MIFS")
par(opar)
# }