tdmPrePCA.train: PCA (Principal Component Analysis) for numeric columns in a data frame.

Description

tdmPrePCA.train performs either linear PCA, based on prcomp (which uses SVD), or kernel PCA (one of KPCA, KHA or KFA).
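As a minimal sketch of the linear case, the following calls prcomp directly on the numeric columns of a toy data frame. The data frame and the variable name numericV are made up for illustration, and prcomp is called with its defaults (centering, no scaling); whether tdmPrePCA.train uses the same settings is an assumption.

```r
# Toy data frame: two numeric columns plus one non-numeric column
dset <- data.frame(a   = c(1, 2, 3, 4, 5),
                   b   = c(2.1, 3.9, 6.2, 8.0, 9.9),
                   cls = factor(c("x", "y", "x", "y", "x")))

# Columns that take part in the PCA (hypothetical stand-in for PRE.PCA.numericV)
numericV <- c("a", "b")

# Linear PCA via SVD; prcomp centers the columns by default
pc <- prcomp(dset[, numericV])

# pc$x holds the scores, one column per principal component (PC1, PC2, ...)
scores <- pc$x
print(colnames(scores))
```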

Usage

tdmPrePCA.train(dset, opts)

Arguments

dset

the data frame with training (and test) data.

opts

a list of options, of which the following entries are used here:

PRE.PCA: ["linear" | "kernel" | "none"]
PRE.knum: if >0 and if PRE.PCA="kernel", take only a subset of PRE.knum records from dset
PRE.PCA.REPLACE: [T] if T, replace the original numeric columns with the PCA columns; if F, add the PCA columns to them
PRE.PCA.npc: if >0, add the monomials of degree 2 for the first PRE.PCA.npc PCs (see tdmPreAddMonomials)
PRE.PCA.numericV: vector with the names of all columns in dset for which PCA is performed. These columns may contain *numeric* values only.
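The options above can be collected in a plain R list. The sketch below builds such a list and uses a hypothetical helper, apply_pca_columns, to illustrate the difference between replacing and adding columns. The helper is not part of TDMR; it only mimics the documented PRE.PCA.REPLACE behavior with prcomp, and the resulting column order is an assumption.

```r
dset <- data.frame(a   = c(1, 2, 3, 4, 5),
                   b   = c(2.1, 3.9, 6.2, 8.0, 9.9),
                   cls = factor(c("x", "y", "x", "y", "x")))

# Option list with the entries described above
opts <- list(PRE.PCA          = "linear",
             PRE.knum         = 0,
             PRE.PCA.REPLACE  = TRUE,
             PRE.PCA.npc      = 0,
             PRE.PCA.numericV = c("a", "b"))

# Hypothetical helper (not a TDMR function): replace or extend the numeric columns
apply_pca_columns <- function(dset, opts) {
  numericV <- opts$PRE.PCA.numericV
  scores <- as.data.frame(prcomp(dset[, numericV, drop = FALSE])$x)
  if (opts$PRE.PCA.REPLACE) {
    # keep only the non-PCA columns, then append the PCs
    other <- dset[, setdiff(names(dset), numericV), drop = FALSE]
    cbind(other, scores)
  } else {
    # keep everything and append the PCs
    cbind(dset, scores)
  }
}

replaced <- apply_pca_columns(dset, opts)

opts$PRE.PCA.REPLACE <- FALSE
extended <- apply_pca_columns(dset, opts)

# The real call would be: pca <- tdmPrePCA.train(dset, opts)
```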

Value

pca, a list with entries:

dset

the input data frame dset with the columns numeric.variables replaced or extended (depending on opts$PRE.PCA.REPLACE) by the PCs, named PC1, PC2, ... (if PRE.PCA=="linear") or KP1, KP2, ... (if PRE.PCA=="kernel"), and optionally with monomial columns added if PRE.PCA.npc>0. The number of PCs is min(nrow(dset),length(numeric.variables)).
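The stated number of PCs can be checked for the linear case with prcomp itself. This sketch uses random toy data with fewer rows than numeric columns, so the row count is the limiting factor.

```r
set.seed(1)

# 3 rows, 5 numeric columns: fewer records than variables
wide <- as.data.frame(matrix(rnorm(3 * 5), nrow = 3))

pc <- prcomp(wide)

# Number of principal components returned by prcomp
n_pc <- ncol(pc$x)
print(n_pc)
print(min(nrow(wide), ncol(wide)))
```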

numeric.variables

the new numeric column names (PCs, monomials, and optionally old numericV, if opts$PRE.PCA.REPLACE==F)
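To illustrate what monomial columns of degree 2 look like, the sketch below forms all products PCi*PCj with i <= j for the first npc PCs. The column names (PC1xPC1, ...) and the toy scores are hypothetical; the names that tdmPreAddMonomials actually generates may differ.

```r
npc <- 2

# Toy PC scores (made-up values)
scores <- data.frame(PC1 = c(1, 2, 3),
                     PC2 = c(0.5, -1, 2))

# All degree-2 monomials of the first npc PCs: squares and cross products
mono <- list()
for (i in seq_len(npc)) {
  for (j in i:npc) {
    nm <- paste0("PC", i, "xPC", j)   # hypothetical naming scheme
    mono[[nm]] <- scores[[i]] * scores[[j]]
  }
}
mono <- as.data.frame(mono)

# npc PCs yield npc * (npc + 1) / 2 monomial columns
print(names(mono))
```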

pcaList

a list with the items sdev, rotation, center, scale and x, as returned from prcomp, plus eigval, the eigenvalues for the PCs
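For linear PCA, prcomp documents sdev as the square roots of the eigenvalues of the covariance (or correlation) matrix. The sketch below derives eigenvalues from sdev on the built-in USArrests data; that eigval in pcaList is computed in exactly this way is an assumption.

```r
# Linear PCA on a built-in data set, with scaling to unit variance
pc <- prcomp(USArrests, scale. = TRUE)

# Eigenvalues of the correlation matrix are the squared standard deviations
eigval <- pc$sdev^2

# Fraction of total variance carried by each PC
explained <- eigval / sum(eigval)
print(round(explained, 3))
```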
