tdmPrePCA.train is capable of linear PCA, based on prcomp (which uses SVD), and of kernel PCA (either KPCA, KHA or KFA).
tdmPrePCA.train(dset, opts)
dset: the data frame with training (and test) data.
opts: a list from which we need here the following entries:
PRE.PCA: ["linear" | "kernel" | "none" ]
PRE.knum: if >0 and if PRE.PCA="kernel", take only a subset of PRE.knum records from dset
PRE.PCA.REPLACE: [T] if T, replace the original numerical columns with the PCA columns; if F, add the PCA columns
PRE.PCA.npc: if >0, then add for the first PRE.PCA.npc PCs the monomials of degree 2 (see tdmPreAddMonomials)
PRE.PCA.numericV: vector with all column names in dset for which PCA is performed. These columns may contain *numeric* values only.
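As an illustration of the entries listed above, a minimal sketch of an opts list for linear PCA could look like this. The values and the iris column names are only assumptions for the example, and other opts entries that the function may read are not shown.

```r
# Sketch only: the opts entries documented above, filled with example values.
# The column names come from the built-in iris data and are an assumption.
opts <- list(
  PRE.PCA          = "linear",   # one of "linear", "kernel", "none"
  PRE.knum         = 0,          # only relevant for PRE.PCA = "kernel"
  PRE.PCA.REPLACE  = TRUE,       # TRUE: replace numeric columns by the PCs
  PRE.PCA.npc      = 0,          # 0: do not add degree-2 monomials
  PRE.PCA.numericV = c("Sepal.Length", "Sepal.Width",
                       "Petal.Length", "Petal.Width")
)

# Sanity checks that mirror the constraints stated in the argument list.
stopifnot(opts$PRE.PCA %in% c("linear", "kernel", "none"))
stopifnot(all(opts$PRE.PCA.numericV %in% names(iris)))
stopifnot(all(sapply(iris[, opts$PRE.PCA.numericV], is.numeric)))
```

With such a list, the call would follow the usage line, tdmPrePCA.train(dset, opts), with dset being the data frame that holds these columns.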
pca, a list with entries:
the input data frame dset with columns numeric.variables replaced or extended (depending on opts$PRE.PCA.REPLACE) by the PCs with names PC1, PC2, ... (in case PRE.PCA=="linear") or with names KP1, KP2, ... (in case PRE.PCA=="kernel") and optionally with monomial columns added, if PRE.PCA.npc>0.
The number of PCs is min(nrow(dset),length(numeric.variables)).
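The replace-or-extend behaviour for the linear case can be sketched in base R as follows. This is not the TDMR implementation: apply_linear_pca is a hypothetical helper, and the centering and scaling settings passed to prcomp are assumptions.

```r
# Hypothetical helper (not part of TDMR) that mimics the documented behaviour
# for PRE.PCA = "linear": replace or extend the numeric columns with the PCs.
apply_linear_pca <- function(dset, numericV, replace = TRUE) {
  # Assumption: centered, unscaled PCA (the prcomp defaults).
  pca <- prcomp(dset[, numericV, drop = FALSE], center = TRUE, scale. = FALSE)
  pcs <- as.data.frame(pca$x)            # prcomp names these PC1, PC2, ...
  if (replace) {
    keep <- setdiff(names(dset), numericV)
    out  <- cbind(dset[, keep, drop = FALSE], pcs)
    vars <- names(pcs)
  } else {
    out  <- cbind(dset, pcs)
    vars <- c(names(pcs), numericV)
  }
  list(dset = out, numeric.variables = vars, pca = pca)
}

numericV    <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
res_replace <- apply_linear_pca(iris, numericV, replace = TRUE)
res_extend  <- apply_linear_pca(iris, numericV, replace = FALSE)

names(res_replace$dset)   # Species plus the PC columns
names(res_extend$dset)    # original columns plus the PC columns

# Number of PCs: min(number of rows, number of numeric columns)
ncol(res_replace$pca$x) == min(nrow(iris), length(numericV))
```

Returning the column names alongside the data frame lets later steps pick up the new numeric variables without inspecting the frame again.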
the new numeric column names (PCs, monomials, and optionally old numericV, if opts$PRE.PCA.REPLACE==F)
a list with the items sdev, rotation, center, scale, x as returned from prcomp, plus eigval, the eigenvalues for the PCs
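For the linear case, the relation between the prcomp items and the eigenvalues can be sketched as below. It assumes that eigval corresponds to the squared sdev values, which is the usual relation for prcomp; whether TDMR computes it exactly this way is not confirmed here, and pca_items is a hypothetical name.

```r
# Sketch: eigenvalues of the covariance matrix from a prcomp result.
X   <- as.matrix(iris[, 1:4])
pca <- prcomp(X)                          # centered, unscaled, SVD-based

# Assumption: eigval is the squared standard deviation of each PC.
eigval <- pca$sdev^2

# Cross-check against an explicit eigendecomposition of the covariance matrix.
ev_cov <- eigen(cov(X), symmetric = TRUE)$values
isTRUE(all.equal(eigval, ev_cov))

# Hypothetical list combining the prcomp items with the eigenvalues.
pca_items <- c(unclass(pca), list(eigval = eigval))
names(pca_items)

# Share of variance explained by each PC.
explained <- eigval / sum(eigval)
```

The eigenvalues are handy for deciding how many PCs to keep, for example by looking at the cumulative sum of explained.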