rdetools (version 1.0)

rde_loocv: Relevant Dimension Estimation (RDE) by Leave-One-Out Cross-Validation (LOO-CV)

Description

The function estimates the relevant dimension in feature space by leave-one-out cross-validation. It can also compute a denoised version of the labels and estimate the noise level in the data set.

Usage

rde_loocv(K, y, est_y = FALSE, alldim = FALSE, est_noise = FALSE, regression = FALSE, nmse = TRUE, dim_rest = 0.5)

Arguments

K
kernel matrix of the inputs (e.g. rbf kernel matrix)
y
label vector which contains the label for each data point
est_y
set this to TRUE if you want a denoised version of the labels
alldim
if this is TRUE, denoised labels are calculated for all dimensions (instead of only for the relevant dimension)
est_noise
set this to TRUE if you want an estimated noise level
regression
only relevant if one of est_y, alldim, est_noise is TRUE. Set this to TRUE to force the function to treat the data as a regression problem. If you leave this FALSE, the function tries to determine by itself whether this is a classification or a regression problem.
nmse
only relevant if est_noise is TRUE and the function is treating the data as a regression problem. If you leave this TRUE, the normalized mean squared error is used to estimate the noise level; otherwise the conventional mean squared error is used.
dim_rest
percentage of leading dimensions (given as a fraction, e.g. 0.5 for 50%) to which the search for the relevant dimension should be restricted. This is needed due to numerical instabilities. 0.5 should be a good choice in most cases (and is also the default value)

Value

rd
estimated relevant dimension
err
LOO-CV error for each dimension (the position of the minimum is the relevant dimension)
yh
only returned if est_y, alldim or est_noise is TRUE, contains the denoised labels
Yh
only returned if alldim is TRUE, matrix with denoised labels for each dimension in each column
noise
only returned if est_noise is TRUE, contains the estimated noise level
kpc
kernel PCA coefficients
eigvec
eigenvectors of the kernel matrix
eigval
eigenvalues of the kernel matrix
tcm
always FALSE; used to tell other functions that the LOO-CV method was used
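
The relationship between err and rd can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the kernel width, the rounding used for dim_rest, and the closed-form leave-one-out shortcut (which treats the eigenvectors of K as fixed regression features) are all assumptions made for illustration.

```r
set.seed(1)

# Toy data: noisy sinc function, generated by hand so no packages are needed
n <- 100
x <- sort(runif(n, -4, 4))
sinc <- function(t) ifelse(t == 0, 1, sin(pi * t) / (pi * t))
y <- sinc(x) + rnorm(n, sd = 0.1)

# RBF kernel matrix (the kernel width of 1 is an arbitrary illustrative choice)
K <- exp(-as.matrix(dist(x))^2)

# Eigendecomposition of the symmetric kernel matrix (eigenvalues in decreasing order)
e <- eigen(K, symmetric = TRUE)
U <- e$vectors

# Kernel PCA coefficients of the labels
kpc <- drop(crossprod(U, y))

# Restrict the search to the leading dimensions (floor() is an assumed rounding)
dim_rest <- 0.5
max_dim <- floor(dim_rest * n)

# LOO-CV error for each candidate dimension d, via the linear-smoother shortcut
# residual_i / (1 - H_ii) with H = U_d U_d^T
err <- numeric(max_dim)
fit <- numeric(n)  # running projection U_d U_d^T y
lev <- numeric(n)  # running diagonal of U_d U_d^T
for (d in seq_len(max_dim)) {
  fit <- fit + U[, d] * kpc[d]
  lev <- lev + U[, d]^2
  err[d] <- mean(((y - fit) / (1 - lev))^2)
}

# The relevant dimension is the position of the minimum error
rd <- which.min(err)
```

In this sketch, err and rd play the same roles as the components of the same name in the returned list: rd is simply the index at which err is smallest.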

Details

If est_noise or alldim is TRUE, a denoised version of the labels for the relevant dimension is returned even if est_y is FALSE (so, for example, if you want both denoised labels and a noise estimate, it is enough to set est_noise to TRUE).
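
One plausible reason the denoised labels come along with the noise estimate is that the noise level is naturally computed from the residual between y and the denoised labels. The base R sketch below illustrates this under stated assumptions: the toy data, the fixed value of rd, and the particular definition of the normalized mean squared error are hypothetical and are not taken from the package source.

```r
set.seed(1)

# Toy regression data and RBF kernel matrix (illustrative choices only)
n <- 100
x <- sort(runif(n, -4, 4))
y <- sin(x) + rnorm(n, sd = 0.1)
K <- exp(-as.matrix(dist(x))^2)
U <- eigen(K, symmetric = TRUE)$vectors

# Hypothetical relevant dimension; in practice this would be the rd component
rd <- 8

# Denoised labels: project y onto the leading rd eigenvectors of K
U_rd <- U[, seq_len(rd), drop = FALSE]
yh <- drop(U_rd %*% crossprod(U_rd, y))

# Noise estimates computed from the residual
mse <- mean((y - yh)^2)                          # conventional MSE
nmse <- sum((y - yh)^2) / sum((y - mean(y))^2)   # one common normalization (assumed)
```

Here yh corresponds to the denoised labels for a single dimension; repeating the projection for every candidate dimension and binding the results column-wise would give a matrix analogous to Yh.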

References

M. L. Braun, J. M. Buhmann, K. R. Mueller (2008) On Relevant Dimensions in Kernel Feature Spaces

See Also

rde, rde_tcm, estnoise, isregression, rbfkernel, polykernel, drawkpc

Examples

## example with sinc data
library(rdetools) # load the package that provides the functions below
d <- sincdata(100, 0.1) # generate sinc data
K <- rbfkernel(d$X) # calculate rbf kernel matrix
# run RDE; also return denoised labels and the noise estimate
r <- rde_loocv(K, d$y, est_y = TRUE, est_noise = TRUE)
r$rd # estimated relevant dimension
r$noise # estimated noise
drawkpc(r) # draw kernel pca coefficients