
KERE (version 1.0.0)

cv.KERE: Cross-validation for KERE

Description

Does k-fold cross-validation for KERE, produces a plot, and returns a value for lambda.

Usage

cv.KERE(x, y, kern, lambda = NULL, nfolds = 5, foldid, omega = 0.5, ...)

Arguments

x
matrix of predictors, of dimension $N \times p$; each row is an observation vector.
y
response variable.
kern
the kernel to use. The kern parameter can be set to any function of class kernel that computes the inner product in feature space between two vector arguments. KERE provides the most popular kernel functions, which can be initialized with the following functions:
  • rbfdot Radial Basis kernel function,
  • polydot Polynomial kernel function,
  • vanilladot Linear kernel function,
  • tanhdot Hyperbolic tangent kernel function,
  • laplacedot Laplacian kernel function,
  • besseldot Bessel kernel function,
  • anovadot ANOVA RBF kernel function,
  • splinedot the Spline kernel.

Kernel objects are created by calling rbfdot, polydot, tanhdot, vanilladot, anovadot, besseldot, laplacedot, splinedot, and so on (see the example).
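
To make "inner product in feature space" concrete, the following base-R sketch evaluates a radial basis kernel for two vectors. It assumes the kernlab-style parameterization $k(x, y) = \exp(-\sigma \|x - y\|^2)$; the helper name rbf_sketch is hypothetical and is not part of KERE.

```r
# Hypothetical helper, not a KERE function: evaluates a Gaussian (RBF)
# kernel under the assumed form exp(-sigma * ||x - y||^2).
rbf_sketch <- function(x, y, sigma = 0.1) {
  exp(-sigma * sum((x - y)^2))
}

x <- c(1, 2, 3)
y <- c(1, 2, 5)

k_xy <- rbf_sketch(x, y)  # squared distance is 4, so this equals exp(-0.4)
k_xx <- rbf_sketch(x, x)  # identical vectors give a kernel value of 1
```

In practice the kernel object returned by rbfdot(sigma = 0.1) would be passed as kern; the sketch only illustrates the kind of quantity such a kernel computes.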

lambda
a user-supplied lambda sequence. A decreasing sequence of lambda values is preferred; if the sequence is not decreasing, the program automatically sorts it in decreasing order.
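
As a small sketch of what a suitable lambda argument can look like, the base-R code below sorts an unordered set of values into decreasing order and builds a log-spaced decreasing grid. The variable names and the specific values are illustrative assumptions, not defaults of KERE.

```r
# Illustrative values only; not KERE defaults.
lambda_user <- c(0.01, 0.5, 0.1)

# Sort into decreasing order before passing as `lambda`.
lambda_sorted <- sort(lambda_user, decreasing = TRUE)

# A log-spaced grid from 0.5 down to 0.01 is decreasing by construction.
lambda_grid <- exp(seq(log(0.5), log(0.01), length.out = 10))
```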
nfolds
number of folds - default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3.
foldid
an optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing.
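
One common way to build such a vector by hand is to repeat the fold labels and shuffle them, which gives balanced folds. This is a base-R sketch; the seed and sizes are arbitrary assumptions, and it is not necessarily how cv.KERE assigns folds internally.

```r
set.seed(1)  # arbitrary seed, only so the assignment is reproducible

N <- 200     # number of observations (rows of x)
nfolds <- 5

# Repeat the labels 1..nfolds to length N, then shuffle them.
foldid <- sample(rep(seq_len(nfolds), length.out = N))

table(foldid)  # each of the 5 folds holds 40 observations
```

The resulting vector could then be supplied as foldid = foldid, so that repeated runs compare lambda values on identical splits.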
omega
the parameter $\omega$ in the expectile regression model. The value must be in (0,1). Default is 0.5.
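
To illustrate the role of omega, here is a base-R sketch of the standard asymmetric squared (expectile) loss, which weights positive residuals by omega and negative residuals by 1 - omega. The helper expectile_loss is hypothetical; whether KERE scales the loss by an additional constant is not stated here.

```r
# Hypothetical helper: standard asymmetric squared (expectile) loss.
# Residuals r >= 0 get weight omega, residuals r < 0 get weight 1 - omega.
expectile_loss <- function(r, omega = 0.5) {
  stopifnot(omega > 0, omega < 1)
  ifelse(r < 0, 1 - omega, omega) * r^2
}

r <- c(-2, 2)
sym  <- expectile_loss(r, omega = 0.5)  # both residuals are weighted equally
asym <- expectile_loss(r, omega = 0.9)  # the positive residual is penalized more
```

With omega = 0.5 the loss is symmetric and proportional to squared error, so the fit targets the conditional mean; values above or below 0.5 shift the target toward upper or lower expectiles.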
...
other arguments that can be passed to KERE.

Value

an object of class cv.KERE is returned, which is a list with the ingredients of the cross-validation fit.
lambda
the values of lambda used in the fits.
cvm
the mean cross-validated error - a vector of length length(lambda).
cvsd
estimate of standard error of cvm.
cvupper
upper curve = cvm+cvsd.
cvlo
lower curve = cvm-cvsd.
name
a character string "Expectile Loss"
lambda.min
the optimal value of lambda that gives minimum cross validation error cvm.
cvm.min
the minimum cross validation error cvm.

Details

The function runs KERE nfolds+1 times; the first to get the lambda sequence, and then the remainder to compute the fit with each of the folds omitted. The average error and standard deviation over the folds are computed.
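
The fold-by-fold procedure described above can be sketched in base R. This sketch uses ridge regression with squared error as a stand-in fitter, not the KERE algorithm or the expectile loss; the data, seed, and helper ridge_fit are illustrative assumptions. Dividing the across-fold standard deviation by sqrt(nfolds) to get cvsd is also an assumed convention.

```r
set.seed(1)  # arbitrary seed for reproducibility
N <- 60
nfolds <- 5
x <- cbind(runif(N), runif(N))
y <- x[, 1] + 2 * x[, 2] + rnorm(N, sd = 0.1)
lambda <- c(1, 0.1, 0.01)  # decreasing sequence
foldid <- sample(rep(seq_len(nfolds), length.out = N))

# Stand-in fitter (ridge regression with intercept), NOT the KERE algorithm.
ridge_fit <- function(x, y, lam) {
  xi <- cbind(1, x)
  solve(crossprod(xi) + lam * diag(ncol(xi)), crossprod(xi, y))
}

# One row per fold, one column per lambda: mean squared error on held-out data.
fold_err <- matrix(NA_real_, nfolds, length(lambda))
for (k in seq_len(nfolds)) {
  test <- foldid == k
  for (j in seq_along(lambda)) {
    beta <- ridge_fit(x[!test, , drop = FALSE], y[!test], lambda[j])
    pred <- cbind(1, x[test, , drop = FALSE]) %*% beta
    fold_err[k, j] <- mean((y[test] - pred)^2)
  }
}

# Summaries analogous to the components of the returned object.
cvm <- colMeans(fold_err)                       # mean error per lambda
cvsd <- apply(fold_err, 2, sd) / sqrt(nfolds)   # assumed standard-error formula
cvupper <- cvm + cvsd
cvlo <- cvm - cvsd
lambda.min <- lambda[which.min(cvm)]
cvm.min <- min(cvm)
```

The sketch fits each fold once per lambda value for simplicity; a path-based fitter would typically compute the whole lambda sequence in a single call per fold.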

References

Y. Yang, T. Zhang, and H. Zou. "Flexible Expectile Regression in Reproducing Kernel Hilbert Space." ArXiv e-prints: stat.ME/1508.05987, August 2015.

Examples


library(KERE)

# simulate data with heteroscedastic noise
N <- 200
X1 <- runif(N)
X2 <- 2 * runif(N)
X3 <- 3 * runif(N)
SNR <- 10 # signal-to-noise ratio
Y <- X1^1.5 + 2 * X2^0.5 + X1 * X3
sigma <- sqrt(var(Y) / SNR)
Y <- Y + X2 * rnorm(N, 0, sigma)
X <- cbind(X1, X2, X3)

# set Gaussian kernel
kern <- rbfdot(sigma = 0.1)

# define a decreasing, log-spaced lambda sequence
lambda <- exp(seq(log(0.5), log(0.01), length.out = 10))

# run 5-fold cross-validation
cv.KERE(x = X, y = Y, kern, lambda = lambda, nfolds = 5, omega = 0.5)
