Description:

This function computes the optimal ridge regression model based on cross-validation.
Usage:

ridge.cv(X, y, lambda = NULL, scale = TRUE, k = 10, plot.it = FALSE,
         groups = NULL, method.cor = "pearson", compute.jackknife = TRUE)
Value:

cv.error.matrix: matrix of cross-validated errors based on mean squared error. A row corresponds to one cross-validation split.
cv.error: vector of cross-validated errors based on mean squared error.
lambda.opt: optimal value of lambda, based on mean squared error.
intercept: intercept of the optimal model, based on mean squared error.
coefficients: vector of regression coefficients of the optimal model, based on mean squared error.
cor.error.matrix: matrix of cross-validated errors based on correlation. A row corresponds to one cross-validation split.
cor.error: vector of cross-validated errors based on correlation.
lambda.opt.cor: optimal value of lambda, based on correlation.
intercept.cor: intercept of the optimal model, based on correlation.
coefficients.cor: vector of regression coefficients of the optimal model, based on correlation.
coefficients.jackknife: array of the regression coefficients on each of the cross-validation splits. The dimension is ncol(X) x length(lambda) x k.
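As a minimal sketch of how these components fit together (assuming ridge.cv is available on the search path; the data are simulated and the component names are those listed above):

set.seed(1)
X <- matrix(rnorm(100 * 10), ncol = 10)
y <- rnorm(100)
fit <- ridge.cv(X, y)

fit$lambda.opt        # penalty selected by mean squared error
fit$lambda.opt.cor    # penalty selected by correlation

# Predictions of the MSE-optimal model on new data, assuming (as is
# typical for such wrappers) that intercept and coefficients are
# returned on the original scale of X.
Xnew <- matrix(rnorm(5 * 10), ncol = 10)
yhat <- fit$intercept + Xnew %*% fit$coefficients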
Arguments:

X: matrix of input observations. The rows of X contain the samples, the columns of X contain the observed variables.
y: vector of responses. The length of y must equal the number of rows of X.
lambda: vector of penalty terms.
scale: Scale the columns of X? Default is scale=TRUE.
k: Number of splits in k-fold cross-validation. Default is k=10.
plot.it: Plot the cross-validation error as a function of lambda? Default is FALSE.
groups: an optional vector with the same length as y. It encodes a partitioning of the data into distinct subgroups. If groups is provided, k is ignored and cross-validation is performed based on this partitioning instead. Default is NULL. A short sketch follows this list.
method.cor: How should the correlation to the response be computed? Default is "pearson".
compute.jackknife: Logical. If TRUE, the regression coefficients on each of the cross-validation splits are stored. Default is TRUE.
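To make the groups and lambda arguments concrete, a short sketch (the penalty grid and the partition are invented for illustration):

set.seed(2)
X <- matrix(rnorm(60 * 8), ncol = 8)
y <- rnorm(60)

# log-spaced candidate penalties
lambda.grid <- exp(seq(log(1e-3), log(1e3), length.out = 50))

# a predefined partition into 5 subgroups; with groups supplied,
# k is ignored and the folds follow this partition
groups <- rep(1:5, length.out = nrow(X))

fit <- ridge.cv(X, y, lambda = lambda.grid, groups = groups)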
Author(s):

Nicole Kraemer
Details:

Based on the regression coefficients stored in coefficients.jackknife, computed on the cross-validation splits, we can estimate their mean and their variance using the jackknife. We remark that under a fixed design and the assumption of normally distributed y-values, the true distribution of the regression coefficients can also be derived.
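A sketch of this jackknife computation from the stored array (the (k - 1)/k factor is the standard jackknife variance scaling; treating the k-fold splits as jackknife replicates is an assumption of this illustration):

set.seed(3)
X <- matrix(rnorm(100 * 5), ncol = 5)
y <- rnorm(100)
lambda.grid <- c(0.1, 1, 10)
k <- 10
fit <- ridge.cv(X, y, lambda = lambda.grid, k = k)

# coefficients.jackknife has dimension ncol(X) x length(lambda) x k;
# take the slice for the second penalty value
B <- fit$coefficients.jackknife[, 2, ]                # ncol(X) x k matrix
theta.bar <- rowMeans(B)                              # jackknife mean
jack.var <- (k - 1) / k * rowSums((B - theta.bar)^2)  # jackknife variance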
See Also:

pls.cv, pcr.cv, benchmark.regression
Examples:

n <- 100   # number of observations
p <- 60    # number of variables
X <- matrix(rnorm(n * p), ncol = p)   # simulated design matrix
y <- rnorm(n)                         # simulated response
ridge.object <- ridge.cv(X, y)        # cross-validated ridge fit
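Building on this example, one might inspect the error curve directly; this assumes cv.error is aligned with the supplied lambda grid, as the value section above suggests:

lambda.grid <- exp(seq(log(1e-2), log(1e2), length.out = 30))
fit <- ridge.cv(X, y, lambda = lambda.grid)
plot(lambda.grid, fit$cv.error, type = "l", log = "x",
     xlab = "lambda", ylab = "cross-validated MSE")
abline(v = fit$lambda.opt, lty = 2)   # penalty chosen by MSE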