Internal function to conduct k-fold cross-validation for irsvm
cv.irsvm_fit(x, y, weights, cfun="ccave", s=c(1, 5), type=NULL,
kernel="radial", gamma=2^(-4:10), cost=2^(-4:4),
epsilon=0.1, balance=TRUE, nfolds=10, foldid,
trim_ratio=0.9, n.cores=2, ...)
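A minimal usage sketch of the signature above, assuming the package providing irsvm (e.g. mpath) is installed; the simulated data, reduced parameter grids, and the accessed component name cost are illustrative, not prescribed by this page:

```r
# Hedged sketch: requires the package exporting cv.irsvm_fit (e.g. mpath).
library(mpath)

set.seed(1)
x <- matrix(rnorm(100 * 5), 100, 5)           # 100 observations, 5 features
y <- factor(sign(x[, 1] + 0.5 * rnorm(100)))  # factor response -> classification

fit <- cv.irsvm_fit(x, y, weights = rep(1, nrow(x)),
                    cfun = "ccave", s = c(1, 5),
                    kernel = "radial", gamma = 2^(-2:2), cost = 2^(-2:2),
                    nfolds = 5, n.cores = 1)
fit$cost  # cost value with the smallest cross-validated error
```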
an object of class "cv.irsvm"
is returned, which is a
list with the ingredients of the cross-validation fit.
matrix whose rows, for kernel="linear", hold the values s, cost, error, k, where k is the number of cross-validation folds. For nonlinear kernels, the rows hold s, gamma, cost, error, k.
the value of cost that gives the minimum cross-validated error in irsvm.
the value of gamma that gives the minimum cross-validated error in irsvm.
the value of s for cfun that gives the minimum cross-validated error in irsvm.
a data matrix, a vector, or a sparse 'design matrix' (object of class
Matrix
provided by the Matrix package,
or of class matrix.csr
provided by the SparseM package, or of class
simple_triplet_matrix
provided by the slam
package).
a response vector with one label for each row/component of
x
. Can be either a factor (for classification tasks)
or a numeric vector (for regression).
the weight of each subject. It should be of the same length as y.
character, type of convex cap (concave) function.
Valid options are:
"hcave"
"acave"
"bcave"
"ccave"
"dcave"
"ecave"
"gcave"
"tcave"
tuning parameter of cfun. s > 0, except that s can be equal to 0 for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave" or "ccave", the calculated weights can become 0 for all observations, thus crashing the program.
irsvm
can be used as a classification
machine, or as a regression machine.
Depending on whether y
is
a factor or not, the default setting for type
is C-classification
or eps-regression
, respectively, but may be overwritten by setting an explicit value.
Valid options are:
C-classification
nu-classification
eps-regression
nu-regression
the kernel used in training and predicting. You
might consider changing some of the following parameters, depending
on the kernel type.
linear: \(u'v\)
polynomial: \((\gamma u'v + coef0)^{degree}\)
radial basis: \(e^{-\gamma |u-v|^2}\)
sigmoid: \(\tanh(\gamma u'v + coef0)\)
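The kernel formulas above can be written out directly in base R; the vectors u, v and the value gamma = 0.5 below are illustrative:

```r
u <- c(1, 0, 2)
v <- c(0, 1, 2)
gamma <- 0.5

k_linear <- sum(u * v)                    # u'v = 4
k_radial <- exp(-gamma * sum((u - v)^2))  # e^{-gamma |u-v|^2} = exp(-1) ~ 0.368
```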
cost of constraints violation (default: 1)---it is the
‘C’-constant of the regularization term in the Lagrange formulation. This is proportional to the inverse of lambda
in irglmreg
.
epsilon in the insensitive-loss function (default: 0.1)
for type="C-classification", "nu-classification"
only
number of folds (must be >= 3); default is 10
an optional vector of values between 1 and nfolds
identifying what fold each observation is in. If supplied,
nfolds will be ignored.
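One common way to build such a foldid vector in base R (a sketch; any assignment of observations to folds 1..nfolds works) is:

```r
n <- 25       # number of observations (rows of x)
nfolds <- 5
set.seed(2)
# Random, nearly balanced fold assignment: each observation gets a fold label
foldid <- sample(rep(seq_len(nfolds), length.out = n))
table(foldid)  # 5 observations per fold here
```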
a number between 0 and 1 for trimmed least squares, useful if type="eps-regression"
or "nu-regression"
.
The number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
Other arguments that can be passed to irsvm
.
Zhu Wang <zwang145@uthsc.edu>
This function is the driving force behind cv.irsvm
. It performs k-fold cross-validation to determine the optimal tuning parameters in the SVM: cost
, and gamma
if kernel
is nonlinear. It can also choose s
used in cfun
.
Zhu Wang (2024) Unified Robust Estimation, Australian & New Zealand Journal of Statistics. 66(1):77-102.
cv.irsvm
and irsvm