Internal function to conduct k-fold cross-validation for irsvm
cv.irsvm_fit(x, y, weights, cfun="ccave", s=c(1, 5), type=NULL,
kernel="radial", gamma=2^(-4:10), cost=2^(-4:4),
epsilon=0.1, balance=TRUE, nfolds=10, foldid,
trim_ratio=0.9, n.cores=2, ...)
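A minimal usage sketch of the signature above, assuming the package providing irsvm (e.g. mpath) is installed; the simulated data, reduced parameter grids, and the accessed component name cost are illustrative, not prescribed by this page:

```r
# Hedged sketch: requires the package exporting cv.irsvm_fit (e.g. mpath).
library(mpath)

set.seed(1)
x <- matrix(rnorm(100 * 5), 100, 5)           # 100 observations, 5 features
y <- factor(sign(x[, 1] + 0.5 * rnorm(100)))  # factor response -> classification

fit <- cv.irsvm_fit(x, y, weights = rep(1, nrow(x)),
                    cfun = "ccave", s = c(1, 5),
                    kernel = "radial", gamma = 2^(-2:2), cost = 2^(-2:2),
                    nfolds = 5, n.cores = 1)
fit$cost  # cost value with the smallest cross-validated error
```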
an object of class "cv.irsvm"
is returned, which is a
list with the ingredients of the cross-validation fit.
matrix whose rows, for kernel="linear", hold the values s, cost, error, k, where k is the number of cross-validation folds. For nonlinear kernels, the rows hold s, gamma, cost, error, k.
the value of cost that gives the minimum cross-validated error in irsvm.
the value of gamma that gives the minimum cross-validated error in irsvm.
the value of s for cfun that gives the minimum cross-validated error in irsvm.
a data matrix, a vector, or a sparse 'design matrix' (object of class
Matrix
provided by the Matrix package,
or of class matrix.csr
provided by the SparseM package, or of class
simple_triplet_matrix
provided by the slam
package).
a response vector with one label for each row/component of
x
. Can be either a factor (for classification tasks)
or a numeric vector (for regression).
the weight of each subject. It should be of the same length as y.
character, type of convex cap (concave) function.
Valid options are:
"hcave"
"acave"
"bcave"
"ccave"
"dcave"
"ecave"
"gcave"
"tcave"
tuning parameter of cfun. s > 0, except that s can be equal to 0 for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave" or "ccave", the calculated weights can become 0 for all observations, thus crashing the program.
irsvm
can be used as a classification
machine, or as a regression machine.
Depending on whether y
is
a factor or not, the default setting for type
is C-classification
or eps-regression
, respectively, but may be overwritten by setting an explicit value.
Valid options are:
C-classification
nu-classification
eps-regression
nu-regression
the kernel used in training and predicting. You
might consider changing some of the following parameters, depending
on the kernel type.
linear: \(u'v\)
polynomial: \((\gamma u'v + coef0)^{degree}\)
radial basis: \(e^{-\gamma |u-v|^2}\)
sigmoid: \(\tanh(\gamma u'v + coef0)\)
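The kernel formulas above can be written out directly in base R; the vectors u, v and the value gamma = 0.5 below are illustrative:

```r
u <- c(1, 0, 2)
v <- c(0, 1, 2)
gamma <- 0.5

k_linear <- sum(u * v)                    # u'v = 4
k_radial <- exp(-gamma * sum((u - v)^2))  # e^{-gamma |u-v|^2} = exp(-1) ~ 0.368
```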
cost of constraints violation (default: 1)---it is the
‘C’-constant of the regularization term in the Lagrange formulation. This is proportional to the inverse of lambda
in irglmreg
.
epsilon in the insensitive-loss function (default: 0.1)
for type="C-classification", "nu-classification"
only
number of folds (must be >= 3); default is 10
an optional vector of values between 1 and nfolds
identifying what fold each observation is in. If supplied,
nfolds will be ignored.
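One common way to build such a foldid vector in base R (a sketch; any assignment of observations to folds 1..nfolds works) is:

```r
n <- 25       # number of observations (rows of x)
nfolds <- 5
set.seed(2)
# Random, nearly balanced fold assignment: each observation gets a fold label
foldid <- sample(rep(seq_len(nfolds), length.out = n))
table(foldid)  # 5 observations per fold here
```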
a number between 0 and 1 for trimmed least squares, useful if type="eps-regression"
or "nu-regression"
.
The number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.
Other arguments that can be passed to irsvm
.
Zhu Wang <zwang145@uthsc.edu>
This function is the driving force behind cv.irsvm
. It performs k-fold cross-validation to determine the optimal tuning parameters in the SVM: cost
, and gamma
if kernel
is nonlinear. It can also choose s
used in cfun
.
Zhu Wang (2024) Unified Robust Estimation, Australian & New Zealand Journal of Statistics. 66(1):77-102.
cv.irsvm
and irsvm