
msgl (version 0.1.3)

msgl.cv: Multinomial sparse group lasso cross validation using multiple processors

Description

Multinomial sparse group lasso cross validation using multiple processors

Usage

msgl.cv(x, classes, sampleWeights = NULL,
    grouping = NULL, groupWeights = NULL,
    parameterWeights = NULL, alpha = 0.5,
    standardize = TRUE, lambda, fold = 10L,
    cv.indices = list(), sparse.data = FALSE,
    max.threads = 2L, seed = 331L,
    algorithm.config = sgl.standard.config)

Arguments

x
design matrix, matrix of size $N \times p$.
classes
classes, factor of length $N$.
sampleWeights
sample weights, a vector of length $N$.
grouping
grouping of covariates, a vector of length $p$. Each element of the vector specifies the group of the corresponding covariate.
groupWeights
the group weights, a vector of length $m+1$ (the number of groups). The first element of the vector is the intercept weight. If groupWeights = NULL default weights will be used. Default weights are 0 for the intercept and $$\sqrt{K\cd
parameterWeights
a matrix of size $K \times (p+1)$. The first column of the matrix is the intercept weights. Default weights are 0 for the intercept weights and 1 for all other weights.
alpha
the $\alpha$ value: 0 gives the group lasso penalty, 1 gives the lasso penalty, and a value between 0 and 1 gives a sparse group lasso penalty.
standardize
if TRUE the covariates are standardized before fitting the model. The model parameters are returned in the original scale.
lambda
the lambda sequence for the regularization path.
fold
the fold of the cross validation, an integer larger than $1$ and less than $N+1$. Ignored if cv.indices != NULL. If fold$\le$max(table(classes)) then the data will be split into fold disjoint sub
cv.indices
a list of indices of a cross validation splitting. If cv.indices = NULL then a random splitting will be generated using the fold argument.
sparse.data
if TRUE, x will be treated as sparse. If x is a sparse matrix it will be treated as sparse by default.
max.threads
the maximal number of threads to be used
seed
the seed used for generating the random cross validation splitting, only used if fold$\le$max(table(classes)).
algorithm.config
the algorithm configuration to be used.
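
The cv.indices argument takes a list describing a cross validation splitting. As a rough illustration of what such a list could look like, the base R sketch below builds stratified folds by hand. The helper make_stratified_folds is hypothetical and not part of msgl, and it assumes that each list element holds the 1-based sample indices of one fold; check the package documentation for the exact format msgl.cv expects before passing such a list in.

```r
# Hypothetical helper (not part of msgl): build stratified CV folds in base R.
# Assumption: each list element is the set of 1-based sample indices of one fold.
make_stratified_folds <- function(classes, fold = 10L, seed = 331L) {
  set.seed(seed)  # note: this resets the global RNG state
  fold_id <- integer(length(classes))
  for (cl in levels(classes)) {
    idx <- which(classes == cl)
    # Shuffle the samples of this class, then deal them round-robin into folds
    idx <- idx[sample.int(length(idx))]
    fold_id[idx] <- rep_len(seq_len(fold), length(idx))
  }
  # One integer vector of sample indices per fold
  unname(split(seq_along(classes), factor(fold_id, levels = seq_len(fold))))
}

# Toy labels with unbalanced classes
classes <- factor(rep(c("a", "b", "c"), times = c(20, 30, 50)))
folds <- make_stratified_folds(classes, fold = 5L)
sapply(folds, length)                  # fold sizes
table(classes[folds[[1]]])             # class composition of the first fold
```

Dealing each class out separately keeps the class proportions roughly equal across folds, which matters when some classes are small.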

Value

  • link: the linear predictors, a list of length length(lambda), one item for each lambda value, with each item a matrix of size $K \times N$ containing the linear predictors.
  • response: the estimated probabilities, a list of length length(lambda), one item for each lambda value, with each item a matrix of size $K \times N$ containing the probabilities.
  • classes: the estimated classes, a matrix of size $N \times d$ with $d=$length(lambda).
  • cv.indices: the cross validation splitting used.
  • features: average number of features used in the models.
  • parameters: average number of parameters used in the models.
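
To make the relationship between the link, response and classes components more concrete, the sketch below works on a small synthetic $K \times N$ matrix rather than on real msgl.cv output. It assumes the usual multinomial model, in which class probabilities are the column-wise softmax of the linear predictors and the estimated class is the one with the largest probability. This page does not state how msgl computes these quantities, so treat the mapping as an assumption. The names softmax_cols and predicted are illustrative only.

```r
# Synthetic stand-in for one element of `link`: K = 3 classes, N = 4 samples.
# The matrix is filled column by column, so each row of numbers below is one sample.
link <- matrix(c( 2.0, 0.5, -1.0,
                  0.1, 0.2,  0.3,
                 -1.0, 3.0,  0.0,
                  0.0, 0.0,  1.5),
               nrow = 3, dimnames = list(c("a", "b", "c"), NULL))

# Column-wise softmax (assumed link-to-probability map);
# subtracting the column maximum avoids overflow in exp().
softmax_cols <- function(eta) {
  z <- exp(sweep(eta, 2, apply(eta, 2, max)))
  sweep(z, 2, colSums(z), "/")
}

response <- softmax_cols(link)   # K x N matrix, each column sums to 1
predicted <- rownames(response)[apply(response, 2, which.max)]
predicted
```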

Examples

data(SimData)
x <- sim.data$x
classes <- sim.data$classes
lambda <- msgl.lambda.seq(x, classes, alpha = .5, d = 25L, lambda.min = 0.03)
fit.cv <- msgl.cv(x, classes, alpha = .5, lambda = lambda)

# Misclassification count
colSums(fit.cv$classes != classes)
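
The misclassification counts can be turned into error rates and used to pick a position in the lambda sequence. The sketch below shows the idea on synthetic stand-ins, so that it runs without msgl installed: truth and pred are hypothetical and play the roles of classes and fit.cv$classes (an $N \times d$ matrix with one column per lambda value).

```r
# Synthetic stand-ins (hypothetical): true labels and an N x d prediction matrix
truth <- factor(c("a", "a", "b", "b", "c", "c"))
pred <- cbind(
  c("a", "a", "a", "a", "a", "a"),  # e.g. a heavily penalized model
  c("a", "a", "b", "b", "a", "c"),
  c("a", "a", "b", "b", "c", "c"),
  c("a", "b", "b", "b", "c", "c")
)

# Misclassification rate per lambda value (one column per lambda);
# the label vector is recycled down each column of pred.
err <- colMeans(pred != as.character(truth))

# Index of the first lambda value attaining the smallest error rate
best <- which.min(err)
err
best
```

With real output, the selected index would be used as lambda[best]. Ties are resolved in favor of the earliest position in the sequence, which is a design choice of which.min rather than anything prescribed by msgl.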
