train can be used to tune models by picking the complexity parameters that are associated with the optimal resampling statistics. For a particular model, a grid of tuning parameters (if any) is created and the model is fit to slightly different data sets for each candidate combination of tuning parameters. Across each data set, the performance on the held-out samples is calculated and the mean and standard deviation are summarized for each combination. The combination with the optimal resampling statistic is chosen as the final model and the entire training set is used to fit the final model. A variety of models are currently available. The table below enumerates the models, the values of the method argument, and the complexity parameters used by train.
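For example, the call below is a minimal sketch of tuning a recursive partitioning model over its cp parameter; the caret, rpart and mlbench packages are assumed to be installed, and the Boston housing data are used purely for illustration.

## A minimal sketch: tune cp for an rpart model with bootstrap resampling.
library(caret)
data(BostonHousing, package = "mlbench")

set.seed(1)
rpartTune <- train(medv ~ ., data = BostonHousing,
                   method = "rpart",
                   tuneLength = 5,                  # evaluate 5 candidate cp values
                   trControl = trainControl(method = "boot", number = 25))
rpartTune$results                                   # resampled performance per cp value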
Model | method Value | Package | Tuning Parameter(s)
Generalized linear model | glm | stats | none
  | glmStepAIC | MASS | none
Generalized additive model | gam | mgcv | select, method
  | gamLoess | gam | span, degree
  | gamSpline | gam | df
Recursive partitioning | rpart | rpart | cp
  | rpart2 | rpart | maxdepth
  | ctree | party | mincriterion
  | ctree2 | party | maxdepth
Boosted trees | gbm | gbm | interaction.depth, n.trees, shrinkage
  | blackboost | mboost | maxdepth, mstop
  | ada | ada | maxdepth, iter, nu
  | bstTree | bst | maxdepth, mstop, nu
Boosted regression models | glmboost | mboost | mstop
  | gamboost | mboost | mstop
  | logitBoost | caTools | nIter
  | bstLs | bst | mstop, nu
  | bstSm | bst | mstop, nu
Random forests | rf | randomForest | mtry
  | parRF | randomForest, foreach | mtry
  | cforest | party | mtry
  | Boruta | Boruta | mtry
Bagging | treebag | ipred | None
  | bag | caret | vars
  | logicBag | logicFS | ntrees, nleaves
Other trees | nodeHarvest | nodeHarvest | maxinter, node
  | partDSA | partDSA | cut.off.growth, MPD
Logic regression | logreg | LogicReg | ntrees, treesize
Elastic net (glm) | glmnet | glmnet | alpha, lambda
Neural networks | nnet | nnet | decay, size
  | neuralnet | neuralnet | layer1, layer2, layer3
  | pcaNNet | caret | decay, size
  | avNNet | caret | decay, size, bag
Projection pursuit regression | ppr | stats | nterms
Principal component regression | pcr | pls | ncomp
Independent component regression | icr | caret | n.comp
Partial least squares | pls | pls, caret | ncomp
  | simpls | pls, caret | ncomp
  | widekernelpls | pls, caret | ncomp
Sparse partial least squares | spls | spls, caret | K, eta, kappa
Support vector machines | svmLinear | kernlab | C
  | svmRadial | kernlab | sigma, C
  | svmRadialCost | kernlab | C
  | svmPoly | kernlab | scale, degree, C
Relevance vector machines | rvmLinear | kernlab | none
  | rvmRadial | kernlab | sigma
  | rvmPoly | kernlab | scale, degree
Least squares support vector machines | lssvmRadial | kernlab | sigma
Gaussian processes | gaussprLinear | kernlab | none
  | gaussprRadial | kernlab | sigma
  | gaussprPoly | kernlab | scale, degree
Linear least squares | lm | stats | None
  | lmStepAIC | MASS | None
  | leapForward | leaps | nvmax
  | leapBackward | leaps | nvmax
  | leapSeq | leaps | nvmax
Robust linear regression | rlm | MASS | None
Multivariate adaptive regression splines | earth | earth | degree, nprune
  | gcvEarth | earth | degree
Bagged MARS | bagEarth | caret, earth | degree, nprune
Rule-based regression | M5Rules | RWeka | pruned, smoothed
  | M5 | RWeka | pruned, smoothed, rules
  | cubist | Cubist | committees, neighbors
Penalized linear models | penalized | penalized | lambda1, lambda2
  | ridge | elasticnet | lambda
  | enet | elasticnet | lambda, fraction
  | lars | lars | fraction
  | lars2 | lars | steps
  | enet | elasticnet | fraction
  | foba | foba | lambda, k
Supervised principal components | superpc | superpc | n.components, threshold
Quantile regression forests | qrf | quantregForest | mtry
Quantile regression neural networks | qrnn | qrnn | n.hidden, penalty, bag
Linear discriminant analysis | lda | MASS | None
  | Linda | rrcov | None
Quadratic discriminant analysis | qda | MASS | None
  | QdaCov | rrcov | None
Stabilized linear discriminant analysis | slda | ipred | None
Heteroscedastic discriminant analysis | hda | hda | newdim, lambda, gamma
Stepwise discriminant analysis | stepLDA | klaR | maxvar, direction
  | stepQDA | klaR | maxvar, direction
Stepwise diagonal discriminant analysis | sddaLDA | SDDA | None
  | sddaQDA | SDDA | None
Shrinkage discriminant analysis | sda | sda | diagonal
Sparse linear discriminant analysis | sparseLDA | sparseLDA | NumVars, lambda
Regularized discriminant analysis | rda | klaR | lambda, gamma
Mixture discriminant analysis | mda | mda | subclasses
Sparse mixture discriminant analysis | smda | sparseLDA | NumVars, R, lambda
Penalized discriminant analysis | pda | mda | lambda
  | pda2 | mda | df
High dimensional discriminant analysis | hdda | HDclassif | model, threshold
Flexible discriminant analysis (MARS) | fda | mda, earth | degree, nprune
Robust regularized linear discriminant analysis | rrlda | rrlda | lambda, alpha
Bagged FDA | bagFDA | caret, earth | degree, nprune
Logistic/multinomial regression | multinom | nnet | decay
Penalized logistic regression | plr | stepPlr | lambda, cp
Rule-based classification | J48 | RWeka | C
  | OneR | RWeka | None
  | PART | RWeka | threshold, pruned
  | JRip | RWeka | NumOpt
Logic forests | logforest | LogicForest | None
Bayesian multinomial probit model | vbmpRadial | vbmp | estimateTheta
k nearest neighbors | knn3 | caret | k
Nearest shrunken centroids | pam | pamr | threshold
  | scrda | rda | alpha, delta
Naive Bayes | nb | klaR | usekernel, fL
Generalized partial least squares | gpls | gpls | K.prov
Learned vector quantization | lvq | class | size, k
ROC curves | rocc | rocc | xgenes
By default, the function createGrid is used to define the candidate values of the tuning parameters. The user can also specify their own grid. To do this, a data frame is created with a column for each tuning parameter in the model. The column names must be the same as those listed in the table above, with a leading dot. For example, ncomp would have the column heading .ncomp. This data frame can then be passed to train via the tuneGrid argument.
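For instance, a sketch of a hand-built grid for a partial least squares model (assuming the same example data as above) might look like:

## A user-specified grid for a pls model; the column name uses the
## leading dot described above.
plsGrid <- data.frame(.ncomp = 1:5)

set.seed(2)
plsTune <- train(medv ~ ., data = BostonHousing,
                 method = "pls",
                 tuneGrid = plsGrid,
                 trControl = trainControl(method = "cv", number = 10))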
In some cases, models may require control arguments. These can be passed via the three dots argument. Note that some models can also specify tuning parameters in their control objects; if specified there, those values will be superseded by the values in the tuning parameter grid.
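As an illustration, extra arguments for the underlying fitting function can be supplied directly in the call to train. The sketch below passes arguments to nnet; the particular values shown are arbitrary and only assumptions for the example.

## Passing arguments through '...' to the underlying fitting function (nnet).
set.seed(3)
nnetTune <- train(medv ~ ., data = BostonHousing,
                  method = "nnet",
                  tuneLength = 3,
                  linout = TRUE,       # linear output units for regression
                  trace = FALSE,       # suppress nnet's fitting output
                  maxit = 500)         # passed to nnet via '...'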
The vignette entitled "caret Manual -- Model Building" has more details and examples related to this function.
train can be used with "explicit parallelism", where different resamples (e.g. cross-validation groups) and models can be split up and run on multiple machines or processors. By default, train will use a single processor on the host machine. As of version 4.99 of this package, the framework used for parallel processing is the foreach package. To run the resamples in parallel, the code for train does not change; prior to the call to train, a parallel backend is registered with foreach (see the examples below).
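One possible sketch uses the doParallel backend; any registered foreach backend should behave the same way, and doParallel is only an assumption here.

## Register a parallel backend before calling train.
library(doParallel)
cl <- makeCluster(2)          # two worker processes
registerDoParallel(cl)

set.seed(4)
rfTune <- train(medv ~ ., data = BostonHousing,
                method = "rf",
                trControl = trainControl(method = "cv", number = 10))

stopCluster(cl)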