Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
The following model fitting, prediction, and performance assessment functions are available for MachineShop models.
Training:
fit | Model fitting
resample | Resample estimation of model performance
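As a minimal sketch of the training functions, the following fits a model and then estimates its performance by resampling. The dataset (mtcars), the formula, and the choice of LMModel with 5-fold cross-validation are illustrative assumptions, not package defaults.

```r
library(MachineShop)

# Fit a linear regression model to the full dataset
model_fit <- fit(mpg ~ wt + hp, data = mtcars, model = LMModel())

# Estimate predictive performance with 5-fold cross-validation;
# the seed makes the fold assignment reproducible
res <- resample(mpg ~ wt + hp, data = mtcars, model = LMModel(),
                control = CVControl(folds = 5, seed = 123))

# Tabulate the resampled performance metrics
summary(res)
```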
Tuning Grids:
expand_model | Model expansion over tuning parameters
expand_modelgrid | Model tuning grid expansion
expand_params | Model parameters expansion
expand_steps | Recipe step parameters expansion
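A short sketch of two of the grid functions, assuming a package version that provides both expand_params and expand_model. The GBMModel parameter names and values below are chosen only for illustration.

```r
library(MachineShop)

# Data frame with one row per combination of the supplied values (2 x 2 = 4)
grid <- expand_params(n.trees = c(50, 100), interaction.depth = 1:2)
grid

# List of GBMModel specifications, one per parameter combination
models <- expand_model(GBMModel, n.trees = c(50, 100), interaction.depth = 1:2)
length(models)
```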
Response Values:
response | Observed
predict | Predicted
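The two response functions can be sketched as follows. The model, formula, and use of mtcars as both training data and newdata are assumptions made to keep the example small.

```r
library(MachineShop)

model_fit <- fit(mpg ~ wt + hp, data = mtcars, model = LMModel())

# Observed response values extracted from the supplied data
obs <- response(model_fit, newdata = mtcars)

# Predicted response values for the same rows
pred <- predict(model_fit, newdata = mtcars)

head(data.frame(observed = obs, predicted = pred))
```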
Performance Assessment:
calibration | Model calibration
confusion | Confusion matrix
dependence | Partial dependence
diff | Model performance differences
lift | Lift curves
performance metrics | Model performance metrics
performance_curve | Model performance curves
varimp | Variable importance
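As a hedged example of performance assessment on an independent test set, the sketch below holds out part of mtcars, fits on the remainder, and computes metrics from observed and predicted values. The split rule and model are arbitrary illustrative choices.

```r
library(MachineShop)

# Hold out every fourth row as a test set (arbitrary illustrative split)
test_rows <- seq(4, nrow(mtcars), by = 4)
train <- mtcars[-test_rows, ]
test <- mtcars[test_rows, ]

model_fit <- fit(mpg ~ wt + hp, data = train, model = LMModel())
obs <- response(model_fit, newdata = test)
pred <- predict(model_fit, newdata = test)

# Default set of metrics for a numeric response
performance(obs, pred)

# A single metric function applied directly
rmse(obs, pred)
```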
Methods for resample estimation include
BootControl | Simple bootstrap
BootOptimismControl | Optimism-corrected bootstrap
CVControl | Repeated K-fold cross-validation
CVOptimismControl | Optimism-corrected cross-validation
OOBControl | Out-of-bootstrap
SplitControl | Split training-testing
TrainControl | Training resubstitution
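The control constructors are passed to resample through its control argument. The sketch below compares three of them on the same model; the sample counts, fold counts, and split proportion are illustrative values, and the list names are hypothetical labels.

```r
library(MachineShop)

controls <- list(
  boot  = BootControl(samples = 25),
  cv    = CVControl(folds = 5, repeats = 2),
  split = SplitControl(prop = 2/3)
)

# Run the same model under each resampling method
results <- lapply(controls, function(ctrl) {
  resample(mpg ~ wt + hp, data = mtcars, model = LMModel(), control = ctrl)
})

# Summarize resampled performance for each method
lapply(results, summary)
```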
Graphical and tabular summaries of modeling results can be obtained with
plot
print
summary
Further information on package features is available with
metricinfo | Performance metric information
modelinfo | Model information
settings | Global settings
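A brief sketch of how these information functions might be queried. Exactly what each call returns can vary by package version, so the comments describe only the general shape.

```r
library(MachineShop)

# Names of all available models and metrics
model_names <- names(modelinfo())
metric_names <- names(metricinfo())
head(model_names)
head(metric_names)

# Details for a single model
str(modelinfo(LMModel))

# All current global settings, then one setting by name
settings()
settings("control")
```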
Custom metrics and models can be created with the MLMetric and MLModel constructors.
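A minimal sketch of a custom metric built with MLMetric. The metric itself (mean absolute percentage error) and the name mape are hypothetical choices for illustration; a custom MLModel requires fit and predict functions and is not shown here.

```r
library(MachineShop)

# Wrap a plain function of observed and predicted values as a metric;
# maximize = FALSE marks smaller values as better
mape <- MLMetric(
  function(observed, predicted, ...) {
    mean(abs((observed - predicted) / observed))
  },
  name = "mape",
  label = "Mean Absolute Percentage Error",
  maximize = FALSE
)

# The metric object can be called like an ordinary function
value <- mape(c(1, 2, 4), c(1, 1, 5))
value
```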