MachineShop: Machine Learning Models and Tools for R

Description

MachineShop is a meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Support is provided for predictive modeling of numerical, categorical, and censored time-to-event outcomes and for resample (bootstrap, cross-validation, and split training-test sets) estimation of model performance. This vignette introduces the package interface with a survival data analysis example, followed by supported methods of variable specification; applications to other response variable types; available performance metrics, resampling techniques, and graphical and tabular summaries; and modeling strategies.

Features

  • Unified and concise interface for model fitting, prediction, and performance assessment.
  • Support for 53+ models from 28 R packages, including model specifications from the parsnip package.
  • Dynamic model parameters.
  • Ensemble modeling with stacked regression and super learners.
  • Modeling of several response variable types: binary factors, multi-class nominal and ordinal factors, numeric vectors and matrices, and censored time-to-event survival.
  • Model specification with traditional formulas, design matrices, and flexible pre-processing recipes.
  • Resample estimation of predictive performance, including cross-validation, bootstrap resampling, and split training-test set validation.
  • Parallel execution of resampling algorithms.
  • Choice of performance metrics: accuracy, areas under ROC and precision-recall curves, Brier score, coefficient of determination (R2), concordance index, cross entropy, F score, Gini coefficient, unweighted and weighted Cohen’s kappa, mean absolute error, mean squared error, mean squared log error, positive and negative predictive values, precision and recall, and sensitivity and specificity.
  • Graphical and tabular performance summaries: calibration curves, confusion matrices, partial dependence plots, performance curves, lift curves, and model-specific and permutation-based variable importance.
  • Model tuning over automatically generated grids and with exhaustive and random grid searches, Bayesian optimization, particle swarm optimization, quasi-Newton BFGS optimization, simulated annealing, and support for user-defined optimization functions.
  • Model selection and comparisons for any combination of models and model parameter values.
  • Recursive feature elimination.
  • User-definable models and performance metrics.
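The unified interface listed above can be illustrated with a minimal sketch. It assumes the built-in iris data and LDAModel (which relies on the MASS package distributed with R); the train/test split, the seed, and the choice of five folds are illustrative assumptions rather than package recommendations.

```r
library(MachineShop)

# Illustrative split of the built-in iris data (assumed sizes and seed)
set.seed(123)
train_rows <- sample(nrow(iris), 100)
train <- iris[train_rows, ]
test <- iris[-train_rows, ]

# Fit a linear discriminant analysis model to the training set
model_fit <- fit(Species ~ ., data = train, model = LDAModel)

# Observed responses and predicted class probabilities on the test set
obs <- response(model_fit, newdata = test)
pred <- predict(model_fit, newdata = test, type = "prob")

# Default performance metrics for a factor response
perf <- performance(obs, pred)
print(perf)

# Resample estimation of performance with 5-fold cross-validation
res <- resample(Species ~ ., data = iris, model = LDAModel,
                control = CVControl(folds = 5))
summary(res)
```

Swapping LDAModel for another model constructor from the function index should leave the rest of the calls unchanged, provided the model supports factor responses and its backing package is installed.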

Getting Started

Installation

# Current release from CRAN
install.packages("MachineShop")

# Development version from GitHub
# install.packages("devtools")
devtools::install_github("brian-j-smith/MachineShop")

# Development version with vignettes
devtools::install_github("brian-j-smith/MachineShop", build_vignettes = TRUE)

Documentation

Once installed, the following R commands will load the package and display its help system documentation. Online documentation and examples are available at the MachineShop website.

library(MachineShop)

# Package help summary
?MachineShop

# Vignette
RShowDoc("UserGuide", package = "MachineShop")
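Besides the help pages, the package can be queried for its available models and metrics. The sketch below assumes that modelinfo() and metricinfo() return named lists and that passing a zero-length factor restricts the results to entries applicable to factor responses; the variable names are illustrative.

```r
library(MachineShop)

# Names of all models known to the package
all_models <- names(modelinfo())
head(all_models)

# Models and metrics applicable to a factor (categorical) response
factor_models <- names(modelinfo(factor(0)))
factor_metrics <- names(metricinfo(factor(0)))
print(factor_models)
print(factor_metrics)
```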

Monthly Downloads

839

Version

3.8.0

License

GPL-3

Last Published

August 19th, 2024

Functions in MachineShop (3.8.0)

  • LARSModel: Least Angle Regression, Lasso and Infinitesimal Forward Stagewise Models
  • GAMBoostModel: Gradient Boosting with Additive Models
  • GLMNetModel: GLM Lasso or Elasticnet Model
  • FDAModel: Flexible and Penalized Discriminant Analysis Models
  • GLMModel: Generalized Linear Model
  • GLMBoostModel: Gradient Boosting with Linear Models
  • LDAModel: Linear Discriminant Analysis Model
  • GBMModel: Generalized Boosted Regression Model
  • ICHomes: Iowa City Home Sales Dataset
  • KNNModel: Weighted k-Nearest Neighbor Model
  • LMModel: Linear Models
  • NNetModel: Neural Network Model
  • MLControl: Resampling Controls
  • ModelFrame: ModelFrame Class
  • MLMetric: MLMetric Class Constructor
  • MDAModel: Mixture Discriminant Analysis Model
  • MachineShop-package: MachineShop: Machine Learning Models and Tools
  • MLModel: MLModel and MLModelFunction Class Constructors
  • ParameterGrid: Tuning Parameters Grid
  • ModelSpecification: Model Specification
  • PLSModel: Partial Least Squares Model
  • RangerModel: Fast Random Forest Model
  • SVMModel: Support Vector Machine Models
  • ParsnipModel: Parsnip Model
  • SuperModel: Super Learner Model
  • RandomForestModel: Random Forest Model
  • RPartModel: Recursive Partitioning and Regression Tree Models
  • StackedModel: Stacked Regression Model
  • TunedModel: Tuned Model
  • QDAModel: Quadratic Discriminant Analysis Model
  • SurvRegModel: Parametric Survival Model
  • TunedInput: Tuned Model Inputs
  • as.MLModel: Coerce to an MLModel
  • POLRModel: Ordered Logistic or Probit Regression Model
  • NaiveBayesModel: Naive Bayes Classifier Model
  • diff: Model Performance Differences
  • SurvMatrix: SurvMatrix Class Constructors
  • RFSRCModel: Fast Random Forest (SRC) Model
  • TuningGrid: Tuning Grid Control
  • as.data.frame: Coerce to a Data Frame
  • TreeModel: Classification and Regression Tree Models
  • confusion: Confusion Matrix
  • dependence: Partial Dependence
  • XGBModel: Extreme Gradient Boosting Models
  • combine: Combine MachineShop Objects
  • fit: Model Fitting
  • as.MLInput: Coerce to an MLInput
  • extract: Extract Elements of an Object
  • expand_model: Model Expansion Over Tuning Parameters
  • calibration: Model Calibration
  • expand_params: Model Parameters Expansion
  • metricinfo: Display Performance Metric Information
  • expand_steps: Recipe Step Parameters Expansion
  • print: Print MachineShop Objects
  • recipe_roles: Set Recipe Roles
  • case_weights: Extract Case Weights
  • reexports: Objects exported from other packages
  • modelinfo: Display Model Information
  • quote: Quote Operator
  • predict: Model Prediction
  • models: Models
  • metrics: Performance Metrics
  • plot: Model Performance Plots
  • rfe: Recursive Feature Elimination
  • step_kmeans: K-Means Clustering Variable Reduction
  • step_kmedoids: K-Medoids Clustering Variable Selection
  • set_strata: Resampling Stratification Control
  • settings: MachineShop Settings
  • expand_modelgrid: Model Tuning Grid Expansion
  • SelectedModel: Selected Model
  • inputs: Model Inputs
  • SelectedInput: Selected Model Inputs
  • resample: Resample Estimation of Model Performance
  • response: Extract Response Variable
  • summary: Model Performance Summaries
  • step_spca: Sparse Principal Components Analysis Variable Reduction
  • step_lincomp: Linear Components Variable Reduction
  • set_predict: Resampling Prediction Control
  • lift: Model Lift Curves
  • performance_curve: Model Performance Curves
  • step_sbf: Variable Selection by Filtering
  • set_optim: Tuning Parameter Optimization
  • performance: Model Performance Metrics
  • set_monitor: Training Parameters Monitoring Control
  • t.test: Paired t-Tests for Model Comparisons
  • unMLModelFit: Revert an MLModelFit Object
  • varimp: Variable Importance
  • DiscreteVariate: Discrete Variate Constructors
  • BARTMachineModel: Bayesian Additive Regression Trees Model
  • AdaBoostModel: Boosting with Classification Trees
  • AdaBagModel: Bagging with Classification Trees
  • BlackBoostModel: Gradient Boosting with Regression Trees
  • C50Model: C5.0 Decision Trees and Rule-Based Model
  • EarthModel: Multivariate Adaptive Regression Splines Model
  • BARTModel: Bayesian Additive Regression Trees Model
  • CForestModel: Conditional Random Forest Model
  • CoxModel: Proportional Hazards Regression Model