MachineShop (version 3.8.0)

MLModel: MLModel and MLModelFunction Class Constructors

Description

Create a model or model function for use with the MachineShop package.

Usage

MLModel(
  name = "MLModel",
  label = name,
  packages = character(),
  response_types = character(),
  weights = FALSE,
  predictor_encoding = c(NA, "model.frame", "model.matrix"),
  na.rm = FALSE,
  params = list(),
  gridinfo = tibble::tibble(
    param = character(), get_values = list(), default = logical()
  ),
  fit = function(formula, data, weights, ...) stop("No fit function."),
  predict = function(object, newdata, times, ...) stop("No predict function."),
  varimp = function(object, ...) NULL,
  ...
)

MLModelFunction(object, ...)

Value

An MLModel or MLModelFunction class object.

Arguments

name

character name of the object to which the model is assigned.

label

optional character descriptor for the model.

packages

character vector of package names upon which the model depends. Each name may be optionally followed by a comment in parentheses specifying a version requirement. The comment should contain a comparison operator, whitespace and a valid version number, e.g. "xgboost (>= 1.3.0)".
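The requirement format can be illustrated with a minimal base R sketch. The helpers parse_requirement() and meets_requirement() are hypothetical names introduced here for illustration only; they are not part of MachineShop and do not show how the package itself parses this argument.

```r
# Hypothetical helper: split "pkg (>= x.y.z)" into its parts.
parse_requirement <- function(x) {
  pattern <- "^([[:alnum:].]+) *\\((>=|<=|==|!=|>|<) +([0-9][0-9.-]*)\\)$"
  m <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(m) == 0) {
    # No version comment matched: treat the string as a bare package name
    return(list(package = trimws(x), op = NA_character_,
                version = NA_character_))
  }
  list(package = m[2], op = m[3], version = m[4])
}

# Hypothetical helper: test an installed version against a requirement.
meets_requirement <- function(installed, req) {
  if (is.na(req$op)) return(TRUE)
  compare <- match.fun(req$op)
  compare(package_version(installed), package_version(req$version))
}

req <- parse_requirement("xgboost (>= 1.3.0)")
meets_requirement("1.7.5", req)  # version satisfies the requirement
meets_requirement("1.2.0", req)  # version does not satisfy it
```

In real use, the installed version would come from utils::packageVersion() rather than a literal string.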

response_types

character vector of response variable types to which the model can be fit. Supported types are "binary", "BinomialVariate", "DiscreteVariate", "factor", "matrix", "NegBinomialVariate", "numeric", "ordered", "PoissonVariate", and "Surv".

weights

logical value or vector of the same length as response_types indicating whether case weights are supported for the responses.

predictor_encoding

character string indicating whether the model is fit with predictor variables encoded as a "model.frame", a "model.matrix", or unspecified (default).

na.rm

character string or logical specifying which cases with missing values are removed from model fitting and prediction: "all" (TRUE), "none" (FALSE), or only those whose missing values are in the "response" variable.

params

list of user-specified model parameters to be passed to the fit function.

gridinfo

tibble of information for construction of tuning grids consisting of a character column param with the names of parameters in the grid, a list column get_values with functions to generate grid points for the corresponding parameters, and an optional logical column default indicating which parameters to include by default in regular grids. The get_values functions may optionally include arguments n and data, for the number of grid points to generate and a ModelFrame of the model fit data and formula, respectively, and must include an ellipsis (...).
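A gridinfo tibble with the three columns described above might look like the following sketch. The parameter names alpha and lambda and their value ranges are assumptions chosen for illustration; they do not correspond to any particular model in the package.

```r
library(tibble)

gridinfo <- tibble(
  # Names of the tunable parameters (hypothetical)
  param = c("alpha", "lambda"),
  # One generator per parameter; each accepts n and an ellipsis
  get_values = list(
    function(n, ...) seq(0, 1, length.out = n),
    function(n, data, ...) 10^seq(-3, 0, length.out = n)
  ),
  # Only alpha is included by default in regular grids
  default = c(TRUE, FALSE)
)

# Generate three grid points for the first parameter
gridinfo$get_values[[1]](n = 3)
```

The second generator lists data in its signature only to show where a ModelFrame would be received; it is not used in this sketch.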

fit

model fitting function whose arguments are a formula, a ModelFrame named data, case weights, and an ellipsis.

predict

model prediction function whose arguments are the object returned by fit, a ModelFrame named newdata of predictor variables, an optional vector of times at which to predict survival, and an ellipsis.

varimp

variable importance function whose arguments are the object returned by fit, optional arguments passed from calls to varimp, and an ellipsis.

...

arguments passed to other methods.

object

function that returns an MLModel object when called without any supplied argument values.

Details

If supplied, each get_values function in gridinfo should return the values of its parameter to include in a tuning grid to be constructed automatically by the package.

Arguments data and newdata in the fit and predict functions may be converted to data frames with as.data.frame() if needed for their operation. The fit function should return the object resulting from the model fit. Values returned by the predict function should be formatted according to the response variable types below.

factor

matrix whose columns contain the probabilities for multi-level factors or vector of probabilities for the second level of binary factors.

matrix

matrix of predicted responses.

numeric

vector or column matrix of predicted responses.

Surv

matrix whose columns contain survival probabilities at times if supplied or a vector of predicted survival means otherwise.
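As a sketch of the binary factor case, the base R code below produces a vector of probabilities for the second factor level. It uses the built-in mtcars data and stats::glm() only, as an assumed stand-in for a model's fit and predict steps; no MachineShop functions are called.

```r
# Binary factor response; levels are ordered "automatic", "manual"
dat <- mtcars
dat$gear_type <- factor(ifelse(dat$am == 1, "manual", "automatic"))

# glm() treats the first level as failure, so fitted values are
# probabilities of the second level ("manual")
fit_obj <- glm(gear_type ~ wt, data = dat, family = binomial)
prob <- predict(fit_obj, newdata = dat, type = "response")

# A plain numeric vector with one probability per case
prob <- unname(prob)
head(prob)
```

For a factor with more than two levels, the equivalent return value would be a matrix with one column of probabilities per level.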

The varimp function should return a vector of importance values named after the predictor variables or a matrix or data frame whose rows are named after the predictors.
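One possible shape for the named-vector return value is sketched below in base R. The choice of absolute t-statistics as the importance measure is an assumption made for illustration, and varimp_fun is a hypothetical name; the coefficient names match the predictor names here only because both predictors are numeric.

```r
fit_obj <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical varimp function: absolute t-statistics, intercept dropped
varimp_fun <- function(object, ...) {
  tstat <- coef(summary(object))[, "t value"]
  abs(tstat[names(tstat) != "(Intercept)"])
}

vi <- varimp_fun(fit_obj)
names(vi)  # names of the predictors the values belong to
```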

The predict and varimp functions are additionally passed a list named .MachineShop containing the input and model from fit. This argument may be included in the function definitions as needed for their implementations. Otherwise, it will be captured by the ellipsis.
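The two ways of handling this extra argument can be sketched in base R by simulating the call. The element names input and model are assumed from the description above, and the placeholder list stands in for whatever the package actually passes; predict_plain and predict_aware are hypothetical names.

```r
fit_obj <- glm(am ~ wt, data = mtcars, family = binomial)

# Variant 1: .MachineShop is not named, so the ellipsis absorbs it
predict_plain <- function(object, newdata, ...) {
  predict(object, newdata = as.data.frame(newdata), type = "response")
}

# Variant 2: .MachineShop is named and available inside the function
predict_aware <- function(object, newdata, .MachineShop, ...) {
  message("Supplied elements: ",
          paste(names(.MachineShop), collapse = ", "))
  predict(object, newdata = as.data.frame(newdata), type = "response")
}

# Placeholder contents; the real list is built by the package
extras <- list(input = "placeholder", model = "placeholder")

p1 <- predict_plain(fit_obj, mtcars, .MachineShop = extras)
p2 <- predict_aware(fit_obj, mtcars, .MachineShop = extras)
```

Both variants return the same predictions; naming the argument matters only when the function needs to inspect its contents.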

See Also

models, fit, resample

Examples

## Logistic regression model
LogisticModel <- MLModel(
  name = "LogisticModel",
  response_types = "binary",
  weights = TRUE,
  fit = function(formula, data, weights, ...) {
    glm(formula, data = as.data.frame(data), weights = weights,
        family = binomial, ...)
  },
  predict = function(object, newdata, ...) {
    predict(object, newdata = as.data.frame(newdata), type = "response")
  },
  varimp = function(object, ...) {
    pchisq(coef(object)^2 / diag(vcov(object)), 1)
  }
)

data(Pima.tr, package = "MASS")
res <- resample(type ~ ., data = Pima.tr, model = LogisticModel)
summary(res)