baguette (version 0.1.1)

bag_tree: General Interface for Bagged Decision Tree Models

Description

bag_tree() is a way to generate a specification of a model before fitting and allows the model to be created using different packages in R. The main arguments for the model are:

  • cost_complexity: The cost/complexity parameter (a.k.a. Cp) used by CART models (rpart only).

  • tree_depth: The maximum depth of a tree (rpart).

  • min_n: The minimum number of data points in a node that are required for the node to be split further.

  • class_cost: A cost value to assign to the class corresponding to the first factor level (for 2-class models, rpart and C5.0 only).

These arguments are converted to their engine-specific names when the model is fit. Other options and arguments can be set using set_engine(). If left at their defaults here (NULL), the values are taken from the underlying model functions. If parameters need to be modified, update() can be used in lieu of recreating the object from scratch.
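
As a rough illustration of the paragraph above, the sketch below sets two main arguments, passes an engine-specific option through set_engine(), and then calls parsnip's translate() to print the engine-level call. It assumes baguette and parsnip are installed; the argument values are arbitrary, and times is the same engine option used in the Examples section.

```r
library(baguette)  # provides bag_tree(); assumed to be installed
library(parsnip)   # provides set_mode(), set_engine(), translate(), and %>%

# Main arguments use the engine-neutral names described on this page.
spec <- bag_tree(tree_depth = 5, min_n = 10) %>%
  set_mode("regression") %>%
  set_engine("rpart", times = 5)  # 'times' is an engine-specific option

# translate() prints the call template with the engine's own argument names.
translate(spec)
```

Nothing is fit at this point; the specification only records the arguments until fit() is called.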

Usage

bag_tree(
  mode = "unknown",
  cost_complexity = 0,
  tree_depth = NULL,
  min_n = 2,
  class_cost = NULL
)

# S3 method for bag_tree
update(
  object,
  parameters = NULL,
  cost_complexity = NULL,
  tree_depth = NULL,
  min_n = NULL,
  class_cost = NULL,
  fresh = FALSE,
  ...
)

Arguments

mode

A single character string for the type of model. Possible values for this model are "unknown", "regression", or "classification".

cost_complexity

A positive number for the cost/complexity parameter (a.k.a. Cp) used by CART models (rpart only).

tree_depth

An integer for the maximum depth of the tree.

min_n

An integer for the minimum number of data points in a node that are required for the node to be split further.

class_cost

A non-negative scalar for a class cost (where a cost of 1 means no extra cost). This is useful when the first level of the outcome factor is the minority class. If this is not the case, values between zero and one can be used to bias the model toward the second level of the factor.

object

A bagged tree model specification.

parameters

A 1-row tibble or named list with main parameters to update. If the individual arguments are used, these will supersede the values in parameters. Also, using engine arguments in this object will result in an error.

fresh

A logical for whether the arguments should be modified in place or replaced wholesale.

...

Not used for update().
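
The parameters argument described above can be sketched as follows. This is a minimal example that assumes baguette and parsnip are installed; the object names and values are illustrative only.

```r
library(baguette)  # provides bag_tree() and its update() method; assumed installed
library(parsnip)

model <- bag_tree(cost_complexity = 0.001, min_n = 3)

# Supply the new value as a named list of main parameters.
# cost_complexity is not mentioned, so it keeps its earlier value.
updated <- update(model, parameters = list(min_n = 10))
updated
```

A 1-row tibble with the same column names could be passed in place of the named list.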

Details

The model can be fit with the fit() function using the following engines:

  • R: "rpart" (the default) or "C5.0" (classification only)

Note that, for rpart models, both cost_complexity and tree_depth can be specified, but the package will give precedence to cost_complexity. Also, for tree_depth values greater than 30, rpart will give nonsense results on 32-bit machines.
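
Selecting the non-default engine might look like the sketch below. It assumes baguette and parsnip are installed and only builds the specification; fitting it would additionally need the package behind the C5.0 engine and a classification outcome.

```r
library(baguette)  # provides bag_tree(); assumed to be installed
library(parsnip)   # provides set_mode(), set_engine(), and %>%

# "C5.0" supports classification only, so the mode is set accordingly.
c5_spec <- bag_tree(min_n = 5) %>%
  set_mode("classification") %>%
  set_engine("C5.0")

c5_spec
```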

Examples

# NOT RUN {
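library(baguette)  # provides bag_tree()
library(parsnip)   # provides set_mode(), set_engine(), fit(), and %>%

# Fit three bagged classification trees with the rpart engine
set.seed(9952)
bag_tree(tree_depth = 5) %>%
  set_mode("classification") %>%
  set_engine("rpart", times = 3) %>%
  fit(Species ~ ., data = iris)

# Change one argument of an existing specification
model <- bag_tree(cost_complexity = 0.001, min_n = 3)
model
update(model, cost_complexity = 0.1)
# fresh = TRUE replaces the arguments wholesale instead of modifying them in place
update(model, cost_complexity = 0.1, fresh = TRUE)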
# }