These are parameter-generating functions that can be used for modeling, especially in conjunction with the parsnip package.
trees(range = c(1L, 2000L), trans = NULL)
min_n(range = c(2L, 40L), trans = NULL)
sample_size(range = c(unknown(), unknown()), trans = NULL)
sample_prop(range = c(1/10, 1), trans = NULL)
loss_reduction(range = c(-10, 1.5), trans = log10_trans())
tree_depth(range = c(1L, 15L), trans = NULL)
prune(values = c(TRUE, FALSE))
cost_complexity(range = c(-10, -1), trans = log10_trans())
range: A two-element vector holding the defaults for the smallest and largest possible values, respectively.

trans: A trans object from the scales package, such as scales::log10_trans() or scales::reciprocal_trans(). If not provided, the default is used, which matches the units used in range. If no transformation is needed, NULL.

values: A vector of possible values (TRUE or FALSE).
These functions generate parameters that are useful when the model is based on trees or rules.
trees(): The number of trees contained in a random forest or boosted ensemble. In the latter case, this is equal to the number of boosting iterations. (See parsnip::rand_forest() and parsnip::boost_tree().)

min_n(): The minimum number of data points in a node that are required for the node to be split further. (See parsnip::rand_forest() and parsnip::boost_tree().)

sample_size(): The size of the data set used for modeling within an iteration of the modeling algorithm, such as stochastic gradient boosting. (See parsnip::boost_tree().)

sample_prop(): The same as sample_size() but as a proportion of the total sample.

loss_reduction(): The reduction in the loss function required to split further. (See parsnip::boost_tree().) This corresponds to gamma in xgboost.

tree_depth(): The maximum depth of the tree (i.e., the number of splits). (See parsnip::boost_tree().)

prune(): A logical for whether a tree or set of rules should be pruned.

cost_complexity(): The cost-complexity parameter in classical CART models.
trees()
min_n()
sample_size()
loss_reduction()
tree_depth()
prune()
cost_complexity()