This page contains the constructors for the default blueprints. They can be
extended if you want to add extra behavior on top of what the default
blueprints already do, but generally you will extend the non-default versions
of the constructors found in the documentation for new_blueprint().
new_default_formula_blueprint(
intercept = FALSE,
allow_novel_levels = FALSE,
ptypes = NULL,
formula = NULL,
indicators = "traditional",
composition = "tibble",
terms = list(predictors = NULL, outcomes = NULL),
levels = NULL,
...,
subclass = character()
)
new_default_recipe_blueprint(
intercept = FALSE,
allow_novel_levels = FALSE,
fresh = TRUE,
strings_as_factors = TRUE,
composition = "tibble",
ptypes = NULL,
recipe = NULL,
extra_role_ptypes = NULL,
...,
subclass = character()
)
new_default_xy_blueprint(
intercept = FALSE,
allow_novel_levels = FALSE,
composition = "tibble",
ptypes = NULL,
...,
subclass = character()
)
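As a minimal sketch of extending one of these constructors, the following wraps new_default_formula_blueprint() in a hypothetical constructor that adds one extra element through `...` and one subclass through `subclass`. The names new_my_formula_blueprint, "my_formula_blueprint", and note are made up for illustration and are not part of hardhat.

```r
library(hardhat)

# Hypothetical constructor: adds one extra field (`note`) and one subclass.
# Neither the function name, the field, nor the class exists in hardhat.
new_my_formula_blueprint <- function(..., note = "example") {
  new_default_formula_blueprint(
    ...,
    note = note,                        # extra element, passed through `...`
    subclass = "my_formula_blueprint"   # prepended to the default classes
  )
}

bp <- new_my_formula_blueprint(intercept = TRUE)

# The new object still inherits from the default formula blueprint
inherits(bp, "default_formula_blueprint")
bp$note
```

This only constructs the object. Using a custom blueprint with mold() and forge() may require additional methods, which this sketch does not cover.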
intercept
A logical. Should an intercept be included in the processed data? This
information is used by the process function in the mold and forge function
list.
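A short sketch of the effect of intercept, using the user-facing helper default_formula_blueprint() together with mold() and forge() on the built-in iris data. The choice of formula and data is only for illustration.

```r
library(hardhat)

bp <- default_formula_blueprint(intercept = TRUE)
processed <- mold(Sepal.Width ~ Sepal.Length, iris, blueprint = bp)

# The intercept appears as a column of the processed predictors
colnames(processed$predictors)

# The same blueprint is reused at prediction time
forged <- forge(head(iris), processed$blueprint)
colnames(forged$predictors)
```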
allow_novel_levels
A logical. Should novel factor levels be allowed at prediction time? This
information is used by the clean function in the forge function list, and is
passed on to scream().
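The following sketch illustrates the default behavior (allow_novel_levels = FALSE) with an XY blueprint. The small data frames train and new are invented for the example. With the default, a novel level seen at prediction time triggers a warning and is coerced to NA; setting allow_novel_levels = TRUE is the documented way to permit such levels instead.

```r
library(hardhat)

# Toy data, invented for illustration
train <- data.frame(x = factor(c("a", "b", "a")), y = c(1, 2, 3))
new <- data.frame(x = factor(c("a", "c")))  # "c" was never seen in training

processed <- mold(train["x"], train$y, blueprint = default_xy_blueprint())

# Default: forge() warns about the novel level and coerces it to NA.
# The warning is suppressed here only to keep the example quiet.
forged <- suppressWarnings(forge(new, processed$blueprint))
forged$predictors$x
```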
ptypes
Either NULL, or a named list with 2 elements, predictors and outcomes, both of
which are 0-row tibbles. ptypes is generated automatically at mold() time and
is used to validate new_data at prediction time.
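To see what ptypes looks like after mold() has filled it in, one can inspect the blueprint returned by mold(). This is a small sketch on the iris data; the formula is arbitrary.

```r
library(hardhat)

processed <- mold(Sepal.Width ~ Sepal.Length + Species, iris)
ptypes <- processed$blueprint$ptypes

# Two elements, each a 0-row tibble that records column names and types
names(ptypes)
nrow(ptypes$predictors)
names(ptypes$predictors)
```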
formula
Either NULL, or a formula that specifies how the predictors and outcomes
should be preprocessed. This argument is set automatically at mold() time.
indicators
A single character string. Controls how factors are expanded into dummy variable indicator columns. One of:
"traditional" - The default. Create dummy variables using the traditional model.matrix() infrastructure. Generally this creates K - 1 indicator columns for each factor, where K is the number of levels in that factor.
"none" - Leave factor variables alone. No expansion is done.
"one_hot" - Create dummy variables using a one-hot encoding approach that expands unordered factors into all K indicator columns, rather than K - 1.
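The three options can be compared by counting predictor columns for a single factor. This sketch uses iris$Species (K = 3 levels) and a hypothetical helper, count_cols(), defined only for this example.

```r
library(hardhat)

# Hypothetical helper: number of predictor columns produced for one factor
count_cols <- function(indicators) {
  bp <- default_formula_blueprint(indicators = indicators)
  processed <- mold(Sepal.Width ~ Species, iris, blueprint = bp)
  ncol(processed$predictors)
}

n_cols <- vapply(
  c(traditional = "traditional", none = "none", one_hot = "one_hot"),
  count_cols,
  integer(1)
)

# Per the descriptions above: K - 1 columns, the untouched factor, K columns
n_cols
```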
composition
Either "tibble", "matrix", or "dgCMatrix" for the format of the processed predictors. If "matrix" or "dgCMatrix" is chosen, all of the predictors must be numeric after the preprocessing method has been applied; otherwise an error is thrown.
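A brief sketch of requesting matrix output. Only numeric predictors are used so that the conversion is valid; the formula is arbitrary.

```r
library(hardhat)

bp <- default_formula_blueprint(composition = "matrix")
processed <- mold(Sepal.Width ~ Sepal.Length + Petal.Length, iris, blueprint = bp)

# The predictors come back as a base R matrix rather than a tibble
is.matrix(processed$predictors)
```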
terms
A named list of two elements, predictors and outcomes. Both elements are terms
objects that describe the terms for the outcomes and predictors separately.
This argument is set automatically at mold() time.
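The terms element can be inspected on the blueprint that mold() returns. A small sketch, again on iris with an arbitrary formula:

```r
library(hardhat)

processed <- mold(Sepal.Width ~ Sepal.Length + Species, iris)
trms <- processed$blueprint$terms

# One terms object for the predictors, one for the outcomes
names(trms)
inherits(trms$predictors, "terms")
inherits(trms$outcomes, "terms")
```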
levels
Either NULL or a named list of character vectors that correspond to the levels
observed when converting character predictor columns to factors during mold().
This argument is set automatically at mold() time.
...
Name-value pairs for additional elements of blueprints that subclass this blueprint.
subclass
A character vector. The subclasses of this blueprint.
fresh
Should already trained operations be re-trained when prep() is called?
strings_as_factors
Should character columns be converted to factors when prep() is called?
recipe
Either NULL, or an unprepped recipe. This argument is set automatically at
mold() time.
extra_role_ptypes
A named list. The names are the unique non-standard recipe roles (i.e.
everything except "predictors" and "outcomes"). The values are prototypes of
the original columns with that role. These are used for validation in forge().
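As a hedged sketch of how extra_role_ptypes might look in practice, the following assigns a non-standard role to one column of a recipe and then inspects the blueprint after mold(). It assumes the recipes package is installed; the role name "id" is arbitrary and chosen for illustration.

```r
library(hardhat)
library(recipes)

rec <- recipe(Sepal.Width ~ Sepal.Length + Species, data = iris)
# Give Species a non-standard role; "id" is an arbitrary example name
rec <- update_role(rec, Species, new_role = "id")

processed <- mold(rec, iris)
extra <- processed$blueprint$extra_role_ptypes

# One entry per non-standard role, each a prototype of the original columns
names(extra)
extra$id
```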