hardhat

Introduction

hardhat is a developer-focused package designed to ease the creation of new modeling packages while promoting good R modeling package standards, as laid out by the opinionated Conventions for R Modeling Packages.

hardhat has four main goals:

  • Easily, consistently, and robustly preprocess data at fit time and prediction time with mold() and forge().

  • Provide one source of truth for common input validation functions, such as checking if new data at prediction time contains the same required columns used at fit time.

  • Provide extra utility functions for additional common tasks, such as adding intercept columns, standardizing predict() output, and extracting valuable class and factor level information from the predictors.

  • Reimagine the base R preprocessing infrastructure of stats::model.matrix() and stats::model.frame() using the stricter approaches found in model_matrix() and model_frame().

The idea is to reduce the burden of creating a good modeling interface as much as possible, and instead let the package developer focus on writing the core implementation of their new model. This benefits not only the developer, but also the user of the modeling package, as the standardization allows users to build a set of “expectations” around what any modeling function should return, and how they should interact with it.
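As a rough illustration of the first goal, the sketch below molds a training set with the formula interface and then forges new data against the stored blueprint. The `mtcars` split and the variable names are illustrative assumptions, not part of the hardhat documentation.

```r
library(hardhat)

# Illustrative split of a built-in data set (an assumption for this sketch)
train <- mtcars[1:20, ]
test  <- mtcars[21:32, ]

# Fit time: mold() preprocesses the data and records a blueprint
processed <- mold(mpg ~ hp + wt, train)
processed$predictors  # tibble with columns hp and wt (no intercept by default)
processed$outcomes    # tibble with column mpg

# Prediction time: forge() applies the same blueprint to new data
new_processed <- forge(test, processed$blueprint)
new_processed$predictors

# forge() checks that the required predictor columns are present;
# here the error message is captured instead of stopping the script
missing_col <- tryCatch(
  forge(test["hp"], processed$blueprint),
  error = function(e) conditionMessage(e)
)
```

Because the blueprint travels with the molded result, the prediction code does not need to restate the formula or re-derive the preprocessing steps.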

Installation

You can install the released version of hardhat from CRAN with:

install.packages("hardhat")

And the development version from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/hardhat")

Learning more

To learn about how to use hardhat, check out the vignettes:

  • vignette("mold", "hardhat"): Learn how to preprocess data at fit time with mold().

  • vignette("forge", "hardhat"): Learn how to preprocess new data at prediction time with forge().

  • vignette("package", "hardhat"): Learn how to use mold() and forge() to help in creating a new modeling package.

You can also watch Max Kuhn discuss how to use hardhat to build a new modeling package from scratch in his talk at the XI Jornadas de Usuarios de R conference.
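To give a feel for how mold() and forge() fit into a modeling package, here is a minimal sketch of a linear regression wrapper. The function names `simple_lm()` and `predict_simple_lm()` and the class name "simple_lm" are hypothetical choices for this example; only the hardhat and stats calls are real functions, and the package vignette should be treated as the authoritative walkthrough.

```r
library(hardhat)

# Hypothetical fitting function: preprocess with mold(), fit, store the blueprint
simple_lm <- function(formula, data) {
  processed <- mold(
    formula, data,
    blueprint = default_formula_blueprint(intercept = TRUE)
  )
  validate_outcomes_are_univariate(processed$outcomes)

  x <- as.matrix(processed$predictors)
  y <- processed$outcomes[[1]]
  fit <- stats::lm.fit(x, y)

  # new_model() stores the model elements alongside the blueprint
  new_model(
    coefs = fit$coefficients,
    blueprint = processed$blueprint,
    class = "simple_lm"
  )
}

# Hypothetical prediction function: forge() new data, then standardize output
predict_simple_lm <- function(object, new_data) {
  processed <- forge(new_data, object$blueprint)
  x <- as.matrix(processed$predictors)
  pred <- as.vector(x %*% object$coefs)

  out <- spruce_numeric(pred)              # tibble with a .pred column
  validate_prediction_size(out, new_data)  # one prediction row per input row
  out
}

model <- simple_lm(mpg ~ hp + wt, mtcars[1:20, ])
preds <- predict_simple_lm(model, mtcars[21:32, ])
```

In a real package the prediction function would more likely be an S3 `predict()` method; a plain function is used here to keep the sketch self-contained.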

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Version: 1.3.0

License: MIT + file LICENSE

Monthly Downloads: 119,027

Last Published: March 30th, 2023

Functions in hardhat (1.3.0)

  • forge: Forge prediction-ready data

  • default_formula_blueprint: Default formula blueprint

  • default_xy_blueprint: Default XY blueprint

  • fct_encode_one_hot: Encode a factor as a one-hot indicator matrix

  • extract_ptype: Extract a prototype

  • frequency_weights: Frequency weights

  • add_intercept_column: Add an intercept column to data

  • is_frequency_weights: Is x a frequency weights vector?

  • get_data_classes: Extract data classes from a data frame or matrix

  • hardhat-example-data: Example data for hardhat

  • importance_weights: Importance weights

  • is_importance_weights: Is x an importance weights vector?

  • hardhat-package: hardhat: Construct Modeling Packages

  • get_levels: Extract factor levels from a data frame

  • hardhat-extract: Generics for object extraction

  • is_case_weights: Is x a case weights vector?

  • is_blueprint: Is x a preprocessing blueprint?

  • modeling-package: Create a modeling package

  • model_offset: Extract a model offset

  • model_matrix: Construct a design matrix

  • model_frame: Construct a model frame

  • new_frequency_weights: Construct a frequency weights vector

  • new_importance_weights: Construct an importance weights vector

  • new_default_formula_blueprint: Create a new default blueprint

  • new_formula_blueprint: Create a new preprocessing blueprint

  • new_case_weights: Extend case weights

  • mold: Mold data for modeling

  • shrink: Subset only required columns

  • run-forge: forge() according to a blueprint

  • refresh_blueprint: Refresh a preprocessing blueprint

  • spruce-multiple: Spruce up multi-outcome predictions

  • new_model: Constructor for a base model

  • run-mold: mold() according to a blueprint

  • spruce: Spruce up predictions

  • standardize: Standardize the outcome

  • scream: 😱 Scream.

  • recompose: Recompose a data frame into another form

  • tune: Mark arguments for tuning

  • validate_outcomes_are_numeric: Ensure outcomes are all numeric

  • validate_outcomes_are_univariate: Ensure that the outcome is univariate

  • validate_no_formula_duplication: Ensure no duplicate terms appear in formula

  • update_blueprint: Update a preprocessing blueprint

  • validate_column_names: Ensure that data contains required column names

  • weighted_table: Weighted table

  • validate_prediction_size: Ensure that predictions have the correct number of rows

  • validate_predictors_are_numeric: Ensure predictors are all numeric

  • validate_outcomes_are_binary: Ensure that the outcome has binary factors

  • validate_outcomes_are_factors: Ensure that the outcome has only factor columns

  • delete_response: Delete the response from a terms object

  • contr_one_hot: Contrast function for one-hot encodings

  • default_recipe_blueprint: Default recipe blueprint
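A few of the utility functions listed above can be tried directly on a built-in data set. The sketch below is illustrative only: the `iris` subsets and variable names are assumptions, and `check_column_names()` is the non-erroring companion documented alongside `validate_column_names()`.

```r
library(hardhat)

# Small illustrative subset (an assumption for this sketch)
small <- iris[1:3, c("Sepal.Length", "Species")]

# Add an intercept column; it is placed first and named "(Intercept)"
with_intercept <- add_intercept_column(small)
names(with_intercept)

# Record factor levels and column classes, e.g. to store at fit time
lvls    <- get_levels(iris)        # named list, one entry per factor column
classes <- get_data_classes(iris)  # named list of class vectors

# Check for required columns without raising an error
check <- check_column_names(iris, c("Sepal.Length", "not_a_column"))
check$ok             # FALSE, because one required column is absent
check$missing_names  # the names that were not found

# Stricter counterparts to stats::model.frame() and stats::model.matrix()
framed <- model_frame(Sepal.Width ~ Species, iris)
design <- model_matrix(framed$terms, framed$data)
```

`validate_column_names()` performs the same check but signals an error when a required column is missing, which is usually what a prediction function wants.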