modellingTools (version 0.1.0)

create_model_matrix: Create a usable model matrix from a data frame containing a mix of continuous and categorical variables

Description

This function takes a data frame of input variables and returns a new data frame (or matrix) in which the categorical variables are replaced by dummy variables, using model.matrix.

Usage

create_model_matrix(dat, id = c(), matrix_out = TRUE, parallel = FALSE)

Arguments

dat
a tbl
id
character, naming the variable in dat that serves as the unique row identifier. If left blank, an identifier will be created.
matrix_out
logical. Should the result be a matrix (TRUE), suitable for input into many modelling functions, or a tbl (FALSE), suitable for inspection and further analysis? Default TRUE.
parallel
logical. If TRUE, parallel foreach is used to compute on each variable. Must register a parallel backend first. Default FALSE.
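The requirement to register a parallel backend before setting parallel = TRUE could be met along the following lines. This is only a sketch: it assumes the doParallel package is used as the foreach backend (the page does not say which backend is expected), and the worker count of 2 is an arbitrary choice.

```r
# Assumption: doParallel is the chosen backend; foreach and doParallel
# are not part of base R and must be installed separately.
have_backend <- requireNamespace("doParallel", quietly = TRUE) &&
  requireNamespace("foreach", quietly = TRUE)

if (have_backend) {
  # Register two workers for foreach to use.
  doParallel::registerDoParallel(cores = 2)

  # Ask foreach how many workers it currently sees.
  n_workers <- foreach::getDoParWorkers()

  # With modellingTools loaded and a tbl `x` available, the call would be:
  # create_model_matrix(x, parallel = TRUE)

  # Release any implicitly created cluster when finished.
  doParallel::stopImplicitCluster()
}
```

If no backend is registered, foreach normally falls back to sequential execution with a warning, so registering one first is what makes parallel = TRUE worthwhile.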

Value

a matrix or a tbl, consisting of dummy columns with 0/1 indicators of membership in each factor level for each factor variable, and all other input variables unchanged.
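To illustrate what 0/1 dummy columns look like, here is a small sketch that calls base R's model.matrix directly on a toy data frame. The data frame and its column names (size, group) are made up for illustration, and the exact column names and ordering produced by create_model_matrix itself may differ.

```r
# Toy data: one numeric column and one factor column (hypothetical names).
df <- data.frame(
  size  = c(1.2, 3.4, 5.6),
  group = factor(c("a", "b", "a"), levels = c("a", "b"))
)

# "- 1" drops the intercept, so the factor gets one 0/1 column per level
# while the numeric column passes through unchanged.
mm <- model.matrix(~ size + group - 1, data = df)

colnames(mm)   # "size" "groupa" "groupb"
mm[, "groupa"] # 1 for rows where group == "a", otherwise 0
```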

Details

The function only alters variables of type factor. Contrary to how it may sound, this offers the user greater flexibility, for two reasons: it lets you keep character variables intact, and it forces you to think about the levels of each factor variable rather than picking them straight from the input data.
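The difference between letting the data pick the levels and choosing them yourself can be seen with base R's factor(). The vector below is a made-up example, not data from the package.

```r
# A character vector; by the rule above, a column like this would be left as is.
colour_chr <- c("red", "blue", "red")
is.factor(colour_chr)  # FALSE

# Levels taken straight from the data: only the values that happen to occur.
from_data <- factor(colour_chr)
levels(from_data)

# Levels chosen deliberately: "green" is declared even though no row has it,
# so it would still be a known level when building dummy columns.
chosen <- factor(colour_chr, levels = c("red", "green", "blue"))
levels(chosen)
```

Converting a character column with explicit levels before calling create_model_matrix is one way to decide exactly which dummy columns can appear.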

Examples

library(modellingTools)
x <- simple_bin(iris, bins = 3)
create_model_matrix(x)                      # matrix output (the default)
create_model_matrix(x, matrix_out = FALSE)  # tbl output