dataPreparation (version 0.4.3)

one_hot_encoder: One hot encoder

Description

Transform factor columns into 0/1 columns, with one new column per distinct value of each original column.

Usage

one_hot_encoder(
  dataSet,
  encoding = NULL,
  type = "integer",
  verbose = TRUE,
  drop = FALSE
)

Arguments

dataSet

Matrix, data.frame or data.table

encoding

Result of function build_encoding (list, default to NULL). To apply the same encoding to train and test sets, compute it once with build_encoding beforehand and pass it to both calls. If left as NULL, build_encoding is called internally.

type

Class of the generated columns: "integer" (0L/1L), "numeric" (0/1), or "logical" (TRUE/FALSE) (character, default to "integer")

verbose

Should the function log its progress? (logical, default to TRUE)

drop

Should the original columns be dropped once the encoded columns have been generated? (logical, default to FALSE)
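As a rough illustration of what the type argument controls, the base R sketch below builds one indicator column per distinct value of a vector. one_hot_sketch is a hypothetical helper written for this example only; it is not the package's implementation, and the column names it produces may differ from those generated by one_hot_encoder.

```r
# Hypothetical helper, for illustration only: NOT the dataPreparation implementation.
one_hot_sketch <- function(x, type = "integer") {
  x <- as.factor(x)
  # Build one logical indicator vector per level of the factor
  indicators <- lapply(levels(x), function(lv) x == lv)
  names(indicators) <- levels(x)
  # Pick the conversion matching the requested storage class
  convert <- switch(type,
    integer = as.integer,
    numeric = as.numeric,
    logical = as.logical,
    stop("type must be 'integer', 'numeric' or 'logical'")
  )
  as.data.frame(lapply(indicators, convert))
}

marital <- c("Married", "Single", "Married", "Divorced")
encoded_int <- one_hot_sketch(marital, type = "integer")
encoded_lgl <- one_hot_sketch(marital, type = "logical")

print(encoded_int)
print(encoded_lgl)
```

Each row has exactly one indicator set, and only the storage class of the columns changes between the two calls.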

Value

dataSet, edited by reference, with the new encoded columns added.

Details

If you don't want to edit your data set, consider passing copy(dataSet) as the input. Be careful when using this function: it generates as many columns as there are distinct values in each encoded column and might use a lot of RAM. To be safe, you can use the min_frequency parameter of build_encoding.
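The two precautions above could be combined roughly as follows. This is a sketch under assumptions: it presumes the data.table and dataPreparation packages are installed, and the 0.01 threshold passed to min_frequency is an arbitrary value chosen for illustration, not a recommendation.

```r
library(data.table)       # provides copy()
library(dataPreparation)

data(messy_adult)

# Keep only sufficiently frequent categories to limit the number of new columns.
# The 0.01 threshold is an assumed, illustrative value.
encoding <- build_encoding(messy_adult, cols = "occupation",
                           min_frequency = 0.01, verbose = FALSE)

# Encode a copy so that messy_adult itself is not edited by reference
n_cols_before <- ncol(messy_adult)
encoded <- one_hot_encoder(copy(messy_adult), encoding = encoding,
                           drop = TRUE, verbose = FALSE)

# The original should keep its columns; the copy should have lost "occupation"
ncol(messy_adult) == n_cols_before
"occupation" %in% names(encoded)
```

Copying doubles the memory used by the data set for the duration of the call, so it trades RAM for the safety of leaving the original untouched.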

Examples

# NOT RUN {
library(dataPreparation)
data(messy_adult)

# Compute encoding
encoding <- build_encoding(messy_adult, cols = c("marital", "occupation"), verbose = TRUE)

# Apply it
messy_adult <- one_hot_encoder(messy_adult, encoding = encoding, drop = TRUE)

# Apply same encoding to adult
data(adult)
adult <- one_hot_encoder(adult, encoding = encoding, drop = TRUE)

# To get logical (TRUE/FALSE) columns instead, set type = "logical".
# Reload adult first: the previous call dropped the encoded columns.
data(adult)
adult <- one_hot_encoder(adult, encoding = encoding, type = "logical", drop = TRUE)
# }
