dataPreparation (version 0.4.3)

fastDiscretization: Discretization

Description

Discretization of numeric variables (either equal_width or equal_freq).
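To make the two binning types concrete, here is a minimal base-R sketch using cut() and quantile(). It does not call the package and makes no claim about how build_bins computes its breaks; the vector x and the choice of two bins are illustrative assumptions.

```r
# Illustrative data: nine small values and one large outlier
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Equal width: split the range [min, max] into 2 intervals of the same length
width_breaks <- seq(min(x), max(x), length.out = 3)
equal_width <- cut(x, breaks = width_breaks, include.lowest = TRUE)

# Equal frequency: place the breaks at quantiles so each bin holds
# roughly the same number of observations
freq_breaks <- unname(quantile(x, probs = c(0, 0.5, 1)))
equal_freq <- cut(x, breaks = freq_breaks, include.lowest = TRUE)

print(table(equal_width))
print(table(equal_freq))
```

With skewed data like this, equal-width bins leave almost every observation in the first interval, while equal-frequency bins split the observations evenly.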

Usage

fastDiscretization(dataSet, bins = NULL, verbose = TRUE)

Arguments

dataSet

Matrix, data.frame or data.table

bins

Result of the function build_bins (list, defaults to NULL). To apply the same discretization to train and test sets, compute the bins with build_bins beforehand and pass them here. If left as NULL, build_bins is called internally. bins can also be written by hand, with care.
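The reason for computing bins once and reusing them can be sketched in base R: breaks learned on the training data are applied unchanged to the test data, so both end up with the same categories. The vectors train and test are made up for illustration, and the open-ended -Inf/Inf limits are an assumption of this sketch, not a description of what build_bins returns.

```r
# Illustrative train and test values for a single numeric column
train <- c(18, 25, 33, 47, 52, 61)
test  <- c(15, 30, 70)

# Learn the inner break points on the training data only
inner <- unname(quantile(train, probs = c(1 / 3, 2 / 3)))

# Open-ended outer limits so unseen test values still fall into a bin
breaks <- c(-Inf, inner, Inf)

# Apply the very same breaks to both sets
train_binned <- cut(train, breaks = breaks)
test_binned  <- cut(test, breaks = breaks)

# Both factors share identical levels, so the categories are comparable
print(identical(levels(train_binned), levels(test_binned)))
```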

verbose

Should the algorithm talk? (Logical, default to TRUE)

Value

The same dataset, discretized by reference. If you don't want the original to be modified, pass a copy: dataSet = copy(dataSet).
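What "by reference" means in practice can be shown with data.table alone. The sketch below assumes the data.table package is installed; discretize_age is a hypothetical helper written for this illustration and is not part of dataPreparation, nor a description of its internals.

```r
library(data.table)

dt <- data.table(age = c(23, 41, 67))

# Hypothetical helper that edits its argument in place using `:=`
discretize_age <- function(d) {
  d[, age := cut(age, breaks = c(0, 40, Inf))]
  invisible(d)
}

# Passing a copy leaves the original untouched
binned_copy <- discretize_age(copy(dt))
numeric_after_copy <- is.numeric(dt$age)

# Passing the object itself changes it, even without reassigning the result
discretize_age(dt)
numeric_after_inplace <- is.numeric(dt$age)

print(c(numeric_after_copy, numeric_after_inplace))
```

The first call works on a copy, so dt keeps its numeric column; the second call replaces the column in dt itself.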

Details

NAs are put in a separate NA category.
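A dedicated NA category can be pictured with base R: cut() leaves missing values as NA, and addNA() turns NA into an explicit factor level. This is only an analogy under that assumption; how the package labels its NA category is not stated here.

```r
# Illustrative values with two missing entries
x <- c(12, NA, 55, 31, NA)

# cut() keeps missing values as NA, outside every bin
binned <- cut(x, breaks = c(0, 40, Inf))

# addNA() makes NA an explicit level, so missing values form their own category
binned_with_na <- addNA(binned)

print(levels(binned_with_na))
print(table(binned_with_na))
```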

Examples

library(dataPreparation)

# Load data
data(messy_adult)
head(messy_adult)

# Compute bins
bins <- build_bins(messy_adult, cols = "auto", n_bins = 5, type = "equal_freq")

# Discretize
messy_adult <- fastDiscretization(messy_adult, bins = bins)

# Check the result
head(messy_adult)

# Example with hand-written bins
data("adult")
adult <- fastDiscretization(adult, bins = list(age = c(0, 40, +Inf)))
print(table(adult$age))