lares (version 5.2.13)

ohe_commas: One Hot Encoding for a Vector with Comma Separated Values

Description

This function one hot encodes one or more variables whose values are separated by commas (or by another separator), so that each distinct value gets its own column.

Usage

ohe_commas(df, ..., sep = ",", noval = "NoVal", remove = FALSE)

Value

data.frame in which all features are either numerical by nature or transformed with one hot encoding.

Arguments

df

Dataframe. May contain one or more columns with comma separated values, which will be split into one hot encoded columns.

...

Variables. Which variables to split into new columns?

sep

Character. Which regular expression separates the elements? Regex metacharacters must be escaped, e.g. "\\+" for a literal plus sign.

noval

Character. Text used in place of missing values.

remove

Boolean. Remove original variables?
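
Because sep is described as a regular expression, the choice of pattern decides how each string is broken up. The sketch below uses only base R's strsplit() to show the effect of escaping and of absorbing whitespace around the separator. It illustrates regex separators in general and is not taken from the internals of ohe_commas().

```r
# Base R only: shows how a regex separator splits strings.
# This is an illustration, not the implementation used by ohe_commas().
z <- c("AA+BB+AA", "AA", "BB+AA")

# "+" is a regex quantifier, so it is escaped to match a literal plus sign.
parts <- strsplit(z, "\\+")
parts[[1]]
# "AA" "BB" "AA"

# A pattern can also absorb optional whitespace around a comma.
x <- c("AA, D", "AA,B", "B,  D")
spaced <- strsplit(x, "\\s*,\\s*")
spaced[[3]]
# "B" "D"
```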

See Also

Other Data Wrangling: balance_data(), categ_reducer(), cleanText(), date_cuts(), date_feats(), file_name(), formatHTML(), holidays(), impute(), left(), normalize(), num_abbr(), ohse(), quants(), removenacols(), replaceall(), replacefactor(), textFeats(), textTokenizer(), vector2text(), year_month(), zerovar()

Other One Hot Encoding: date_feats(), holidays(), ohse()

Examples

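library(lares)

df <- data.frame(
  id = 1:5,
  x = c("AA, D", "AA,B", "B,  D", "A,D,B", NA),
  z = c("AA+BB+AA", "AA", "BB,  AA", NA, "BB+AA")
)
# Encode x and drop the original column
ohe_commas(df, x, remove = TRUE)
# z is mostly "+" separated; the plus sign is escaped because sep is a regular expression
ohe_commas(df, z, sep = "\\+")
# Encode several variables at once with the default separator
ohe_commas(df, x, z)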
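
To make the idea concrete without relying on the package, here is a minimal base R sketch of one hot encoding a separated-value vector. The helper name one_hot_split and its choices (trimming spaces, logical indicator columns, sorted column order) are assumptions made for illustration. The columns, names, and types that ohe_commas() actually returns may differ.

```r
# Hypothetical helper for illustration only; it is not part of lares.
one_hot_split <- function(values, sep = ",", noval = "NoVal") {
  values <- as.character(values)
  values[is.na(values)] <- noval                    # label missing entries
  tokens <- lapply(strsplit(values, sep), trimws)   # split, then strip spaces
  lvls <- sort(unique(unlist(tokens)))              # distinct values found
  # One logical vector per distinct value: TRUE when the row contains it
  cols <- lapply(lvls, function(l) {
    vapply(tokens, function(tk) l %in% tk, logical(1))
  })
  names(cols) <- lvls
  data.frame(cols, check.names = FALSE)
}

x <- c("AA, D", "AA,B", "B,  D", "A,D,B", NA)
res <- one_hot_split(x)
res
```

Each row of res marks which of the distinct values appear in the corresponding element of x, and the NA element is flagged only in the NoVal column.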