dataPreparation (version 0.4.3)

fastFilterVariables: Filtering useless variables

Description

Delete columns of your dataSet that are constant or duplicates of another column.

Usage

fastFilterVariables(dataSet, level = 3, keep_cols = NULL, verbose = TRUE, ...)

Arguments

dataSet

Matrix, data.frame or data.table

level

Which columns to filter: 1 = constant columns; 2 = constant and duplicated columns; 3 = constant, duplicated and bijection columns; 4 = constant, duplicated, bijection and included columns (numeric, default to 3)

keep_cols

List of columns not to drop (list of character, default to NULL)

verbose

Should the algorithm talk (logical or 1 or 2, default to TRUE)

...

optional parameters to be passed to the function when called from another function
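
The four kinds of column named under level can be illustrated with a short base R sketch. This is not the package's implementation: the helper names (is_constant, is_double, is_function_of, is_bijection) and the toy columns are hypothetical, and reading "included" as "fully determined by another column" is an assumption.

```r
# Toy data with hypothetical column names.
toy <- data.frame(
  const  = rep(1, 6),                                # constant
  x      = c(1, 2, 3, 1, 2, 3),
  x_copy = c(1, 2, 3, 1, 2, 3),                      # duplicate of x
  x_lab  = c("a", "b", "c", "a", "b", "c"),          # bijection of x
  x_odd  = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),  # determined by x
  stringsAsFactors = FALSE
)

# A column is constant when it holds a single distinct value.
is_constant <- function(v) length(unique(v)) == 1

# Two columns are duplicates when they hold exactly the same values.
is_double <- function(a, b) identical(a, b)

# b is a function of a when every value of a is paired with one value of b.
is_function_of <- function(a, b) {
  nrow(unique(data.frame(a, b))) == length(unique(a))
}

# A bijection is a one-to-one mapping in both directions.
is_bijection <- function(a, b) is_function_of(a, b) && is_function_of(b, a)

is_constant(toy$const)            # TRUE
is_double(toy$x, toy$x_copy)      # TRUE
is_bijection(toy$x, toy$x_lab)    # TRUE
is_function_of(toy$x, toy$x_odd)  # TRUE: x_odd adds no information to x
is_bijection(toy$x, toy$x_odd)    # FALSE: x cannot be rebuilt from x_odd
```

Under that reading, each level adds one more check on top of the previous ones, so higher levels can drop more columns.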

Value

The same dataSet but with fewer columns. Columns that are constant, duplicated, or a bijection of another column have been deleted.

Details

verbose can be set to 2 to get full details from the functions that are called; otherwise they don't log. (verbose = 1 is equivalent to verbose = TRUE.)
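
As a minimal sketch of how the arguments combine, the calls below use only the parameters listed in Usage. The toy data and the helper make_toy are hypothetical, and the calls run only if the dataPreparation package is installed. The data is rebuilt for each call in case the input is modified in place.

```r
# Hypothetical helper that builds a small toy data set.
make_toy <- function() {
  data.frame(
    id    = rep(1, 4),              # constant column
    value = c(0.5, 1.5, 2.5, 3.5),
    copy  = c(0.5, 1.5, 2.5, 3.5)   # exact duplicate of value
  )
}

if (requireNamespace("dataPreparation", quietly = TRUE)) {
  # Only look for constant columns, without any logging.
  res_level1 <- dataPreparation::fastFilterVariables(
    make_toy(), level = 1, verbose = FALSE
  )

  # Default level, but ask that "id" not be dropped, with full logging.
  res_keep <- dataPreparation::fastFilterVariables(
    make_toy(), keep_cols = "id", verbose = 2
  )

  # Compare which columns survive each call.
  print(names(res_level1))
  print(names(res_keep))
}
```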

Examples

# NOT RUN {
# First let's build a data.frame with a constant column (col1), a duplicated column (col4) and a bijection (col5)
df <- data.frame(col1 = 1, col2 = rnorm(1e6), col3 = sample(c(1, 2), 1e6, replace = TRUE))
df$col4 <- df$col2
df$col5[df$col3 == 1] <- "a"
df$col5[df$col3 == 2] <- "b" # Same info as in col3 but with a for 1 and b for 2
head(df)

# Let's filter columns:
df <- fastFilterVariables(df)
head(df)
# }