dataPreparation (version 0.4.3)

fastHandleNa: Handle NA values

Description

Handle NA values depending on the class of the column.

Usage

fastHandleNa(
  dataSet,
  set_num = 0,
  set_logical = FALSE,
  set_char = "",
  verbose = TRUE
)

Arguments

dataSet

Matrix, data.frame or data.table

set_num

NA replacement for numeric columns (numeric or function, defaults to 0)

set_logical

NA replacement for logical columns (logical or function, defaults to FALSE)

set_char

NA replacement for character columns (character or function, defaults to "")

verbose

Should the function print progress messages? (logical, defaults to TRUE)

Value

dataSet as a data.table with NAs replaced.

Details

To preserve RAM, this function edits dataSet by reference. To keep the original object unchanged, pass a copy of it (see copy). If you provide a function, it is applied to the full column, NAs included, so the function must handle NAs itself. For factor columns, NA is added to the list of levels.
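
The points above can be illustrated with a minimal sketch that uses only data.table and base R, not fastHandleNa itself. The table `dt`, the vector `x` and the factor `f` are hypothetical names introduced for illustration, and `set()` and `addNA()` are only assumed to resemble what the package does internally.

```r
library(data.table)

# Hypothetical example table, not taken from the package
dt <- data.table(x = c(1, 2, NA))

# 1. By-reference modification: set() changes dt in place, no copy is made
kept <- copy(dt)   # independent copy taken before the edit
set(dt, i = which(is.na(dt[["x"]])), j = "x", value = 0)
# dt$x is now 1, 2, 0 while kept$x still contains the NA

# 2. A replacement function receives the whole column, NAs included
x <- c(1, 2, NA)
min(x)                 # NA, so NAs would be replaced by NA
min(x, na.rm = TRUE)   # 1

# 3. addNA() (base R) turns NA into an explicit factor level
f <- addNA(factor(c("a", NA, "b")))
levels(f)              # "a" "b" NA
```

This is why the examples below wrap dataSet in copy() before each call: without it, the first call would already have replaced the NAs in the original object.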

Examples

# NOT RUN {
# Build a small example dataSet
require(data.table)
dataSet <- data.table(numCol = c(1, 2, 3, NA),
                   charCol = c("", "a", NA, "c"),
                   booleanCol = c(TRUE, NA, FALSE, NA))

# To set NAs to 0, FALSE and "" (respectively for numeric, logical, character)
fastHandleNa(copy(dataSet))

# In a character column, to set NAs to "missing"
fastHandleNa(copy(dataSet), set_char = "missing")

# In a numeric column, to set NAs to the minimum value of the column
fastHandleNa(copy(dataSet), set_num = min) # Won't work: min(c(1, NA)) is NA, so NAs are replaced by NA
fastHandleNa(copy(dataSet), set_num = function(x) min(x, na.rm = TRUE)) # Now the function handles NAs

# In a numeric column, to set NAs to the share of NA values in the column
rateNA <- function(x) {sum(is.na(x)) / length(x)}
fastHandleNa(copy(dataSet), set_num = rateNA)

# }