bread (version 0.4.1)

bnumrange: Pre-filters a data file by numerical column range before loading it into memory

Description

Simple wrapper for data.table::fread() that uses the Unix 'awk' command to filter rows by numerical range before the file is loaded. This is useful when a file is too large for your available memory (for example, when loading it fails with the 'cannot allocate vector of size' error).
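
The general idea can be sketched by hand with the 'cmd' argument of data.table::fread(). This is only an illustration of the technique under stated assumptions, not the package's actual implementation: the sample file, its column names and the awk expression are made up for the example, and it assumes a Unix-like system with 'awk' on the PATH.

```r
library(data.table)

# Hypothetical sample file standing in for a large ';'-separated data set
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("YEAR;COLOR;PRICE",
    "2004;red;1600",
    "2007;blue;1800",
    "2011;green;2100"),
  tmp
)

# Keep the header line (NR == 1) and the rows whose first field is
# between 2006 and 2010, both bounds included
awk_cmd <- sprintf(
  "awk -F';' 'NR == 1 || ($1 >= 2006 && $1 <= 2010)' %s",
  shQuote(tmp)
)

# fread() runs the command and parses only its output,
# so rows rejected by awk are never loaded into R
filtered <- fread(cmd = awk_cmd, sep = ";", header = TRUE)
```

Because the filtering happens in the shell, R's memory use depends on the size of the filtered result rather than on the size of the original file.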

Usage

bnumrange(
  file = NULL,
  range_min = NULL,
  range_max = NULL,
  numrange_columns = NULL,
  ...
)

Value

A data frame

Arguments

file

String. Name or full path to a file compatible with data.table::fread()

range_min

Numeric vector. One or several minimum values used to filter the data from the input file. The comparison is inclusive (greater than OR EQUAL to the value). Each element should correspond to the element of numrange_columns at the same position.

range_max

Numeric vector. One or several maximum values used to filter the data from the input file. The comparison is inclusive (less than OR EQUAL to the value). Each element should correspond to the element of numrange_columns at the same position.

numrange_columns

Vector of strings or numeric. The columns to filter, given by name or by index number. Each element should correspond to the range_min and range_max values at the same position.

...

Additional arguments passed to data.table::fread(), such as 'sep' and 'dec'.

Warning

The value comparisons are inclusive: a row is kept when its value is greater than OR EQUAL to range_min and less than OR EQUAL to range_max.
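
The inclusive comparison and the position-by-position pairing of range_min, range_max and numrange_columns can be pictured with a small base R sketch on an in-memory data frame. The data are invented for the example, and combining the per-column conditions with AND is an assumption made here for illustration; the sketch does not call the package itself.

```r
# Hypothetical data; in practice the rows would come from the input file
df <- data.frame(
  YEAR  = c(2005, 2006, 2008, 2010, 2011),
  PRICE = c(1400, 1500, 1990, 1700, 1800)
)

range_min <- c(2006, 1500)
range_max <- c(2010, 1990)
cols      <- c("YEAR", "PRICE")

# Element i of range_min and range_max applies to element i of cols
keep <- rep(TRUE, nrow(df))
for (i in seq_along(cols)) {
  values <- df[[cols[i]]]
  # >= and <= make both bounds inclusive
  keep <- keep & values >= range_min[i] & values <= range_max[i]
}

kept <- df[keep, ]
```

Rows sitting exactly on a bound, such as YEAR 2006 with PRICE 1500, are kept, while YEAR 2005 and YEAR 2011 fall outside the range and are dropped.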

Examples

file <- system.file('extdata', 'test.csv', package = 'bread')

## Filtering with only min value


## Filtering on 2 columns
bnumrange(file = file, range_min = c(2006, 1500), range_max = c(2010, 1990),
      numrange_columns = c(1,3))
bnumrange(file = file, range_min = c(2000, 1500), range_max = c(2005, 1990),
      numrange_columns = c('YEAR', 'PRICE'), sep = ';')
