MESS (version 0.5.12)

cumsumbinning: Binning based on cumulative sum with reset above threshold

Description

Fast binning of cumulative vector sum with new groups when the sum passes a threshold or the group size becomes too large

Usage

cumsumbinning(x, threshold, cutwhenpassed = FALSE, maxgroupsize = NULL)

Value

An integer vector giving the group indices

Arguments

x

A numeric vector of values to be binned.

threshold

The value of the threshold that the cumulative group sum must not cross OR the threshold that each group sum must pass (when the argument cutwhenpassed is set to TRUE).

cutwhenpassed

A logical. If FALSE (the default), the threshold is the upper limit of each group sum. If TRUE, the threshold is the value that each group sum needs to pass before a new group is started.

maxgroupsize

An integer that defines the maximum number of elements in each group. NAs count as part of each group but do not add to the group sum. NULL (the default) corresponds to no group size limits.
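The interplay of the three arguments above can be illustrated with a small pure-R sketch. This is not the MESS implementation (which is compiled code); the function name cumsum_bin_sketch is hypothetical, and the use of strict inequalities at the threshold is an assumption based on the wording "must not cross" and "needs to pass". The package may behave differently at exact boundaries.

```r
# Hypothetical pure-R sketch of the binning rule described above.
# NOT the MESS implementation; boundary handling (strict ">") is an assumption.
cumsum_bin_sketch <- function(x, threshold, cutwhenpassed = FALSE,
                              maxgroupsize = NULL) {
  maxsize <- if (is.null(maxgroupsize)) Inf else maxgroupsize
  groups <- integer(length(x))
  group <- 1L  # index of the current group
  total <- 0   # running sum of the current group
  size <- 0L   # number of elements in the current group

  for (i in seq_along(x)) {
    # NAs take up a slot in the group but add nothing to the sum
    value <- if (is.na(x[i])) 0 else x[i]

    crossed <- if (cutwhenpassed) {
      total > threshold          # previous group has already passed the threshold
    } else {
      total + value > threshold  # adding this value would cross the threshold
    }

    # Start a new group when the threshold rule fires or the group is full
    if (size > 0L && (crossed || size >= maxsize)) {
      group <- group + 1L
      total <- 0
      size <- 0L
    }

    total <- total + value
    size <- size + 1L
    groups[i] <- group
  }
  groups
}

x <- c(3, 4, 5, 12, 1, 5, 3)
cumsum_bin_sketch(x, 10)                        # 1 1 2 3 4 4 4
cumsum_bin_sketch(x, 10, cutwhenpassed = TRUE)  # 1 1 1 2 3 3 3
cumsum_bin_sketch(rep(1, 5), 10, maxgroupsize = 2)  # 1 1 2 2 3
```

The outputs in the comments are those of the sketch itself, traced by hand; they are not claimed to be the output of cumsumbinning.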

Author

Claus Ekstrom <claus@rprimer.dk>

Details

Missing values (NA, Inf, NaN) are completely disregarded and pairwise complete cases are used f

Examples

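# 20 random integers between 1 and 10
set.seed(1)
x <- sample(10, 20, replace = TRUE)
# Threshold of 15 used as the upper limit of each group sum
cumsumbinning(x, 15)
# Same threshold, with at most 3 elements per group
# (maxgroupsize is named so that 3 is not matched to cutwhenpassed)
cumsumbinning(x, 15, maxgroupsize = 3)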

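x <- c(3, 4, 5, 12, 1, 5, 3)
# Threshold of 10 as the upper limit of each group sum
cumsumbinning(x, 10)
# Threshold of 10 as the value each group sum needs to pass
cumsumbinning(x, 10, cutwhenpassed = TRUE)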
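
Since the return value is an integer vector of group indices, it can be passed to base R grouping tools to inspect the resulting bins. The following is a minimal sketch; group_sums is a hypothetical helper that is not part of MESS, and the call to cumsumbinning only runs if the package is installed.

```r
# Hypothetical helper (not part of MESS): sum of x within each bin.
group_sums <- function(x, groups) {
  vapply(split(x, groups), sum, numeric(1), na.rm = TRUE)
}

x <- c(3, 4, 5, 12, 1, 5, 3)

# Only call the package function when MESS is available
if (requireNamespace("MESS", quietly = TRUE)) {
  groups <- MESS::cumsumbinning(x, 10)
  print(group_sums(x, groups))  # one sum per bin, named by group index
}
```

Comparing these per-group sums with the threshold is a quick way to check which reading of threshold (upper limit or value to pass) a given call is using.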
