modellingTools (version 0.1.0)

vector_bin: Bin a vector into equal height, equal width, or custom bins

Description

This function essentially calls cut, cut_interval, or cut_number, depending on the values of bins and type. The one major difference is the treatment of missing values: those functions return NA for them, while vector_bin by default returns a separate bin for the missing values.
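The difference in missing-value handling can be illustrated with base R alone. The following is a minimal sketch, not necessarily how vector_bin is implemented: it bins with cut and then folds the NA results into an extra factor level. The level name "missing" follows the na_include argument description; the variable names are illustrative.

```r
# Small example vector with two missing values
x <- c(-0.5, 0.3, NA, 0.8, NA)

# cut() leaves missing inputs as NA in the resulting factor
binned <- cut(x, breaks = c(-1, 0, 1))

# Sketch of a "missing" bin: relabel the NA entries and rebuild the
# factor with one extra level appended after the interval levels
binned_chr <- as.character(binned)
binned_chr[is.na(binned_chr)] <- "missing"
result <- factor(binned_chr, levels = c(levels(binned), "missing"))

table(result)
```

Appending the extra level last keeps the interval levels in their natural order, which tends to be convenient for plotting and tabulation.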

Usage

vector_bin(x, bins, type = "height", na_include = TRUE)

Arguments

x
vector of numeric data to bin
bins
numeric vector. If length 1, then this is taken to be the number of desired bins, computed according to "type". If length > 1, this is taken to be the actual cutpoints desired
type
character, equal to "height" or "width". Only used if length(bins) == 1. If "height", then bins are computed to have roughly equal numbers of observations; else, bins are computed to be of roughly equal width
na_include
logical. If TRUE, then a bin labelled "missing" will be included in the output. Else NA values are dropped
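To make the two values of type concrete, here is a hedged base R sketch of equal-height versus equal-width binning. It assumes quantile-based breaks for "height" and evenly spaced breaks over the range of x for "width"; the exact breaks and labels produced by vector_bin (via cut_number and cut_interval) may differ.

```r
# Skewed data: nine small values and one large outlier
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
n_bins <- 2

# "height": breaks at quantiles, so each bin holds roughly the same
# number of observations
height_breaks <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1))
by_height <- cut(x, breaks = height_breaks, include.lowest = TRUE)

# "width": breaks evenly spaced across the range, so each bin spans
# the same interval length regardless of how many values fall in it
width_breaks <- seq(min(x), max(x), length.out = n_bins + 1)
by_width <- cut(x, breaks = width_breaks, include.lowest = TRUE)

table(by_height)  # counts: 5 and 5
table(by_width)   # counts: 9 and 1
```

With skewed data like this, equal-width bins can end up nearly empty, which is one reason equal-height binning is the default here.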

Value

The input vector x, with each value replaced by the bin it falls into. The result is a factor rather than a numeric vector.

See Also

cut, cut_number, cut_interval

Other discretization: binned_data_cutpoints, get_vector_cutpoints, simple_bin

Examples

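library(modellingTools)

set.seed(1)  # make the random example reproducible
x <- rnorm(100)
y <- x
y[sample(1:100, 20)] <- NA  # set 20 randomly chosen values to missing

# Custom cutpoints: cut() versus vector_bin()
cut(x, c(-1, 0, 1))
vector_bin(x, bins = c(-1, 0, 1))

# With missing values: cut() returns NA, vector_bin() bins them by default
cut(y, c(-1, 0, 1))
vector_bin(y, bins = c(-1, 0, 1))
vector_bin(y, bins = c(-1, 0, 1), na_include = FALSE)

# Equal-height bins (the default type)
ggplot2::cut_number(x, 5)
vector_bin(x, 5)

# Equal-width bins
ggplot2::cut_interval(x, 5)
vector_bin(x, 5, type = "width")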