
tidyfinance (version 0.4.3)

compute_breakpoints: Compute Breakpoints Based on Sorting Variable

Description

[Experimental]

This function computes breakpoints based on a specified sorting variable. It can optionally filter the data by exchange before computing the breakpoints. The function requires either the number of portfolios to create or the specific percentiles for the breakpoints, but not both. It can also handle cases where the sorting variable clusters at the edges of its distribution: all extreme values are assigned to the edge portfolios, and the function then attempts to compute equally populated breakpoints from the remaining values.
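
The bunching problem described above can be illustrated with base R alone. The sketch below is not the tidyfinance implementation. The vector x and the smoothing step are assumptions made for illustration: half of the observations sit at the lower edge, so plain quantile breakpoints coincide, and setting the bunched value aside restores distinct breakpoints.

```r
# Base R illustration only; this is not the tidyfinance implementation.
# Hypothetical sorting variable: half of the observations sit at the lower edge.
x <- c(rep(0, 50), 1:50)

# Plain quintile breakpoints collapse at the bunched value.
naive <- quantile(x, probs = seq(0, 1, length.out = 6), names = FALSE)
naive
anyDuplicated(naive) > 0  # TRUE: several breakpoints coincide

# Assumed smoothing idea: treat the bunched value as its own edge portfolio
# and spread the remaining breakpoints over the other observations.
rest <- x[x > min(x)]
smoothed <- c(
  min(x),
  quantile(rest, probs = seq(0, 1, length.out = 5), names = FALSE)
)
smoothed
```

With the bunched zeros set aside, the remaining four portfolios are equally populated, which is the behaviour the description attributes to the optional smoothing.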

Usage

compute_breakpoints(
  data,
  sorting_variable,
  breakpoint_options,
  data_options = NULL
)

Value

A vector of breakpoints of the desired length.
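
A breakpoint vector is typically used to map each observation to a portfolio. The following base R sketch shows one way to do that with findInterval(). The breakpoints and values vectors are hypothetical, and the layout of the breakpoint vector (outer bounds included) is an assumption, not a statement about what compute_breakpoints() returns.

```r
# Hypothetical breakpoint vector with outer bounds (an assumption about layout).
breakpoints <- c(1, 20.8, 40.6, 60.4, 80.2, 100)
values <- c(1, 25, 50, 75, 100)

# findInterval() returns the index of the interval that contains each value;
# all.inside = TRUE keeps the maximum value in the top portfolio.
portfolio <- findInterval(values, breakpoints, all.inside = TRUE)
portfolio
```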

Arguments

data

A data frame containing the dataset for breakpoint computation.

sorting_variable

A string specifying the column name in data to be used for determining breakpoints.

breakpoint_options

A named list of options for the breakpoints, as created by breakpoint_options(). The arguments include:

  • n_portfolios An optional integer specifying the number of equally sized portfolios to create. This parameter is mutually exclusive with percentiles.

  • percentiles An optional numeric vector specifying the percentiles for determining the breakpoints of the portfolios. This parameter is mutually exclusive with n_portfolios.

  • breakpoint_exchanges An optional character vector specifying exchange names to filter the data before computing breakpoints. Exchanges must be stored in a column named exchange in data. If NULL, no filtering is applied.

  • smooth_bunching An optional logical parameter specifying whether to attempt to smooth the non-extreme portfolios when the sorting variable bunches at the extremes (TRUE, the default) or not (FALSE). Smoothing may still fail to produce equally sized portfolios away from the edges when the variable clusters at several values. If sufficiently large bunching is detected, percentiles is ignored and equally spaced portfolios are returned, with a warning.
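
The relationship between n_portfolios, percentiles, and breakpoint_exchanges can be sketched in base R. This is a minimal illustration under stated assumptions, not the package code: it uses quantile() directly, and the data frame and the variable names implied_percentiles, full_sample, and nyse_only are hypothetical.

```r
# Base R sketch; it mirrors the options listed above but is not the package code.
set.seed(42)
data <- data.frame(
  exchange = sample(c("NYSE", "NASDAQ"), 100, replace = TRUE),
  market_cap = 1:100
)

n_portfolios <- 5
# Interior cut points implied by five equally sized portfolios.
grid <- seq(0, 1, length.out = n_portfolios + 1)
implied_percentiles <- grid[-c(1, length(grid))]
implied_percentiles

# Breakpoints from the full sample.
full_sample <- quantile(data$market_cap, probs = implied_percentiles, names = FALSE)

# Breakpoints from NYSE observations only (the breakpoint_exchanges idea).
nyse <- data$market_cap[data$exchange %in% c("NYSE")]
nyse_only <- quantile(nyse, probs = implied_percentiles, names = FALSE)

full_sample
nyse_only
```

In this sketch, requesting five portfolios and passing the percentiles 0.2, 0.4, 0.6, and 0.8 lead to the same cut points, which is why the two options are mutually exclusive.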

data_options

A named list of data_options with characters, indicating the column names required to run this function. The required column name identifies the exchange. Defaults to exchange = exchange.

Examples

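library(tidyfinance)

set.seed(1234)  # make the random exchange labels reproducible
data <- data.frame(
  id = 1:100,
  exchange = sample(c("NYSE", "NASDAQ"), 100, replace = TRUE),
  market_cap = 1:100
)

# Breakpoints for five equally sized portfolios
compute_breakpoints(data, "market_cap", breakpoint_options(n_portfolios = 5))

# Breakpoints at explicit percentiles, computed from NYSE observations only
compute_breakpoints(
  data, "market_cap",
  breakpoint_options(
    percentiles = c(0.2, 0.4, 0.6, 0.8),
    breakpoint_exchanges = c("NYSE")
  )
)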
