Learn R Programming

rearrr (version 0.3.5)

expand_distances_each: Expand the distances to an origin in each dimension

Description

lifecycle::badge("experimental")

Moves the data points in n-dimensional space such that their distance to the specified origin is increased/decreased in each dimension separately. A `multiplier` greater than 1 leads to expansion, while a positive `multiplier` lower than 1 leads to contraction.

The origin can be supplied as coordinates or as a function that returns coordinates. The latter can be useful when supplying a grouped data.frame and expanding around e.g. the centroid of each group.

The multipliers/exponents can be supplied as constant(s) or as a function that returns constants. The latter can be useful when supplying a grouped data.frame and the multiplier/exponent depends on the data in the groups. If supplying multiple constants, there must be one per dimension (length of `cols`).

For expansion of the multidimensional distance, use expand_distances().

NOTE: When exponentiating, the default is to first add 1 or -1 (depending on the sign of the distance) to the distances, to ensure expansion even when the distance is between -1 and 1. If you need the purely exponentiated distances, disable `add_one_exp`.

Usage

expand_distances_each(
  data,
  cols = NULL,
  multipliers = NULL,
  multipliers_fn = NULL,
  origin = NULL,
  origin_fn = NULL,
  exponentiate = FALSE,
  add_one_exp = TRUE,
  suffix = "_expanded",
  keep_original = TRUE,
  mult_col_name = ifelse(isTRUE(exponentiate), ".exponents", ".multipliers"),
  origin_col_name = ".origin",
  overwrite = FALSE
)

Value

data.frame (tibble) with the expanded columns, along with the applied multiplier/exponent and origin coordinates.

Arguments

data

data.frame or vector.

cols

Names of columns in `data` to expand. Each column is considered a dimension to expand in.

multipliers

Constant(s) to multiply/exponentiate the distance to the origin by. A scalar to use in all dimensions or a vector with one scalar per dimension.

N.B. When `exponentiate` is TRUE, the `multipliers` become exponents.

multipliers_fn

Function for finding the `multipliers`.

Input: Each column will be passed as a vector in the order of `cols`.

Output: A numeric vector with one element per dimension.

Just as for `origin_fn`, it can be created with create_origin_fn() if you want to apply the same function to each dimension. See `origin_fn`.

origin

Coordinates of the origin to expand around. A scalar to use in all dimensions or a vector with one scalar per dimension.

N.B. Ignored when `origin_fn` is not NULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as a vector in the order of `cols`.

Output: A vector with one scalar per dimension.

Can be created with create_origin_fn() if you want to apply the same function to each dimension.

E.g. `create_origin_fn(median)` would find the median of each column.

Built-in functions are centroid(), most_centered(), and midrange()

exponentiate

Whether to exponentiate instead of multiplying. (Logical)

add_one_exp

Whether to add the sign (either 1 or -1) before exponentiating to ensure the values don't contract. The added value is subtracted after the exponentiation. (Logical)

Exponentiation becomes:

x <- x + sign(x)

x <- sign(x) * abs(x) ^ multiplier

x <- x - sign(x)

N.B. Ignored when `exponentiate` is FALSE.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e. "") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

mult_col_name

Name of new column with the multiplier(s). If NULL, no column is added.

origin_col_name

Name of new column with the origin coordinates. If NULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Author

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

Details

For each value of each dimension (column), either multiply or exponentiate by the multiplier:

# Multiplication

x <- x * multiplier

# Exponentiation

x <- sign(x) * abs(x) ^ multiplier

Note: By default (when `add_one_exp` is TRUE), we add the sign (1 / -1) of the value before the exponentiation and subtract it afterwards. See `add_one_exp`.

See Also

Other mutate functions: apply_transformation_matrix(), cluster_groups(), dim_values(), expand_distances(), flip_values(), roll_values(), rotate_2d(), rotate_3d(), shear_2d(), shear_3d(), swirl_2d(), swirl_3d()

Other expander functions: expand_distances()

Other distance functions: closest_to(), dim_values(), distance(), expand_distances(), furthest_from(), swirl_2d(), swirl_3d()

Examples

Run this code
# Attach packages
library(rearrr)
library(dplyr)
library(purrr)
has_ggplot <- require(ggplot2)  # Attach if installed

# Set seed
set.seed(1)

# Create a data frame
df <- data.frame(
  "x" = runif(20),
  "y" = runif(20),
  "g" = rep(1:4, each = 5)
)

# Expand values in the two dimensions (x and y)
# With the origin at x=0.5, y=0.5
# We expand x by 2 and y by 4
expand_distances_each(
  data = df,
  cols = c("x", "y"),
  multipliers = c(2, 4),
  origin = c(0.5, 0.5)
)

# Expand values in the two dimensions (x and y)
# With the origin at x=0.5, y=0.5
# We expand both by 3
expand_distances_each(
  data = df,
  cols = c("x", "y"),
  multipliers = 3,
  origin = 0.5
)

# Expand values in one dimension (x)
# With the origin at x=0.5
# We expand by 3
expand_distances_each(
  data = df,
  cols = c("x"),
  multipliers = 3,
  origin = 0.5
)

# Expand x and y around the centroid
# We use exponentiation for a more drastic effect
# The add_one_exp makes sure it expands
# even when x or y is in the range [>-1, <1]
# To compare multiple exponents, we wrap the
# call in purrr::map_dfr
df_expanded <- purrr::map_dfr(
  .x = c(1, 2.0, 3.0, 4.0),
  .f = function(exponent) {
    expand_distances_each(
      data = df,
      cols = c("x", "y"),
      multipliers = exponent,
      origin_fn = centroid,
      exponentiate = TRUE,
      add_one_exp = TRUE
    )
  }
)
df_expanded

# Plot the expansions of x and y around the overall centroid
if (has_ggplot){
  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.exponents))) +
    geom_vline(
      xintercept = df_expanded[[".origin"]][[1]][[1]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_hline(
      yintercept = df_expanded[[".origin"]][[1]][[2]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "Exponent")
}

# Expand x and y around the centroid using multiplication
# To compare multiple multipliers, we wrap the
# call in purrr::map_dfr
df_expanded <- purrr::map_dfr(
  .x = c(1, 2.0, 3.0, 4.0),
  .f = function(multiplier) {
    expand_distances_each(df,
      cols = c("x", "y"),
      multipliers = multiplier,
      origin_fn = centroid,
      exponentiate = FALSE
    )
  }
)
df_expanded

# Plot the expansions of x and y around the overall centroid
if (has_ggplot){
ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.multipliers))) +
    geom_vline(
      xintercept = df_expanded[[".origin"]][[1]][[1]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_hline(
      yintercept = df_expanded[[".origin"]][[1]][[2]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "Multiplier")
}

# Expand x and y with different multipliers
# around the centroid using multiplication
df_expanded <- expand_distances_each(
  df,
  cols = c("x", "y"),
  multipliers = c(1.25, 10),
  origin_fn = centroid,
  exponentiate = FALSE
)
df_expanded

# Plot the expansions of x and y around the overall centroid
# Note how the y axis is expanded a lot more than the x-axis
if (has_ggplot){
  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded)) +
    geom_vline(
      xintercept = df_expanded[[".origin"]][[1]][[1]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_hline(
      yintercept = df_expanded[[".origin"]][[1]][[2]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_line(aes(color = "Expanded")) +
    geom_point(aes(color = "Expanded")) +
    geom_line(aes(x = x, y = y, color = "Original")) +
    geom_point(aes(x = x, y = y, color = "Original")) +
    theme_minimal() +
    labs(x = "x", y = "y", color = "Multiplier")
}

#
# Contraction
#

# Group-wise contraction to create clusters
df_contracted <- df %>%
  dplyr::group_by(g) %>%
  expand_distances_each(
    cols = c("x", "y"),
    multipliers = 0.07,
    suffix = "_contracted",
    origin_fn = centroid
  )

# Plot the clustered data point on top of the original data points
if (has_ggplot){
  ggplot(df_contracted, aes(x = x_contracted, y = y_contracted, color = factor(g))) +
    geom_point(aes(x = x, y = y, color = factor(g)), alpha = 0.3, shape = 16) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "g")
}

Run the code above in your browser using DataLab