
rearrr (version 0.3.4)

dim_values: Dim values of a dimension based on the distance to an n-dimensional origin

Description

[Experimental]

Dims the values in the dimming dimension (last by default) based on the data point's distance to the origin.

Distance is calculated as: $$d(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 + \dots}$$

The default `dimming_fn` multiplies by the inverse-square of \(1 + distance\) and is calculated as: $$\mathrm{dimming\_fn}(x, d) = x \cdot \frac{1}{(1 + d)^2}$$

Where \(x\) is the value in the dimming dimension. The \(+1\) is added to ensure that values are dimmed even when the distance is below 1. The quickest way to change the exponent or the \(+1\) is with create_dimming_fn().
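To see what the \(+1\) does in practice, the following base R sketch evaluates the default formula at a few distances and compares it with a plain inverse-square that omits the \(+1\). The helper names `dim_default` and `dim_no_offset` are illustrative only and are not part of rearrr.

```r
# Default formula from the description: x * (1 / (1 + d)^2)
dim_default <- function(x, d) x * (1 / (1 + d)^2)

# Hypothetical variant without the +1, for comparison only
dim_no_offset <- function(x, d) x * (1 / d^2)

x <- 1
d <- c(0.5, 1, 3)

dimmed_default <- dim_default(x, d)
dimmed_no_offset <- dim_no_offset(x, d)

# With the +1, every value with a positive distance is reduced
print(dimmed_default)

# Without it, a distance below 1 increases the value instead
print(dimmed_no_offset)
```

At a distance of 0.5, the variant without the offset returns 4 (an increase), while the default formula returns roughly 0.44.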

The origin can be supplied as coordinates or as a function that returns coordinates. The latter can be useful when supplying a grouped data.frame and dimming around e.g. the centroid of each group.

Usage

dim_values(
  data,
  cols,
  dimming_fn = create_dimming_fn(numerator = 1, exponent = 2, add_to_distance = 1),
  origin = NULL,
  origin_fn = NULL,
  dim_col = tail(cols, 1),
  suffix = "_dimmed",
  keep_original = TRUE,
  origin_col_name = ".origin",
  overwrite = FALSE
)

Value

data.frame (tibble) with the dimmed column, along with the origin coordinates.

Arguments

data

data.frame or vector.

cols

Names of columns in `data` to calculate distances from. The dimming column (`dim_col`) is dimmed based on all the columns. Each column is considered a dimension.

N.B. When the dimming dimension is included in `cols`, it is used in the distance calculation as well.

dimming_fn

Function for calculating the dimmed values.

Input: Two (2) input arguments:

  1. A numeric vector with the values in the dimming dimension.

  2. A numeric vector with corresponding distances to the origin.

Output: A numeric vector with the same length as the input vectors.

E.g.:

function(x, d){
  x * (1 / ((1 + d) ^ 2))
}

This kind of dimming function can be created with create_dimming_fn(), which for instance makes it easy to change the exponent (the 2 above).
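As a rough illustration of how such a function factory could work, here is a base R sketch. `make_dimming_fn` is a hypothetical stand-in, not the package's implementation; it borrows the parameter names shown in the Usage section and assumes they play the roles their names suggest.

```r
# Hypothetical stand-in for create_dimming_fn(); the formula is an assumption
# based on the default dimming function described in this page
make_dimming_fn <- function(numerator = 1, exponent = 2, add_to_distance = 1) {
  function(x, d) {
    x * (numerator / ((add_to_distance + d)^exponent))
  }
}

inverse_square <- make_dimming_fn()            # same formula as the default
inverse_cube <- make_dimming_fn(exponent = 3)  # dims faster with distance

values <- c(10, 10, 10)
distances <- c(0, 1, 3)

squared <- inverse_square(values, distances)
cubed <- inverse_cube(values, distances)

print(squared)
print(cubed)
```

Any function with this two-argument shape (values, then distances) that returns a vector of the same length should fit the `dimming_fn` contract described above.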

origin

Coordinates of the origin to dim around. A scalar to use in all dimensions or a vector with one scalar per dimension.

N.B. Ignored when `origin_fn` is not NULL.

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as a vector in the order of `cols`.

Output: A vector with one scalar per dimension.

Can be created with create_origin_fn() if you want to apply the same function to each dimension.

E.g. `create_origin_fn(median)` would find the median of each column.

Built-in functions are centroid(), most_centered(), and midrange().
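A custom origin function only has to follow the input/output contract described above. The base R sketch below shows one possible shape; `median_origin` is a hypothetical name used for illustration.

```r
# Hypothetical origin function: receives one vector per column
# (in the order of `cols`) and returns one scalar per dimension
median_origin <- function(...) {
  vapply(list(...), stats::median, numeric(1))
}

# Called by hand on two toy columns
x <- c(1, 2, 9)
y <- c(4, 6, 8)

origin <- median_origin(x, y)
print(origin)
```

A function like this could presumably be supplied as `origin_fn = median_origin`; according to the text above, `create_origin_fn(median)` is the shorter route to the same behavior.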

dim_col

Name of column to dim. Default is the last column in `cols`.

When the `dim_col` is not present in `cols`, it is not used in the distance calculation.

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e. "") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

origin_col_name

Name of new column with the origin coordinates. If NULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Author

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

Details

  • Calculates distances to origin with: $$d(P_1, P_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 + \dots}$$

  • Applies the `dimming_fn` to the `dim_col` based on the distances.
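The two steps above can be sketched in base R on a toy data frame. This is a simplified illustration under stated assumptions (a fixed origin, no grouping, no origin column), not the package's actual implementation; the variable names are chosen for this example.

```r
# Toy data: two distance dimensions (x, y) and a column to dim (o)
df <- data.frame(x = c(0, 3, 6), y = c(0, 4, 8), o = c(1, 1, 1))
cols <- c("x", "y")
origin <- c(0, 0)

# Step 1: Euclidean distance from each row to the origin
diffs <- sweep(as.matrix(df[, cols]), 2, origin)  # subtract origin per column
distances <- unname(sqrt(rowSums(diffs^2)))

# Step 2: apply the dimming function to the dimming column
dimming_fn <- function(x, d) x * (1 / (1 + d)^2)
df$o_dimmed <- dimming_fn(df$o, distances)

print(df)
```

Here the dimming column `o` is not part of `cols`, so it does not contribute to the distances, mirroring the behavior described for `dim_col`.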

See Also

Other mutate functions: apply_transformation_matrix(), cluster_groups(), expand_distances(), expand_distances_each(), flip_values(), roll_values(), rotate_2d(), rotate_3d(), shear_2d(), shear_3d(), swirl_2d(), swirl_3d()

Other distance functions: closest_to(), distance(), expand_distances(), expand_distances_each(), furthest_from(), swirl_2d(), swirl_3d()

Examples

# Attach packages
library(rearrr)
library(dplyr)
library(purrr)
has_ggplot <- require(ggplot2)  # Attach if installed

# Set seed
set.seed(7)

# Create a data frame with clusters
df <- generate_clusters(
  num_rows = 70,
  num_cols = 3,
  num_clusters = 5,
  compactness = 1.6
) %>%
  dplyr::rename(x = D1, y = D2, z = D3) %>%
  dplyr::mutate(o = 1)

# Dim the values in the z column
dim_values(
  data = df,
  cols = c("x", "y", "z"),
  origin = c(0.5, 0.5, 0.5)
)

# Dim the values in the `o` column (all 1s)
# around the centroid
dim_values(
  data = df,
  cols = c("x", "y"),
  dim_col = "o",
  origin_fn = centroid
)

# Specify a custom dimming_fn
# and dim around the centroid
dim_values(
  data = df,
  cols = c("x", "y"),
  dim_col = "o",
  origin_fn = centroid,
  dimming_fn = function(x, d) {
    x * 1 / (2^(1 + d))
  }
)

#
# Dim cluster-wise
#

# Group-wise dimming
df_dimmed <- df %>%
  dplyr::group_by(.cluster) %>%
  dim_values(
    cols = c("x", "y"),
    dim_col = "o",
    origin_fn = centroid
  )

# Plot the dimmed data such that the alpha (opacity) is
# controlled by the dimming
# (Note: This works because the `o` column is 1 for all values)
if (has_ggplot){
  ggplot(
    data = df_dimmed,
    aes(x = x, y = y, alpha = o_dimmed, color = .cluster)
  ) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "Cluster", alpha = "o_dimmed")
}
