collapse (version 2.1.1)

fdist: Fast and Flexible Distance Computations

Description

A fast and flexible replacement for dist that computes Euclidean distances.

Usage

fdist(x, v = NULL, ..., method = "euclidean", nthreads = .op[["nthreads"]])

Value

If v = NULL, a full lower-triangular distance matrix between the rows of x is computed and returned as a 'dist' object (all methods apply, see dist). Otherwise, a numeric vector containing the distance from each row of x to v is returned. See Examples.
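
The two return shapes described above can be illustrated with a minimal base-R sketch. This is only a reference computation built on dist, sweep and rowSums, not how fdist is implemented; the variable names d_full and d_vec are hypothetical.

```r
# Base-R reference for the two return shapes (not collapse's implementation).
m <- as.matrix(mtcars)
n <- nrow(m)

# Case v = NULL: pairwise distances between the rows of m, stored as a
# 'dist' object that holds only the lower triangle, n * (n - 1) / 2 values.
d_full <- dist(m)

# Case v supplied: one distance per row of m.
v <- colMeans(m)                          # length(v) == ncol(m)
d_vec <- sqrt(rowSums(sweep(m, 2, v)^2))  # subtract v from every row, then norm

class(d_full)
length(d_full)
length(d_vec)
```

According to the description above, fdist(m) and fdist(m, v) should correspond to d_full and d_vec respectively.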

Arguments

x

a numeric vector or matrix. Data frames/lists can be passed but will be converted to a matrix using qM. Inputs that are not of type double will be coerced.

v

an (optional) numeric (double) vector such that length(v) == NCOL(x), to compute distances with (the rows of) x. Other vector types will be coerced.

...

not used. A placeholder for possible future arguments.

method

an integer or character string indicating the method of computing distances.

Int.  String               Description
1     "euclidean"          Euclidean distance
2     "euclidean_squared"  squared Euclidean distance (more efficient)
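
How the two methods relate can be shown with a small base-R sketch (a reference computation, not fdist itself; the names d_sq and d are hypothetical). The squared variant is presumably cheaper because it skips the square root, and since the square root is monotone, both give the same ordering of neighbours.

```r
# Three points in the plane and a reference point at the origin.
x <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)
v <- c(0, 0)

d_sq <- rowSums(sweep(x, 2, v)^2)  # squared Euclidean distance: 0, 25, 100
d    <- sqrt(d_sq)                 # Euclidean distance: 0, 5, 10

# The ranking of rows by distance is the same for both methods.
identical(order(d_sq), order(d))
```

When only the ranking of distances matters, for example in a nearest-neighbour search, the squared method is therefore sufficient.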

nthreads

integer. The number of threads to use. If v = NULL (full distance matrix), multithreading is along the distance matrix columns (thread loads decrease from column to column because the matrix is lower triangular). If v is supplied, multithreading is at the sub-column level (across elements).

See Also

flm, Fast Statistical Functions, Collapse Overview

Examples

library(collapse)  # provides fdist, fmean and unattrib

# Distance matrix
m = as.matrix(mtcars)
str(fdist(m)) # Same as dist(m)

# Distance with vector
d = fdist(m, fmean(m))
kit::topn(d, 5, decreasing = FALSE)  # Index of 5 nearest neighbours (topn returns the largest values by default)

# Mahalanobis distance
m_mahal = t(forwardsolve(t(chol(cov(m))), t(m)))
fdist(m_mahal, fmean(m_mahal))
sqrt(unattrib(mahalanobis(m, fmean(m), cov(m))))
# Distance of two vectors
x <- rnorm(1e6)
y <- rnorm(1e6)
microbenchmark::microbenchmark(
  fdist(x, y),
  fdist(x, y, nthreads = 2),
  sqrt(sum((x-y)^2))
)