parallelComputeDistMat: Parallelize computing a distance matrix for functional observations

Description

Uses parallelMap to parallelize the computation of the distance matrix. This is done by dividing the data into batches and computing the distance matrix for each batch. For details on distance computation see computeDistMat.

Usage

parallelComputeDistMat(x, y = NULL, method = "Euclidean", batches = 1L,
  ...)

Arguments

x

[matrix] matrix containing the functional observations as rows.

y

[matrix] see x. The default NULL uses y = x.

method

[character(1)] character string describing the distance function to be used. For a full list execute metricChoices().

Euclidean: equals Lp with p = 2. This is the default.
Lp, Minkowski: the distance for an Lp-space, takes p as an additional argument in ....
Manhattan: equals Lp with p = 1.
supremum, max, maximum: equals Lp with p = Inf. The supremal pointwise difference between the curves.
and ...: all other available measures for dist.
shortEuclidean: Euclidean distance on a limited part of the domain. Additional arguments dmin and dmax can be specified in ..., giving the position of the first and the last point to use of an evenly spaced sequence from 0 to 1 of length length(grid). The default values are dmin = 0 and dmax = 1, which results in the Euclidean distance on the entire domain.
mean: the absolute similarity of the overall mean values of the observations.
relAreas: the difference of the relation of two areas on parts of the domain given by dmin1 to dmax1 and dmin2 to dmax2. They are defined analogously to dmin and dmax and take the same default values.
jump: the similarity of jump heights at points t1 and t2, i.e. x[t1 * length(x)] - x[t2 * length(x)] for every functional observation x. The points t1 and t2 are the positions in an evenly spaced sequence from 0 to 1 of length length(grid) for which to compare the jump height. The default values are t1 = 0 and t2 = 1.
globMax: the difference of the curves' global maxima.
globMin: the difference of the curves' global minima.
points: the mean absolute differences at certain observation points .poi, also called "points of impact". These are specified as a vector .poi of arbitrary length with values between 0 and 1, encoding the index of the points of observation. The default value is .poi = seq(0, 1, length.out = length(grid)), which results in the Manhattan distance.
custom.metric: your own semimetric will be used. Specify your own distance function in the argument custom.metric.
amplitudeDistance, phaseDistance: the amplitude distance or phase distance as described in Srivastava, A. and E. P. Klassen (2016). Functional and Shape Data Analysis. Springer.
FisherRao, elasticMetric: the elastic distance of the square root velocity of the curves as described in Srivastava and Klassen (2016). This equates to the Fisher Rao metric.
elasticDistance: weighted mean of the amplitude and the phase distance using the implementation in elastic.distance. Additional arguments are the numeric penalization parameters a, b and c for the amplitude distance (a^2) and the phase distance (b^2). The default values are a = 1/2, b = 1. Alternatively, c denotes the ratio of 2*a and b. lambda is an additional penalization parameter for the warping allowed before calculating the elastic distance. The default is 1.
rucrdtw, rucred: Dynamic Time Warping Distance and Euclidean Distance from package rucrdtw. Implemented in Boersch-Supan (2016) and originally described in Rakthanmanon et al. (2012).
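
The relations within the Lp family listed above can be checked with base R alone. The following is a minimal sketch, not the package's code: it assumes stats::dist as a stand-in for the distance backend, and the matrix `curves` is made-up toy data. It shows that the Minkowski distance with p = 2 and p = 1 coincides with the Euclidean and Manhattan distances, and that the supremum distance is the largest pointwise difference between two curves.

```r
# Toy data (hypothetical): three curves observed on a common grid of five points,
# one functional observation per row.
curves <- rbind(
  c(0, 1, 2, 3, 4),
  c(0, 2, 4, 6, 8),
  c(1, 1, 1, 1, 1)
)

# Lp-type distances via base R; stats::dist is only a stand-in here.
d_euclid    <- as.matrix(dist(curves, method = "euclidean"))
d_manhattan <- as.matrix(dist(curves, method = "manhattan"))
d_supremum  <- as.matrix(dist(curves, method = "maximum"))
d_l2        <- as.matrix(dist(curves, method = "minkowski", p = 2))
d_l1        <- as.matrix(dist(curves, method = "minkowski", p = 1))

# Supremum distance between curves 1 and 2, computed by hand for comparison.
sup_12 <- max(abs(curves[1, ] - curves[2, ]))
```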

batches

[integer(1)] Number of roughly equal-sized batches to split data into. The distance computation is then carried out for each batch.
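
As a rough illustration of the batching idea, the sketch below splits the rows of x into roughly equal-sized batches, computes the Euclidean distances from each batch to all rows of y, and stacks the partial results. It uses base R only and is not the package's implementation: the helper names euclid_cross and batched_dist are hypothetical, and lapply stands in for the parallel map call.

```r
# Hypothetical helper: Euclidean distances between all rows of a and all rows of b.
euclid_cross <- function(a, b) {
  # ||a_i - b_j||^2 = ||a_i||^2 + ||b_j||^2 - 2 * <a_i, b_j>
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  sqrt(pmax(sq, 0))  # clamp tiny negative values caused by rounding
}

# Hypothetical sketch of a batched distance computation.
batched_dist <- function(x, y = NULL, batches = 1L) {
  if (is.null(y)) y <- x
  n <- nrow(x)
  # Assign contiguous row indices of x to roughly equal-sized batches.
  batch_id <- sort(rep_len(seq_len(batches), n))
  idx <- split(seq_len(n), batch_id)
  # lapply is a sequential stand-in for a parallel map over the batches.
  parts <- lapply(idx, function(i) euclid_cross(x[i, , drop = FALSE], y))
  # Stack the per-batch blocks; row order matches the rows of x.
  do.call(rbind, parts)
}
```

Because each batch only needs its own rows of x plus all of y, the blocks are independent of one another, which is what makes the computation straightforward to distribute. The result has nrow(x) rows and nrow(y) columns regardless of the number of batches.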

...

additional parameters to the (semi-)metrics.

Value

a matrix of dimensions nrow(x) by nrow(y) containing the distances of the functional observations contained in x and y, if y is specified. Otherwise a matrix containing the distances of all functional observations within x to each other.