This mainly internal function offers a unified framework to access the dist function from the proxy package and additional (semi-)metrics.
computeDistMat(x, y = NULL, method = "Euclidean", dmin = 0, dmax = 1,
dmin1 = 0, dmax1 = 1, dmin2 = 0, dmax2 = 1, t1 = 0, t2 = 1,
.poi = seq(0, 1, length.out = ncol(x)), custom.metric = function(x, y, lp
= 2, ...) { return(sum(abs(x - y)^lp)^(1/lp)) }, a = NULL, b = NULL,
c = NULL, lambda = 0, ...)
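The default value of custom.metric in the signature above is a Minkowski-type function. As a minimal sketch, it can be copied out and evaluated on two short vectors to see how lp selects the norm. The name lp_metric and the vectors u and v are hypothetical and used only for illustration.

```r
# Default custom.metric, copied from the usage signature.
# The name lp_metric is hypothetical.
lp_metric <- function(x, y, lp = 2, ...) {
  sum(abs(x - y)^lp)^(1 / lp)
}

# Two toy observations on a grid of three points.
u <- c(0, 3, 0)
v <- c(4, 0, 0)

d_euclid <- lp_metric(u, v)             # lp = 2: Euclidean distance
d_manhattan <- lp_metric(u, v, lp = 1)  # lp = 1: Manhattan distance
```

With lp = 2 this reproduces the Euclidean distance, and with lp = 1 the Manhattan distance.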
x
[matrix] matrix containing the functional observations as rows.
y
[matrix] see x. The default NULL uses y = x.
method
[character(1)] character string describing the distance function to be used. For a full list, execute metricChoices().
Euclidean
equals Lp with p = 2. This is the default.
Lp, Minkowski
the distance for an Lp-space; takes p as an additional argument, passed via the ... argument.
Manhattan
equals Lp with p = 1.
supremum, max, maximum
equals Lp with p = Inf. The supremal pointwise difference between the curves.
and ...
all other available measures for dist.
shortEuclidean
Euclidean distance on a limited part of the domain. Additional arguments dmin and dmax can be specified, giving the positions of the first and the last point to use within an evenly spaced sequence from 0 to 1 of length length(grid). The default values are dmin = 0 and dmax = 1, which results in the Euclidean distance on the entire domain.
mean
the absolute difference of the overall mean values of the observations.
relAreas
the difference of the ratio of two areas, taken on the parts of the domain from dmin1 to dmax1 and from dmin2 to dmax2. These arguments are defined analogously to dmin and dmax and take the same default values.
jump
the similarity of jump heights at points t1 and t2, i.e. x[t1 * length(x)] - x[t2 * length(x)] for every functional observation x. The points t1 and t2 are the positions in an evenly spaced sequence from 0 to 1 of length length(grid) for which to compare the jump height. The default values are t1 = 0 and t2 = 1.
globMax
the difference of the curves' global maxima.
globMin
the difference of the curves' global minima.
points
the mean absolute differences at certain observation points .poi, also called "points of impact". These are specified as a vector .poi of arbitrary length with values between 0 and 1, encoding the indices of the observation points. The default value is .poi = seq(0, 1, length.out = length(grid)), which results in the Manhattan distance.
custom.metric
your own semimetric will be used. Specify your own distance function in the argument custom.metric.
amplitudeDistance, phaseDistance
the amplitude distance or phase distance as described in Srivastava, A. and E. P. Klassen (2016). Functional and Shape Data Analysis. Springer.
FisherRao, elasticMetric
the elastic distance of the square root velocity of the curves as described in Srivastava and Klassen (2016). This equates to the Fisher-Rao metric.
elasticDistance
weighted mean of the amplitude and the phase distance using the implementation in elastic.distance. Additional arguments are the numeric penalization parameters a, b, c for the amplitude distance (a^2) and the phase distance (b^2). The default values are a = 1/2, b = 1. Alternatively, c denotes the ratio of 2*a and b. lambda is the additional penalization parameter for the warping allowed before calculating the elastic distance. The default is 0.
rucrdtw, rucred
Dynamic Time Warping Distance and Euclidean Distance from package rucrdtw. Implemented in Boersch-Supan (2016) and originally described in Rakthanmanon et al. (2012).
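The feature-based semimetrics mean, globMax and globMin listed above each compare a single summary value per curve. The following is a minimal base R sketch of that idea. It assumes the absolute difference is taken in each case and is not the package's confirmed implementation. The curves x1 and x2 are made-up toy data.

```r
# Two toy curves observed on the same grid of five points (made-up data).
x1 <- c(0, 1, 4, 1, 0)
x2 <- c(1, 1, 2, 3, 1)

# "mean": compare the overall mean values of the two observations.
# Assumption: the absolute difference is used.
d_mean <- abs(mean(x1) - mean(x2))

# "globMax" / "globMin": compare the global maxima / minima of the curves.
# Assumption: the absolute difference is used.
d_globmax <- abs(max(x1) - max(x2))
d_globmin <- abs(min(x1) - min(x2))
```

Because each curve is reduced to one number, curves with very different shapes can still have distance zero, which is why these are semimetrics rather than metrics.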
dmin, dmax, dmin1, dmax1, dmin2, dmax2
[numeric(1)] encode the indices used to define subspaces for method %in% c("shortEuclidean", "relAreas") as numeric values between 0 and 1, where 0 encodes grid[1] and 1 encodes grid[length(grid)].
t1, t2
[numeric(1)] encode the positions of the points for which to compare the jump heights in method = "jump" as numeric values between 0 and 1, see dmin.
.poi
[numeric(1 to ncol(x))] numeric vector of arbitrary length taking values between 0 and 1, denoting the positions of the points of interest for method = "points". The default value is .poi = seq(0, 1, length.out = length(grid)), which results in the Manhattan distance.
custom.metric
[function(x, y, ...)] a function specifying how to compute the distance between two functional observations (= numeric vectors of the same length) x and y. It can handle additional arguments in the ... argument. The default is the Euclidean distance (equals Minkowski distance with lp = 2). Used for method = "custom.metric".
a, b, c
[numeric(1)] weights of the amplitude distance (a) and the phase distance (b) in a semimetric that combines them by addition. Used for method == 'elasticDistance'.
lambda
[numeric(1)] penalization parameter for the warping allowed before calculating the elastic distance. Default value is 0. Large values imply less (no) warping, small values imply more warping. Used for method %in% c('elastic', 'SRV').
...
additional parameters to the (semi-)metrics.
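The position arguments dmin, dmax, t1, t2 and .poi all encode grid indices as values between 0 and 1, where 0 stands for the first and 1 for the last grid point. The sketch below shows one way such positions could be turned into indices and used for shortEuclidean, jump and points. The helper pos_to_index and its rounding rule are assumptions made for illustration and are not taken from the package. The curves are made-up toy data.

```r
# Hypothetical helper: map a relative position in [0, 1] to a grid index,
# so that 0 selects the first and 1 the last observation point.
# The rounding rule is an assumption, not the package's implementation.
pos_to_index <- function(pos, n) {
  round(pos * (n - 1)) + 1
}

# Two toy curves on a grid of five points (made-up data).
x1 <- c(0, 1, 4, 1, 0)
x2 <- c(1, 1, 2, 3, 1)
n <- length(x1)

# "shortEuclidean": Euclidean distance restricted to [dmin, dmax].
dmin <- 0.25
dmax <- 0.75
idx <- pos_to_index(dmin, n):pos_to_index(dmax, n)
d_short <- sqrt(sum((x1[idx] - x2[idx])^2))

# "jump": compare the jump heights between positions t1 and t2.
t1 <- 0
t2 <- 0.5
jump_height <- function(x) x[pos_to_index(t1, n)] - x[pos_to_index(t2, n)]
d_jump <- abs(jump_height(x1) - jump_height(x2))

# "points": mean absolute difference at the points of impact.
poi <- c(0, 0.5, 1)
p <- pos_to_index(poi, n)
d_points <- mean(abs(x1[p] - x2[p]))
```

Encoding positions relative to the domain rather than as raw indices keeps the same argument values meaningful when curves are observed on grids of different length.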
a matrix of dimensions nrow(x) by nrow(y) containing the distances of the functional observations contained in x and y, if y is specified. Otherwise a matrix containing the distances of all functional observations within x to each other.
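The shape of the return value can be illustrated with a small base R sketch that applies a pairwise metric to every combination of rows. The helper dist_mat and the name lp_metric are hypothetical. This is a sketch of the documented behaviour under those assumptions, not the package's implementation.

```r
# Minkowski-type metric, as in the default of custom.metric.
# The name lp_metric is hypothetical.
lp_metric <- function(x, y, lp = 2, ...) {
  sum(abs(x - y)^lp)^(1 / lp)
}

# Hypothetical helper: distance matrix between the rows of x and y.
# If y is NULL, distances are computed within x.
dist_mat <- function(x, y = NULL, metric = lp_metric, ...) {
  if (is.null(y)) y <- x
  out <- matrix(NA_real_, nrow = nrow(x), ncol = nrow(y))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      out[i, j] <- metric(x[i, ], y[j, ], ...)
    }
  }
  out
}

# Functional observations as rows (made-up data).
x <- rbind(c(0, 0, 0), c(3, 4, 0))
y <- rbind(c(0, 0, 0), c(0, 0, 1), c(3, 4, 0))

d_xy <- dist_mat(x, y)  # nrow(x) by nrow(y), here 2 by 3
d_xx <- dist_mat(x)     # nrow(x) by nrow(x), here 2 by 2
```

Entry [i, j] holds the distance between row i of x and row j of y, so the result is only square and symmetric when y is omitted.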
Boersch-Supan (2016). rucrdtw: Fast time series subsequence search in R. The Journal of Open Source Software. URL http://doi.org/10.21105/joss.00100
Fuchs, K., J. Gertheiss, and G. Tutz (2015). Nearest neighbor ensembles for functional data with interpretable feature selection. Chemometrics and Intelligent Laboratory Systems 146, 186-197.
Rakthanmanon, T., et al. (2012). Searching and mining trillions of time series subsequences under dynamic time warping. Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.
Srivastava, A. and E. P. Klassen (2016). Functional and Shape Data Analysis. Springer.