dtwclust (version 2.1.2)

SBD: Shape-based distance

Description

Distance based on coefficient-normalized cross-correlation, as proposed by Paparrizos and Gravano (2015) for the k-Shape clustering algorithm.

Usage

SBD(x, y, znorm = FALSE)

Arguments

x, y
The two time series to compare.
znorm
Logical. Should each series be z-normalized before calculating the distance?

Value

A list with:
  • dist: The shape-based distance between x and y.
  • yshift: A shifted version of y so that it optimally matches x.

Details

This distance works best when the series are z-normalized. If they are not, they should at least have comparable amplitudes, because the values of the signals affect the outcome.

If x and y have different lengths, it is best to provide the longer series as y, because y is the one that is shifted to match x. Anything before the matching point is discarded, and the series is padded with trailing zeros as needed.
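
The shifting step can be illustrated with a minimal base-R sketch. It is not the dtwclust implementation: the helpers best_shift and shift_to_length are hypothetical names introduced here, and the handling of positive shifts (leading zeros) is an assumption about a reasonable alignment rule rather than documented behavior.

```r
# Hypothetical helper: lag of y (in samples) that maximizes the
# raw cross-correlation with x. Negative lags move y to the left.
best_shift <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  lags <- seq(-(ny - 1), nx - 1)
  cc <- sapply(lags, function(k) {
    t <- seq_len(ny)            # indices into y
    i <- t + k                  # matching indices into x
    ok <- i >= 1 & i <= nx      # keep only overlapping samples
    sum(x[i[ok]] * y[t[ok]])
  })
  lags[which.max(cc)]
}

# Hypothetical helper: apply the shift and force the result to length nx.
shift_to_length <- function(y, shift, nx) {
  if (shift >= 0) {
    out <- c(rep(0, shift), y)             # assumed: delay y with leading zeros
  } else {
    out <- y[(-shift + 1):length(y)]       # discard samples before the match
  }
  if (length(out) < nx) {
    out <- c(out, rep(0, nx - length(out)))  # pad with trailing zeros
  } else {
    out <- out[seq_len(nx)]                # drop anything beyond length(x)
  }
  out
}

x <- c(1, 2, 1)
y <- c(0, 0, 0, 1, 2, 1, 0, 0)             # longer series goes in y
y_aligned <- shift_to_length(y, best_shift(x, y), length(x))
```

In this toy case the bump in y starts three samples later than in x, so the first three samples of y are discarded and the result is cut to the length of x.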

The distance lies between 0 and 2, with 0 indicating perfect similarity.
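
To make the 0 to 2 range concrete, here is a minimal base-R sketch of the distance as defined by Paparrizos and Gravano: one minus the maximum cross-correlation over all lags, normalized by the product of the two Euclidean norms. The function names znormalize and sbd_sketch are hypothetical, and the sketch is not the dtwclust code; it only assumes that the package follows the same definition.

```r
# Hypothetical helper: z-normalize a series (mean 0, standard deviation 1).
znormalize <- function(x) (x - mean(x)) / sd(x)

# Sketch of the shape-based distance using base R's fft().
sbd_sketch <- function(x, y, znorm = FALSE) {
  if (znorm) {
    x <- znormalize(x)
    y <- znormalize(y)
  }
  len <- length(x) + length(y) - 1
  # Zero-pad both series so the circular correlation covers every lag
  fx <- fft(c(x, rep(0, len - length(x))))
  fy <- fft(c(y, rep(0, len - length(y))))
  # fft(inverse = TRUE) is unnormalized in R, hence the division by len
  cc <- Re(fft(fx * Conj(fy), inverse = TRUE)) / len
  # Coefficient normalization: divide by the product of the norms
  ncc <- cc / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
  1 - max(ncc)
}

a <- c(0, 0, 1, 2, 1, 0, 0)
b <- c(1, 2, 1, 0, 0, 0, 0)   # same shape as a, shifted by two samples
d_shifted  <- sbd_sketch(a, b)
d_opposite <- sbd_sketch(c(1, 2, 3), c(-1, -2, -3))
```

Because the normalized cross-correlation is bounded by -1 and 1, the result stays within 0 and 2. A pure shift of the same shape gives a distance of essentially 0, while series with opposing shapes move toward the upper end.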

References

Paparrizos J and Gravano L (2015). "k-Shape: Efficient and Accurate Clustering of Time Series." In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, series SIGMOD '15, pp. 1855-1870. ISBN 978-1-4503-2758-9, http://doi.org/10.1145/2723372.2737793.

See Also

NCCc, shape_extraction

Examples

# load data
data(uciCT)

# distance between series of different lengths
sbd <- SBD(CharTraj[[1]], CharTraj[[100]], znorm = TRUE)$dist

# cross-distance matrix for series subset (notice the two-list input)
sbD <- proxy::dist(CharTraj[1:10], CharTraj[1:10], method = "SBD", znorm = TRUE)
