README

Francisco Bischoff

  • 18 Aug 2022

Time Series with Matrix Profile

Notice

This version is maintained to keep up with CRAN standards. A new version with fewer dependencies (and possible breaking changes) is planned for release in late 2022 or early 2023.

Overview

R functions implementing the UCR Matrix Profile algorithm (http://www.cs.ucr.edu/~eamonn/MatrixProfile.html).

This package allows you to use the Matrix Profile concept as a toolkit.
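
As a rough illustration of the concept (not the package's implementation): a matrix profile stores, for every subsequence of a given window size, the z-normalized Euclidean distance to its nearest non-trivial neighbor, together with that neighbor's index. The brute-force base R sketch below assumes population standard deviation for z-normalization and an exclusion zone of half the window size. `znorm` and `naive_matrix_profile` are hypothetical helper names used only here.

```r
# Z-normalize a window using the population standard deviation (assumption).
znorm <- function(x) {
  s <- sqrt(mean((x - mean(x))^2))
  if (s == 0) return(rep(0, length(x)))
  (x - mean(x)) / s
}

# Brute-force matrix profile, O(n^2 * m). For illustration only.
naive_matrix_profile <- function(data, window_size,
                                 exclusion = ceiling(window_size / 2)) {
  n_sub <- length(data) - window_size + 1
  stopifnot(window_size >= 2, n_sub >= 2)
  # One z-normalized subsequence per column
  subs <- sapply(seq_len(n_sub),
                 function(i) znorm(data[i:(i + window_size - 1)]))
  mp <- rep(Inf, n_sub)
  idx <- rep(NA_integer_, n_sub)
  for (i in seq_len(n_sub)) {
    # Distance from subsequence i to every subsequence
    d <- sqrt(colSums((subs - subs[, i])^2))
    # Ignore trivial matches close to i
    d[abs(seq_len(n_sub) - i) <= exclusion] <- Inf
    idx[i] <- which.min(d)
    mp[i] <- d[idx[i]]
  }
  list(mp = mp, idx = idx)
}

# Random walk with the same pattern planted at positions 51 and 201
set.seed(2018)
x <- cumsum(sample(c(-1, 1), 300, TRUE))
pattern <- 5 * sin(seq(0, 2 * pi, length.out = 30))
x[51:80] <- pattern
x[201:230] <- pattern

res <- naive_matrix_profile(x, window_size = 30)
which.min(res$mp)  # start of the best motif occurrence
res$idx[51]        # its nearest neighbor
```

The lowest value in `res$mp` should fall at the planted pattern, with the profile index pointing at the second copy. The algorithms this package provides compute the same kind of result far more efficiently.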

This package provides:

  • Algorithms to build a Matrix Profile: STAMP, STOMP, SCRIMP++, SIMPLE, MSTOMP and VALMOD.
  • Algorithms for MOTIF search for Unidimensional and Multidimensional Matrix Profiles.
  • Algorithm for Chains search for Unidimensional Matrix Profile.
  • Algorithms for Semantic Segmentation (FLUSS) and Weakly Labeled data (SDTS).
  • Algorithm for Salient Subsections detection allowing MDS plotting.
  • Basic plotting for all outputs generated here.
  • Sequential workflow (see below).

# Basic workflow.
# %>% is exported by tsmp; the tee operator %T>% comes from magrittr.
library(tsmp)
library(magrittr)

mp <- tsmp(data, window_size = 30) %>%
  find_motif(n_motifs = 3) %T>%
  plot()

# SDTS still has its own way of working:
model <- sdts_train(data, labels, windows)
result <- sdts_predict(model, data, round(mean(windows)))

Please refer to the User Manual for more details.

Suggestions for improvement are welcome.

Performance on an Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz using a random walk dataset

set.seed(2018)
data <- cumsum(sample(c(-1, 1), 40000, TRUE))

Current version benchmark

WIP in this version

Installation

# Install the released version from CRAN
install.packages("tsmp")

# Or the development version from GitHub:
# install.packages("devtools")
devtools::install_github("matrix-profile-foundation/tsmp")

Currently available Features

  • STAMP (single and multi-thread versions)
  • STOMP (single and multi-thread versions)
  • STOMPi (On-line version)
  • SCRIMP (single-thread, not for AB-joins yet)
  • Time Series Chains
  • Multivariate STOMP (mSTOMP)
  • Multivariate MOTIF Search (from mSTOMP)
  • Salient Subsequences search for Multidimensional Space
  • Scalable Dictionary learning for Time Series (SDTS) prediction
  • FLUSS (Fast Low-cost Unipotent Semantic Segmentation)
  • FLOSS (Fast Low-cost On-line Unipotent Semantic Segmentation)
  • SiMPle-Fast (Fast Similarity Matrix Profile for Music Analysis and Exploration)
  • Annotation vectors (e.g., Stop-word MOTIF bias, Actionability bias)
  • FLUSS Arc Plot and SiMPle Arc Plot
  • Exact Detection of Variable Length Motifs (VALMOD)
  • MPdist: Matrix Profile Distance
  • Time Series Snippets
  • Subsetting Matrix Profiles (head(), tail(), [, etc.)
  • Misc:
    • MASS v2.0
    • MASS v3.0
    • MASS extensions: ADP (Approximate Distance Profile, with PAA)
    • MASS extensions: WQ (Weighted Query)
    • MASS extensions: QwG (Query with Gap)
    • Fast moving average
    • Fast moving SD
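
The moving average and moving SD listed under Misc are typical building blocks for z-normalized distance computations. One common way to get both in linear time is with cumulative sums. The base R sketch below shows that technique, assuming a population standard deviation (dividing by the window size). `moving_mean_sd` is a hypothetical name, and this is not necessarily how `fast_movavg` or `fast_movsd` are implemented.

```r
# Moving mean and population SD over all windows, via prefix sums.
moving_mean_sd <- function(data, window_size) {
  n <- length(data)
  stopifnot(window_size >= 1, window_size <= n)
  # Prefix sums with a leading zero, so each window sum is a difference
  s1 <- c(0, cumsum(data))
  s2 <- c(0, cumsum(data^2))
  end <- (window_size:n) + 1
  start <- end - window_size
  avg <- (s1[end] - s1[start]) / window_size
  variance <- (s2[end] - s2[start]) / window_size - avg^2
  # Clamp tiny negative values caused by floating-point rounding
  list(avg = avg, sd = sqrt(pmax(variance, 0)))
}

set.seed(2018)
walk <- cumsum(sample(c(-1, 1), 1000, TRUE))
roll <- moving_mean_sd(walk, window_size = 30)

# Direct (slow) computation for comparison
direct_avg <- sapply(1:971, function(i) mean(walk[i:(i + 29)]))
direct_sd <- sapply(1:971, function(i) {
  w <- walk[i:(i + 29)]
  sqrt(mean((w - mean(w))^2))
})
isTRUE(all.equal(roll$avg, direct_avg))
isTRUE(all.equal(roll$sd, direct_sd, tolerance = 1e-6))
```

The sum-of-squares formula can lose precision when values are large relative to their spread, which is why the variance is clamped at zero before taking the square root.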

Roadmap

  • Profile-Based Shapelet Discovery
  • GPU-STOMP

Other projects with Matrix Profile

Matrix Profile Foundation

Our next step: unifying Matrix Profile implementations across several programming languages.

Visit: Matrix Profile Foundation

Package dependencies

Code of Conduct

Please note that the ‘tsmp’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Version: 0.4.15

License: Apache License (>= 2.0)

Maintainer: Francisco Bischoff

Last Published: August 20th, 2022

Monthly Downloads: 360

Functions in tsmp (0.4.15)

  • floss: Fast Low-cost Online Semantic Segmentation (FLOSS)
  • fast_movsd: Fast implementation of moving standard deviation
  • find_snippet: Time Series Snippets: A New Primitive for Time Series Data Mining
  • find_chains: Find Time Series Chains
  • floss_cac: FLOSS - Corrected Arc Counts
  • fast_avg_sd: Fast implementation of moving average and moving standard deviation
  • find_motif: Search for Motifs
  • dist_profile: Calculates the distance profile using MASS algorithms
  • find_discord: Search for Discord
  • fast_movavg: Fast implementation of moving average
  • fluss_score: FLUSS - Prediction score calculation
  • mass_pre_w: Precomputes several values used on MASS
  • mass-deprecated: Calculates the distance profile using MASS_V2 algorithm
  • mass_pre: Precomputes several values used on MASS
  • fluss_extract: FLUSS - Extract Segments
  • fluss_cac: FLUSS - Corrected Arc Counts
  • get_data: Get the data included in a TSMP object, if any
  • floss_extract: FLOSS - Extract Segments
  • fluss: Fast Low-cost Unipotent Semantic Segmentation (FLUSS)
  • mass_v2: Calculates the distance profile using MASS_V2 algorithm
  • motifs: Search for Motifs
  • mp_meat_data: Original data used in the Salient Subsequences demo
  • min_mp_idx: Get index of the minimum value from a matrix profile and its nearest neighbor
  • mass_v3: Calculates the distance profile using MASS_V3 algorithm
  • motifs_discords_small: Just a synthetic dataset for testing
  • mp_gait_data: Original data used in the Time Series Chain demo
  • mp_fluss_data: Original data used in the FLUSS paper
  • mass_weighted: Calculates the distance profile using MASS_WEIGHTED algorithm
  • mp_toy_data: Original data used in the mSTAMP demo
  • mp_test_data: Original data used in the SDTS demo
  • plot: Plot a TSMP object
  • mstomp_par: Multivariate STOMP algorithm Parallel version
  • remove_class: Remove a TSMP class from an object
  • read: Read TSMP object from JSON file
  • pmp: Pan-Matrix Profile
  • mpdist: MPdist - Distance between Time Series using Matrix Profile
  • mpx: Fast implementation of MP and MPI for internal purposes, without FFT
  • pmp_upper_bound: Pan Matrix Profile upper bound
  • %>%: Pipe operator
  • plot_arcs: Plot arcs between indexes of a Profile Index
  • salient_mds: Convert salient sequences into MDS space
  • salient_subsequences: Framework for retrieving salient subsequences from a dataset
  • salient_score: Computes the F-Score of salient algorithm
  • scrimp: Anytime univariate SCRIMP++ algorithm
  • set_data: Set or change the data included in a TSMP object
  • simple_fast: Compute the join similarity for Sound data
  • sdts_predict: Framework for Scalable Dictionary learning for Time Series (SDTS) prediction function
  • stamp_par: Anytime univariate STAMP algorithm Parallel version
  • sdts_train: Framework for Scalable Dictionary learning for Time Series (SDTS) training function
  • sdts_score: Computes the F-Score of an SDTS prediction
  • stomp_par: Univariate STOMP algorithm
  • write: Write a TSMP object to JSON file
  • mass: Deprecated functions in package tsmp
  • visualize: Plots an object generated from one of the algorithms. In some cases multiple plots will be generated
  • tsmp: Computes the Matrix Profile and Profile Index
  • valmod: Variable Length Motif Discovery
  • stompi_update: Real-time STOMP algorithm
  • av_complexity: Computes the annotation vector that favors complexity
  • av_hardlimit_artifact: Computes the annotation vector that suppresses hard-limited artifacts
  • analyze: Runs an appropriate workflow based on the parameters passed in
  • discords: Search for Discord
  • compute: Computes the Matrix Profile or Pan-Matrix Profile
  • av_motion_artifact: Computes the annotation vector that suppresses motion artifacts
  • av_apply: Corrects the matrix profile using an annotation vector
  • as.matrixprofile: Convert a TSMP object into another if possible
  • av_stop_word: Computes the annotation vector that suppresses stop-word motifs
  • av_zerocrossing: Computes the annotation vector that favors the number of zero crossings