⚠️ There's a newer version (2.9.0) of this package.

timetk

Mission

To make it easy to visualize, wrangle, and feature engineer time series data for forecasting and machine learning prediction.

Installation

Download the development version with latest features:

remotes::install_github("business-science/timetk")

Or, download the CRAN-approved version:

install.packages("timetk")

Getting Started

Package Functionality

There are many R packages for working with time series data. Here’s how timetk compares to the “tidy” time series R packages (those that leverage data frames or tibbles) for data visualization, wrangling, and feature engineering.

| Task | timetk | tsibble | feasts | tibbletime |
|---|---|---|---|---|
| Structure | | | | |
| Data Structure | tibble (tbl) | tsibble (tbl_ts) | tsibble (tbl_ts) | tibbletime (tbl_time) |
| Visualization | | | | |
| Interactive Plots (plotly) | ✅ | ❌ | ❌ | ❌ |
| Static Plots (ggplot) | ✅ | ❌ | ✅ | ❌ |
| Time Series | ✅ | ❌ | ✅ | ❌ |
| Correlation, Seasonality | ✅ | ❌ | ✅ | ❌ |
| Anomaly Detection | ✅ | ❌ | ❌ | ❌ |
| Data Wrangling | | | | |
| Time-Based Summarization | ✅ | ❌ | ❌ | ✅ |
| Time-Based Filtering | ✅ | ❌ | ❌ | ✅ |
| Padding Gaps | ✅ | ✅ | ❌ | ❌ |
| Low to High Frequency | ✅ | ❌ | ❌ | ❌ |
| Imputation | ✅ | ✅ | ❌ | ❌ |
| Sliding / Rolling | ✅ | ✅ | ❌ | ✅ |
| Feature Engineering (recipes) | | | | |
| Date Feature Engineering | ✅ | ❌ | ❌ | ❌ |
| Holiday Feature Engineering | ✅ | ❌ | ❌ | ❌ |
| Fourier Series | ✅ | ❌ | ❌ | ❌ |
| Smoothing & Rolling | ✅ | ❌ | ❌ | ❌ |
| Padding | ✅ | ❌ | ❌ | ❌ |
| Imputation | ✅ | ❌ | ❌ | ❌ |
| Cross Validation (rsample) | | | | |
| Time Series Cross Validation | ✅ | ❌ | ❌ | ❌ |
| Time Series CV Plan Visualization | ✅ | ❌ | ❌ | ❌ |
| More Awesomeness | | | | |
| Making Time Series (Intelligently) | ✅ | ✅ | ❌ | ✅ |
| Handling Holidays & Weekends | ✅ | ❌ | ❌ | ❌ |
| Class Conversion | ✅ | ✅ | ❌ | ❌ |
| Automatic Frequency & Trend | ✅ | ❌ | ❌ | ❌ |
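
To make two of the Data Wrangling rows concrete, here is a minimal sketch of time-based summarization and time-based filtering. It assumes the m4_daily dataset bundled with timetk (columns id, date, value); the monthly period and the 2015 window are arbitrary choices for illustration, not recommendations.

```r
library(dplyr)
library(timetk)

# Time-based summarization: collapse each daily series to monthly means.
# Grouping by id keeps the four M4 series separate.
monthly <- m4_daily %>%
    group_by(id) %>%
    summarise_by_time(date, .by = "month", value = mean(value))

# Time-based filtering: keep only observations that fall in 2015.
# The phrase "2015" is expanded to the start / end of that year.
only_2015 <- m4_daily %>%
    filter_by_time(date, .start_date = "2015", .end_date = "2015")
```

Both calls return ordinary tibbles, so the results can be piped straight into further dplyr verbs or into plot_time_series().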

What can you do in 1 line of code?

Investigate a time series…

library(dplyr)      # %>% and group_by()
library(lubridate)  # week()
library(timetk)
taylor_30_min %>%
    plot_time_series(date, value, .color_var = week(date),
                     .interactive = FALSE, .color_lab = "Week")

Visualize anomalies…

walmart_sales_weekly %>%
    group_by(Store, Dept) %>%
    plot_anomaly_diagnostics(Date, Weekly_Sales, 
                             .facet_ncol = 3, .interactive = FALSE)

Make a seasonality plot…

taylor_30_min %>%
    plot_seasonal_diagnostics(date, value, .interactive = FALSE)

Inspect autocorrelation, partial autocorrelation (and cross correlations too)…

taylor_30_min %>%
    plot_acf_diagnostics(date, value, .lags = "1 week", .interactive = FALSE)
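
In the same spirit, a time series cross-validation plan can be built and visualized in a few lines. The sketch below assumes the m4_monthly dataset bundled with timetk and picks the series "M750"; the window sizes (6-year training window, 24-month assessment window, 3 slices) are illustrative values only.

```r
library(dplyr)
library(timetk)

# One monthly series from the bundled M4 sample
m750 <- m4_monthly %>%
    filter(id == "M750")

# Rolling-origin resamples: fixed 6-year training window,
# 24-month assessment window, limited to 3 slices
resample_spec <- time_series_cv(
    data        = m750,
    date_var    = date,
    initial     = "6 years",
    assess      = "24 months",
    skip        = "24 months",
    cumulative  = FALSE,
    slice_limit = 3
)

# Unpack the resamples into a tidy plan and plot train/test regions
cv_plot <- resample_spec %>%
    tk_time_series_cv_plan() %>%
    plot_time_series_cv_plan(date, value, .interactive = FALSE)
```

With .interactive = FALSE the result is a ggplot object; printing cv_plot draws one panel per slice.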

Acknowledgements

The timetk package wouldn’t be possible without other amazing time series packages.

  • stats - Basically every timetk function that uses a period (frequency) argument owes it to ts().
    • plot_acf_diagnostics(): Leverages stats::acf(), stats::pacf() & stats::ccf()
    • plot_stl_diagnostics(): Leverages stats::stl()
  • lubridate: timetk makes heavy use of floor_date(), ceiling_date(), and duration() for “time-based phrases”.
    • Add and Subtract Time (%+time% & %-time%): "2012-01-01" %+time% "1 month 4 days" uses lubridate to intelligently offset the day
  • xts: Used to calculate periodicity and fast lag automation.
  • forecast (retired): Possibly my favorite R package of all time. It’s based on ts, and its successor is the tidyverts (fable, tsibble, feasts, and fabletools).
    • The ts_impute_vec() function for low-level vectorized imputation using STL + Linear Interpolation uses na.interp() under the hood.
    • The ts_clean_vec() function for low-level vectorized outlier cleaning and imputation uses tsclean() under the hood.
    • The Box Cox transformation helper auto_lambda() uses BoxCox.lambda().
  • tibbletime (retired): While timetk does not import tibbletime, it uses much of the innovative functionality to interpret time-based phrases:
    • tk_make_timeseries() - Extends seq.Date() and seq.POSIXt() using a simple phrase like “2012-02” to populate the entire time series from start to finish in February 2012.
    • filter_by_time(), between_time() - Uses innovative endpoint detection from phrases like “2012”
    • slidify() is basically rollify() using slider (see below).
  • slider: A powerful R package that provides a purrr-syntax for complex rolling (sliding) calculations.
    • slidify() uses slider::pslide() under the hood.
    • slidify_vec() uses slider::slide_vec() for simple vectorized rolls (slides).
  • padr: Used for padding time series from low frequency to high frequency and filling in gaps.
    • The pad_by_time() function is a wrapper for padr::pad().
    • See the step_ts_pad() to apply padding as a preprocessing recipe!
  • TSstudio: This is the best interactive time series visualization tool out there. It leverages the ts system, which is the same system the forecast R package uses. A ton of inspiration for visuals came from using TSstudio.
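
Several of the helpers named in the list above can be tried directly. The following is a minimal sketch rather than a full tour: the tibble `gappy` is a made-up toy dataset, and the argument values (daily step, 3-period window) are chosen only for illustration.

```r
library(timetk)

# Time arithmetic (lubridate-backed): offset a date by a time-based phrase
shifted <- "2012-01-01" %+time% "1 month 4 days"

# Time-based phrase: "2012-02" expands to every day in February 2012
feb_2012 <- tk_make_timeseries("2012-02", by = "day")

# Rolling mean via slider: 3-period window aligned to the right.
# With the default .partial = FALSE, incomplete windows yield NA.
rolled <- slidify_vec(1:10, mean, .period = 3, .align = "right")

# Padding via padr: fill the missing days in a toy series (hypothetical data)
gappy <- tibble::tibble(
    date  = as.Date(c("2020-01-01", "2020-01-02", "2020-01-05")),
    value = c(1, 2, 5)
)
padded <- pad_by_time(gappy, date, .by = "day")
```

In `padded`, the inserted rows for 2020-01-03 and 2020-01-04 carry NA in `value`, which is the starting point for ts_impute_vec() or step_ts_impute().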

Learning More

My Talk on High-Performance Time Series Forecasting

Time series is changing. Businesses now need 10,000+ time series forecasts every day. This is what I call a High-Performance Time Series Forecasting System (HPTSF) - Accurate, Robust, and Scalable Forecasting.

High-Performance Forecasting Systems will save companies MILLIONS of dollars. Imagine what will happen to your career if you can provide your organization a “High-Performance Time Series Forecasting System” (HPTSF System).

I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. If you are interested in learning Scalable High-Performance Forecasting Strategies, then take my course. You will learn:

  • Time Series Machine Learning (cutting-edge) with Modeltime - 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more)
  • NEW - Deep Learning with GluonTS (Competition Winners)
  • Time Series Preprocessing, Noise Reduction, & Anomaly Detection
  • Feature engineering using lagged variables & external regressors
  • Hyperparameter Tuning
  • Time series cross-validation
  • Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
  • Scalable Forecasting - Forecast 1000+ time series in parallel
  • and more.

Unlock the High-Performance Time Series Forecasting Course

Monthly Downloads: 44,212

Version: 2.6.1

License: GPL (>= 3)

Last Published: January 18th, 2021

Functions in timetk (2.6.1)

  • future_frame - Make future time series from existing
  • is_date_class - Check if an object is a date class
  • condense_period - Convert the Period to a Lower Periodicity (e.g. Go from Daily to Monthly)
  • fourier_vec - Fourier Series
  • filter_period - Apply filtering expressions inside periods (windows)
  • filter_by_time - Filter (for Time-Series Data)
  • diff_vec - Differencing Transformation
  • box_cox_vec - Box Cox Transformation
  • between_time - Between (For Time Series): Range detection for date or date-time sequences
  • mutate_by_time - Mutate (for Time Series Data)
  • m4_monthly - Sample of 4 Monthly Time Series Datasets from the M4 Competition
  • plot_acf_diagnostics - Visualize the ACF, PACF, and CCFs for One or More Time Series
  • standardize_vec - Standardize to Mean 0, Standard Deviation 1 (Center & Scale)
  • step_box_cox - Box-Cox Transformation using Forecast Methods
  • normalize_vec - Normalize to Range (0, 1)
  • plot_anomaly_diagnostics - Visualize Anomalies for One or More Time Series
  • m4_yearly - Sample of 4 Yearly Time Series Datasets from the M4 Competition
  • log_interval_vec - Log-Interval Transformation for Constrained Interval Forecasting
  • plot_time_series - Interactive Plotting for One or More Time Series
  • lag_vec - Lag Transformation
  • m4_weekly - Sample of 4 Weekly Time Series Datasets from the M4 Competition
  • m4_quarterly - Sample of 4 Quarterly Time Series Datasets from the M4 Competition
  • bike_sharing_daily - Daily Bike Sharing Data
  • step_timeseries_signature - Time Series Feature (Signature) Generator
  • tk_acf_diagnostics - Group-wise ACF, PACF, and CCF Data Preparation
  • tk_ts_.data.frame - Internal Functions Used in timetk
  • step_smooth - Smoothing Transformation using Loess
  • pad_by_time - Insert time series rows with regularly spaced timestamps
  • slidify_vec - Rolling Window Transformation
  • parse_date2 - Fast, flexible date and datetime parsing
  • plot_time_series_regression - Visualize a Time Series Linear Regression Formula
  • required_pkgs.step_box_cox - S3 methods for tracking which additional packages are needed for steps.
  • plot_time_series_cv_plan - Visualize a Time Series Resample Plan
  • tk_augment_lags - Add many lags to the data
  • taylor_30_min - Half-hourly electricity demand
  • tk_augment_fourier - Add many fourier series to the data
  • tidyeval - Tidy eval helpers
  • tk_augment_holiday - Add many holiday features to the data
  • slice_period - Apply slice inside periods (windows)
  • slidify - Create a rolling (sliding) version of any function
  • step_ts_clean - Clean Outliers and Missing Data for Time Series
  • step_ts_impute - Missing Data Imputation for Time Series
  • time_arithmetic - Add / Subtract (For Time Series)
  • step_diff - Create a differenced predictor
  • step_fourier - Fourier Features for Modeling Seasonality
  • tk_augment_slidify - Add many rolling window calculations to the data
  • tk_index - Extract an index of date or datetime from time series objects, models, forecasts
  • m4_daily - Sample of 4 Daily Time Series Datasets from the M4 Competition
  • smooth_vec - Smoothing Transformation using Loess
  • tk_summary_diagnostics - Group-wise Time Series Summary
  • walmart_sales_weekly - Sample Time Series Retail Data from the Walmart Recruiting Store Sales Forecasting Competition
  • tk_make_future_timeseries - Make future time series from existing
  • tk_tbl - Coerce time-series objects to tibble.
  • step_slidify - Slidify Rolling Window Transformation
  • tk_xts - Coerce time series objects and tibbles with date/date-time columns to xts.
  • step_slidify_augment - Slidify Rolling Window Transformation (Augmented Version)
  • tk_ts - Coerce time series objects and tibbles with date/date-time columns to ts.
  • plot_seasonal_diagnostics - Visualize Multiple Seasonality Features for One or More Time Series
  • step_ts_pad - Pad: Add rows to fill gaps and go from low to high frequency
  • time_series_cv - Time Series Cross Validation
  • plot_stl_diagnostics - Visualize STL Decomposition Features for One or More Time Series
  • tk_anomaly_diagnostics - Automatic group-wise Anomaly Detection by STL Decomposition
  • m4_hourly - Sample of 4 Hourly Time Series Datasets from the M4 Competition
  • step_holiday_signature - Holiday Feature (Signature) Generator
  • set_tk_time_scale_template - Get and modify the Time Scale Template
  • step_log_interval - Log Interval Transformation for Constrained Interval Forecasting
  • tk_augment_timeseries - Add many time series features to the data
  • tk_get_frequency - Automatic frequency and trend calculation from a time series index
  • tk_augment_differences - Add many differenced columns to the data
  • tk_zoo - Coerce time series objects and tibbles with date/date-time columns to zoo.
  • tk_zooreg - Coerce time series objects and tibbles with date/date-time columns to zooreg.
  • tk_make_holiday_sequence - Make daily Holiday and Weekend date sequences
  • tk_make_timeseries - Intelligent date and date-time sequence creation
  • tk_get_timeseries_unit_frequency - Get the timeseries unit frequency for the primary time scales
  • tk_get_timeseries_variables - Get date or datetime variables (column names)
  • time_series_split - Simple Training/Test Set Splitting for Time Series
  • summarise_by_time - Summarise (for Time Series Data)
  • tk_time_series_cv_plan - Time Series Resample Plan Data Preparation
  • ts_clean_vec - Replace Outliers & Missing Values in a Time Series
  • wikipedia_traffic_daily - Sample Daily Time Series Data from the Web Traffic Forecasting (Wikipedia) Competition
  • ts_impute_vec - Missing Value Imputation for Time Series
  • timetk - timetk: a toolkit for time series
  • tk_get_holiday - Get holiday features from a time-series index
  • tk_get_timeseries - Get date features from a time-series index
  • tk_stl_diagnostics - Group-wise STL Decomposition (Season, Trend, Remainder)
  • tk_seasonal_diagnostics - Group-wise Seasonality Data Preparation