Learn R Programming

timetk (version 2.8.1)

tk_seasonal_diagnostics: Group-wise Seasonality Data Preparation

Description

tk_seasonal_diagnostics() is the preprocessor for plot_seasonal_diagnostics(). It helps by automating feature collection for time series seasonality analysis.

Usage

tk_seasonal_diagnostics(.data, .date_var, .value, .feature_set = "auto")

Value

A tibble or data.frame with seasonal features

Arguments

.data

A tibble or data.frame with a time-based column

.date_var

A column containing either date or date-time values

.value

A column containing numeric values

.feature_set

One or multiple selections to analyze for seasonality. Choices include:

  • "auto" - Automatically selects features based on the time stamps and length of the series.

  • "second" - Good for analyzing seasonality by second of each minute.

  • "minute" - Good for analyzing seasonality by minute of the hour

  • "hour" - Good for analyzing seasonality by hour of the day

  • "wday.lbl" - Labeled weekdays. Good for analyzing seasonality by day of the week.

  • "week" - Good for analyzing seasonality by week of the year.

  • "month.lbl" - Labeled months. Good for analyzing seasonality by month of the year.

  • "quarter" - Good for analyzing seasonality by quarter of the year

  • "year" - Good for analyzing seasonality over multiple years.

Details

Automatic Feature Selection

Internal calculations are performed to detect a sub-range of features to include useing the following logic:

  • The minimum feature is selected based on the median difference between consecutive timestamps

  • The maximum feature is selected based on having 2 full periods.

Example: Hourly timestamp data that lasts more than 2 weeks will have the following features: "hour", "wday.lbl", and "week".

Scalable with Grouped Data Frames

This function respects grouped data.frame and tibbles that were made with dplyr::group_by().

For grouped data, the automatic feature selection returned is a collection of all features within the sub-groups. This means extra features are returned even though they may be meaningless for some of the groups.

Transformations

The .value parameter respects transformations (e.g. .value = log(sales)).

Examples

Run this code
library(dplyr)
library(timetk)

# ---- GROUPED EXAMPLES ----

# Hourly Data
m4_hourly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, value)

# Monthly Data
m4_monthly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, value)

# ---- TRANSFORMATION ----

m4_weekly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, log(value))

# ---- CUSTOM FEATURE SELECTION ----

m4_hourly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, value, .feature_set = c("hour", "week"))

Run the code above in your browser using DataLab