imputeTS (version 3.3)

na_seadec: Seasonally Decomposed Missing Value Imputation

Description

Removes the seasonal component from the time series, performs imputation on the deseasonalized series and afterwards adds the seasonal component again.

Usage

na_seadec(
  x,
  algorithm = "interpolation",
  find_frequency = FALSE,
  maxgap = Inf,
  ...
)

Value

Vector (vector) or Time Series (ts) object, depending on the type of input given for parameter x

Arguments

x

Numeric Vector (vector) or Time Series (ts) object in which missing values shall be replaced

algorithm

Algorithm to be used after decomposition. Accepts the following input:

  • "interpolation" - Imputation by Interpolation (default choice)

  • "locf" - Imputation by Last Observation Carried Forward

  • "mean" - Imputation by Mean Value

  • "random" - Imputation by Random Sample

  • "kalman" - Imputation by Kalman Smoothing and State Space Models

  • "ma" - Imputation by Weighted Moving Average

find_frequency

If TRUE the algorithm will try to estimate the frequency of the time-series automatically.

maxgap

Maximum number of successive NAs to still perform imputation on. The default setting replaces all NAs without restrictions. With this option set, runs of consecutive NAs that are longer than 'maxgap' are left as NA. This option mostly makes sense if you want to treat long runs of NA separately afterwards.

...

Additional parameters for these algorithms that can be passed through. Look at na_interpolation, na_locf, na_random, na_mean for parameter options.
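
The arguments above can be illustrated with a short sketch. It assumes the imputeTS package is installed and uses the tsAirgap dataset from the examples. The extra gap at positions 50 to 57 is artificial and chosen only for illustration. The option argument belongs to na_interpolation and is shown here as one case of a parameter passed through via '...'.

```r
library(imputeTS)

# tsAirgap ships with imputeTS; copy it so the original stays untouched.
x <- tsAirgap

# Artificial long gap (positions are an assumption made for this sketch).
x[50:57] <- NA

# Default maxgap = Inf: every NA is imputed, however long the run.
full <- na_seadec(x, algorithm = "interpolation")

# maxgap = 3: runs of more than three consecutive NAs are left as NA.
capped <- na_seadec(x, algorithm = "interpolation", maxgap = 3)

# Arguments in '...' are forwarded to the selected algorithm;
# option = "spline" is an argument of na_interpolation.
spline <- na_seadec(x, algorithm = "interpolation", option = "spline")

anyNA(full)                # checks whether any NA is left after full imputation
all(is.na(capped[50:57]))  # checks whether the long artificial gap was left alone
```

Keeping long gaps as NA in this way makes it possible to handle them in a later, separate step.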

Author

Steffen Moritz

Details

The algorithm first performs a Seasonal Decomposition of Time Series by Loess via stl, which decomposes the time series into seasonal, trend and irregular components. The seasonal component is then removed (subtracted) from the original series. As a second step, the selected imputation algorithm, e.g. na_locf, na_ma, ..., is applied to the deseasonalized series. Thus, the algorithm can work without being affected by seasonal patterns. After filling the NA gaps, the seasonal component is added to the deseasonalized series again.
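
The steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's actual source code. It uses the built-in AirPassengers series with made-up NA positions, and linear interpolation via approx stands in for whichever imputation algorithm is selected. The stl settings robust = TRUE and s.window = 11 are the ones named in the implementation details of this page.

```r
# Base R only (stats::stl, stats::approx); no extra packages needed.
x <- AirPassengers
x[c(20, 21, 60, 100)] <- NA   # NA positions are made up for this sketch
missing <- is.na(x)
idx <- seq_along(x)

# stl() needs complete data, so fill the gaps provisionally
# by linear interpolation.
temp <- x
temp[missing] <- approx(idx[!missing], x[!missing], xout = idx[missing])$y

# Decompose and extract the seasonal component.
fit <- stl(temp, s.window = 11, robust = TRUE)
seasonal <- fit$time.series[, "seasonal"]

# Step 1: subtract the seasonal component and restore the NAs.
deseason <- temp - seasonal
deseason[missing] <- NA

# Step 2: impute the deseasonalized series (linear interpolation here).
deseason[missing] <- approx(idx[!missing], deseason[!missing],
                            xout = idx[missing])$y

# Step 3: add the seasonal component back.
imputed <- deseason + seasonal
```

Observed values pass through unchanged up to floating point rounding, because the seasonal component that was subtracted is added back. Only the former NA positions receive new values.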

Implementation details: A paper about the STL decomposition procedure is listed in the references. Since stl only works with complete data, the NAs in the input are temporarily filled via linear interpolation in order to perform the decomposition. These temporarily imputed values are replaced with NAs again once the decomposition for the non-NA observations has been obtained. STL decomposition is run with robust = TRUE and s.window = 11.

Additionally, applying STL decomposition needs a preset frequency. It can be supplied through the frequency set in the input ts object or estimated by setting 'find_frequency = TRUE'. The find_frequency parameter internally uses findfrequency, which performs a spectral analysis of the time series to identify a suitable frequency. Using find_frequency replaces the previously set frequency of a ts object with the newly found frequency. The default is 'find_frequency = FALSE', which gives a warning if no seasonality is set for the supplied time series object. If no seasonality is set and find_frequency is not TRUE, the function goes on without decomposition and applies the selected secondary algorithm directly to the original time series, which still includes seasonality.
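
The frequency handling described above can be tried out as follows. The sketch assumes imputeTS is installed and uses tsAirgap converted to a plain numeric vector, so that no frequency information is attached. Which frequency findfrequency detects depends on the data, so no particular result is claimed here.

```r
library(imputeTS)

# A plain numeric vector carries no frequency information.
v <- as.numeric(tsAirgap)

# Default find_frequency = FALSE: a warning is given and the selected
# algorithm runs without decomposition. The warning is silenced here
# only to keep the output short.
plain <- suppressWarnings(na_seadec(v, algorithm = "interpolation"))

# find_frequency = TRUE: the frequency is estimated before decomposing.
# A warning may still appear if no seasonal pattern is detected.
estimated <- na_seadec(v, algorithm = "interpolation", find_frequency = TRUE)

# Alternative: set the frequency explicitly on a ts object
# (12 assumes monthly data, as for tsAirgap).
monthly <- na_seadec(ts(v, frequency = 12), algorithm = "interpolation")
```

Setting the frequency explicitly is the more predictable choice when the seasonal period is known in advance. find_frequency is mainly useful when it is not.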

References

R. B. Cleveland, W. S. Cleveland, J. E. McRae, and I. Terpenning (1990) STL: A Seasonal-Trend Decomposition Procedure Based on Loess. Journal of Official Statistics, 6, 3–73.

See Also

na_interpolation, na_kalman, na_locf, na_ma, na_mean, na_random, na_replace, na_seasplit

Examples

# Load the package that provides na_seadec and the tsAirgap dataset
library(imputeTS)

# Example 1: Perform seasonal imputation using algorithm = "interpolation"
na_seadec(tsAirgap, algorithm = "interpolation")

# Example 2: Perform seasonal imputation using algorithm = "mean"
na_seadec(tsAirgap, algorithm = "mean")

# Example 3: Same as example 1, just written with pipe operator
tsAirgap %>% na_seadec(algorithm = "interpolation")
