Removes the seasonal component from the time series, performs imputation on the deseasonalized series and afterwards adds the seasonal component again.
na_seadec(
x,
algorithm = "interpolation",
find_frequency = FALSE,
maxgap = Inf,
...
)
Vector (vector) or Time Series (ts) object (dependent on given input at parameter x)
Numeric Vector (vector) or Time Series (ts) object in which missing values shall be replaced
Algorithm to be used after decomposition. Accepts the following input:
"interpolation" - Imputation by Interpolation (default choice)
"locf" - Imputation by Last Observation Carried Forward
"mean" - Imputation by Mean Value
"random" - Imputation by Random Sample
"kalman" - Imputation by Kalman Smoothing and State Space Models
"ma" - Imputation by Weighted Moving Average
If TRUE the algorithm will try to estimate the frequency of the time-series automatically.
Maximum number of successive NAs to still perform imputation on. Default setting is to replace all NAs without restrictions. With this option set, runs of consecutive NAs that are longer than 'maxgap' will be left NA. This option mostly makes sense if you want to treat long runs of NA separately afterwards.
Additional parameters that can be passed through to these algorithms.
See na_interpolation, na_locf, na_random, na_mean for parameter options.
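The maxgap behaviour described above can be illustrated with a small base R sketch. This is not the package's implementation: the vector `x`, the helper variables `pos`, `obs`, `runs` and `too_long`, and the use of `approx` as a stand-in imputation step are all assumptions made for illustration.

```r
# Illustrative sketch only: mimic the documented maxgap rule in base R.
x <- c(1, NA, 3, NA, NA, NA, 7, 8, NA, NA, 11)
maxgap <- 2

# Stand-in imputation step: linear interpolation over every gap
pos <- seq_along(x)
obs <- !is.na(x)
filled <- approx(pos[obs], x[obs], xout = pos)$y

# Identify runs of consecutive NAs in the input and flag runs longer than maxgap
runs <- rle(is.na(x))
too_long <- rep(runs$values & runs$lengths > maxgap, times = runs$lengths)

# Positions inside over-long runs are set back to NA
filled[too_long] <- NA
filled
# 1 2 3 NA NA NA 7 8 9 10 11
```

Here the run of three NAs exceeds maxgap = 2 and stays NA, while the single NA and the run of two are filled.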
Steffen Moritz
The algorithm first performs a Seasonal Decomposition of Time Series by Loess
via stl, decomposing the time series into seasonal, trend and irregular
components. The seasonal component is then removed (subtracted) from the original series.
As a second step, the selected imputation algorithm (e.g. na_locf, na_ma, ...) is applied
to the deseasonalized series. Thus, the algorithm can work without being affected by seasonal
patterns. After filling the NA gaps, the seasonal component is added to the deseasonalized
series again.
Implementation details:
A paper about the STL Decomposition procedure is linked in the references.
Since stl only works with complete data, the initial NA values are temporarily filled
via linear interpolation in order to perform the decomposition. These temporarily imputed
values are replaced with NAs again after obtaining the decomposition for the non-NA
observations. STL decomposition is run with robust = TRUE and s.window = 11. Additionally,
STL decomposition needs a preset frequency. This can be supplied through the frequency
set in the input ts object or by setting 'find_frequency = TRUE' in order to find
an appropriate frequency for the time series. The find_frequency parameter internally uses
findfrequency, which performs a spectral analysis of the time series
to identify a suitable frequency. Using find_frequency will update the previously set
frequency of a ts object to the newly found frequency. The default is 'find_frequency = FALSE',
which gives a warning if no seasonality is set for the supplied time series object.
If no seasonality is set and find_frequency is not TRUE, the function proceeds without
decomposition and simply applies the selected secondary algorithm to the original time series,
which still includes seasonality.
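The need for a preset frequency can be seen directly with stl in base R. The sketch below is only an illustration of that requirement, not of the checks na_seadec performs internally; the variable names `v`, `x_plain`, `x_monthly`, `res` and `fit` are assumptions for this example.

```r
# Illustrative sketch only: stl() needs a seasonal period to decompose a series.
v <- as.numeric(AirPassengers)    # plain numeric vector, no frequency attached

x_plain <- ts(v)                  # frequency defaults to 1
frequency(x_plain)

# Without a period of at least 2, stl() stops with an error
res <- try(stl(x_plain, s.window = 11, robust = TRUE), silent = TRUE)
inherits(res, "try-error")

# Declaring the period when building the ts object makes decomposition possible
x_monthly <- ts(v, frequency = 12)
fit <- stl(x_monthly, s.window = 11, robust = TRUE)
colnames(fit$time.series)
```

For a plain vector or a ts object with frequency 1, either construct the series with an explicit frequency as shown, or let the function estimate one via find_frequency = TRUE.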
R. B. Cleveland, W. S. Cleveland, J. E. McRae, and I. Terpenning (1990) STL: A Seasonal-Trend Decomposition Procedure Based on Loess. Journal of Official Statistics, 6, 3–73.
na_interpolation, na_kalman, na_locf, na_ma, na_mean, na_random, na_replace, na_seasplit
# Example 1: Perform seasonal imputation using algorithm = "interpolation"
na_seadec(tsAirgap, algorithm = "interpolation")
# Example 2: Perform seasonal imputation using algorithm = "mean"
na_seadec(tsAirgap, algorithm = "mean")
# Example 3: Same as example 1, just written with pipe operator
tsAirgap %>% na_seadec(algorithm = "interpolation")