modelCast: Forecasting Function for Trend-Stationary Time Series

Description

Point forecasts and the respective forecasting intervals for trend-stationary time series are calculated.

Usage

modelCast(
  obj,
  p = NULL,
  q = NULL,
  h = 1,
  method = c("norm", "boot"),
  alpha = 0.95,
  it = 10000,
  n.start = 1000,
  pb = TRUE,
  cores = future::availableCores(),
  np.fcast = c("lin", "const"),
  export.error = FALSE,
  plot = FALSE,
  ...
)

Value

The function returns a $3$ by $h$ matrix with its columns representing the future time points and the point forecasts, the lower bounds of the forecasting intervals and the upper bounds of the forecasting intervals as the rows. If the argument plot is set to TRUE, a plot of the forecasting results is created.

If export.error = TRUE is selected, a list with the following elements is returned instead.

fcast: the $3$ by $h$ forecasting matrix with point forecasts and bounds of the forecasting intervals.

error: an it by $h$ matrix, where each column represents a future time point $n + 1, n + 2, ..., n + h$; in each column the respective it simulated forecasting errors are saved.

Arguments

obj: an object of class smoots; must be the output of a trend estimation process and not of a first or second derivative estimation process.
p: an integer value $\geq 0$ that defines the AR order $p$ of the underlying ARMA($p,q$) model within the rest term (see the section Details for more information); if no value is passed to p but one is passed to q, p is set to 0; if both p and q are NULL, optimal orders following the BIC for $0 \leq p,q \leq 5$ are chosen; is set to NULL by default; decimal numbers will be rounded off to integers.
q: an integer value $\geq 0$ that defines the MA order $q$ of the underlying ARMA($p,q$) model within the rest term (see the section Details for more information); if no value is passed to q but one is passed to p, q is set to 0; if both p and q are NULL, optimal orders following the BIC for $0 \leq p,q \leq 5$ are chosen; is set to NULL by default; decimal numbers will be rounded off to integers.
h: an integer that represents the forecasting horizon; if $n$ is the number of observations, point forecasts and forecasting intervals will be obtained for the time points $n + 1$ to $n + h$; is set to h = 1 by default; decimal numbers will be rounded off to integers.
method: a character object; defines the method used for the calculation of the forecasting intervals; with "norm" the intervals are obtained under the assumption of normally distributed innovations; with "boot" the intervals are obtained via a bootstrap; is set to "norm" by default.
alpha: a numeric vector of length 1 with $0 < $ alpha $ < 1$; the forecasting intervals will be obtained for the ($100$alpha)-percent confidence level; is set to alpha = 0.95 by default, i.e., a $95$-percent confidence level.
it: an integer that represents the total number of iterations, i.e., the number of simulated series; is set to 10000 by default; only necessary, if method = "boot"; decimal numbers will be rounded off to integers.
n.start: an integer that defines the 'burn-in' number of observations for the simulated ARMA series via bootstrap; is set to 1000 by default; only necessary, if method = "boot"; decimal numbers will be rounded off to integers.
pb: a logical value; for pb = TRUE, a progress bar will be shown in the console, if method = "boot".
cores: an integer value >0 that states the number of (logical) cores to use in the bootstrap (or NULL); the default is the maximum number of available cores (via future::availableCores); for cores = NULL, parallel computation is disabled.
np.fcast: a character object; defines the forecasting method used for the nonparametric trend; for np.fcast = "lin", the trend is extrapolated linearly based on the last two trend estimates; for np.fcast = "const", the last trend estimate is used as a constant estimate for future values; is set to "lin" by default.
export.error: a single logical value; if the argument is set to TRUE and if also method = "boot", a list is returned instead of a matrix (FALSE); the first element of the list is the usual forecasting matrix whereas the second element is a matrix with h columns, where each column represents the calculated forecasting errors for the respective future time point $n + 1, n + 2, ..., n + h$; is set to FALSE by default.
plot: a logical value that controls the graphical output; for plot = TRUE, the original series with the obtained point forecasts as well as the forecasting intervals will be plotted; for the default plot = FALSE, no plot will be created.
...: additional arguments for the standard plot function, e.g., xlim, type, ... ; arguments with respect to plotted graphs, e.g., the argument col, only affect the original series; please note that, in accordance with the argument x (lower case) of the standard plot function, an additional numeric vector with time points can be passed via the argument x (lower case). x should cover the sample observations only, i.e., length(x) == length(obj$orig) should be TRUE, as future time points will be calculated automatically.

Author

Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn University),
Package Creator and Maintainer

Details

This function is part of the smoots package and was implemented under version 1.1.0. The point forecasts and forecasting intervals are obtained based on the additive nonparametric regression model $$y_t = m(x_t) + \epsilon_t,$$ where $y_t$ is the observed time series with equidistant design, $x_t$ is the rescaled time on the interval $[0, 1]$, $m(x_t)$ is a smooth trend function and $\epsilon_t$ are stationary errors with $E(\epsilon_t) = 0$ and short-range dependence (see also Beran and Feng, 2002). Thus, we assume $y_t$ to be a trend-stationary time series. Furthermore, we assume that the rest term $\epsilon_t$ follows an ARMA($p,q$) model $$\epsilon_t = \zeta_t + \beta_1 \epsilon_{t-1} + ... + \beta_p \epsilon_{t-p} + \alpha_1 \zeta_{t-1} + ... + \alpha_q \zeta_{t-q},$$ where $\alpha_j$, $j = 1, 2, ..., q$, and $\beta_i$, $i = 1, 2, ..., p$, are real numbers and the random variables $\zeta_t$ are i.i.d. (identically and independently distributed) with zero mean and constant variance.
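As a rough illustration of the model above, the following base R sketch simulates a trend-stationary series. The trend function, the ARMA(1,1) coefficients and the sample size are arbitrary assumptions chosen for illustration only; they are not taken from the package.

```r
set.seed(1)

n <- 200
x <- (1:n) / n                 # rescaled time on [0, 1]
m <- 2 + 3 * x^2               # hypothetical smooth trend m(x_t)

# Stationary ARMA(1,1) rest term with zero-mean i.i.d. innovations;
# the first n.start simulated values are discarded as burn-in.
eps <- stats::arima.sim(
  model = list(ar = 0.5, ma = 0.3),
  n = n,
  n.start = 1000
)

y <- m + as.numeric(eps)       # observed series y_t = m(x_t) + eps_t
```

A series like y could then be passed to a trend estimation function of the package before calling modelCast.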

The point forecasts and forecasting intervals for the future periods $n + 1, n + 2, ..., n + h$ will be obtained. With respect to the point forecasts of $\epsilon_t$, i.e., $\hat{\epsilon}_{n+k}$, where $k = 1, 2, ..., h$, $$\hat{\epsilon}_{n+k} = \sum_{i=1}^{p} \hat{\beta}_i \epsilon_{n+k-i} + \sum_{j=1}^{q} \hat{\alpha}_j \hat{\zeta}_{n+k-j}$$ with $\epsilon_{n+k-i} = \hat{\epsilon}_{n+k-i}$ for $n+k-i > n$ and $\hat{\zeta}_{n+k-j} = E(\zeta_t) = 0$ for $n+k-j > n$ will be applied. In practice, this procedure will not be applied directly to $\epsilon_t$ but to $y_t - \hat{m}(x_t)$.
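The recursion above can be sketched in base R as follows. The function name arma_point_forecast and its arguments are hypothetical and only mirror the notation of the formula; this is not the package's internal implementation.

```r
# eps:   observed (detrended) values eps_1, ..., eps_n
# zeta:  estimated innovations zeta_1, ..., zeta_n
# beta:  AR coefficients beta_1, ..., beta_p
# alpha: MA coefficients alpha_1, ..., alpha_q
arma_point_forecast <- function(eps, zeta, beta = numeric(0),
                                alpha = numeric(0), h = 1) {
  n <- length(eps)
  stopifnot(length(zeta) == n, n >= max(length(beta), length(alpha)))

  eps_ext <- c(eps, numeric(h))    # observations followed by forecasts
  zeta_ext <- c(zeta, numeric(h))  # future innovations set to E(zeta_t) = 0

  for (k in seq_len(h)) {
    t <- n + k
    ar_part <- sum(beta * eps_ext[t - seq_along(beta)])
    ma_part <- sum(alpha * zeta_ext[t - seq_along(alpha)])
    eps_ext[t] <- ar_part + ma_part
  }

  eps_ext[n + seq_len(h)]
}

# AR(1) with beta_1 = 0.5 and last observation 2
fc_ar <- arma_point_forecast(eps = c(0, 2), zeta = c(0, 0), beta = 0.5, h = 3)
```

For a pure AR(1) process the forecasts decay geometrically towards zero, whereas the MA part only contributes for the first $q$ steps.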

The point forecasts of the nonparametric trend are simply obtained following the proposal by Fritz et al. (forthcoming) by $$\hat{m}(x_{n+k}) = \hat{m}(x_n) + Dk(\hat{m}(x_n) - \hat{m}(x_{n-1})),$$ where $D$ is a dummy variable that is either equal to the constant value $1$ or $0$. Consequently, if $D = 0$, $\hat{m}(x_{n})$, i.e., the last trend estimate, is used as a constant estimate for the future. However, if $D = 1$, the trend is extrapolated linearly. The point forecast for the whole component model is then given by $$\hat{y}_{n+k} = \hat{m}(x_{n+k}) + \hat{\epsilon}_{n+k},$$ i.e., it is equal to the sum of the point forecasts of the individual components.
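A minimal sketch of this extrapolation rule is given below. The helper extrapolate_trend is hypothetical; it only reuses the option names "lin" and "const" of the argument np.fcast to select $D = 1$ or $D = 0$.

```r
# m_hat: vector of trend estimates m_hat(x_1), ..., m_hat(x_n)
extrapolate_trend <- function(m_hat, h = 1, np.fcast = c("lin", "const")) {
  np.fcast <- match.arg(np.fcast)
  n <- length(m_hat)
  stopifnot(n >= 2)

  D <- if (np.fcast == "lin") 1 else 0   # dummy variable from the formula
  k <- seq_len(h)

  m_hat[n] + D * k * (m_hat[n] - m_hat[n - 1])
}

trend_lin <- extrapolate_trend(c(1, 2, 4), h = 3, np.fcast = "lin")
trend_const <- extrapolate_trend(c(1, 2, 4), h = 3, np.fcast = "const")
```

Adding the forecasts of the rest term to these values gives the point forecasts $\hat{y}_{n+k}$ of the full model.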

Analogously to the point forecasts, the forecasting intervals are the sum of the forecasting intervals of the individual components. To simplify the process, the forecasting error in $\hat{m}(x_{n+k})$, which is of order $O(n^{-2/5})$, is not considered (see Fritz et al. (forthcoming)), i.e., only the forecasting intervals with respect to the rest term $\epsilon_t$ will be calculated.

If the distribution of the innovations is non-normal or generally not further specified, bootstrapping the forecasting intervals is recommended. If they are however normally distributed or if it is at least assumed that they are, the forecasting errors are also approximately normally distributed with a quickly obtainable variance. For further details on the bootstrapping method, we refer the readers to bootCast, whereas more information on the calculation under normality can be found at normCast.
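To make the normal case more concrete, the sketch below uses the standard textbook result that the $k$-step forecasting error of an ARMA process has variance $\sigma^2 \sum_{j=0}^{k-1} \psi_j^2$, where $\psi_j$ are the coefficients of the MA($\infty$) representation and $\sigma^2$ is the innovation variance. Whether normCast computes the variance in exactly this way is an assumption; the helper norm_interval and the symbols $\psi_j$ and $\sigma^2$ are introduced here for illustration only.

```r
# point:  point forecasts for n + 1, ..., n + h
# ar, ma: AR and MA coefficients of the fitted ARMA model
# sigma2: innovation variance
norm_interval <- function(point, ar = numeric(0), ma = numeric(0),
                          sigma2 = 1, alpha = 0.95) {
  h <- length(point)

  # psi_0 = 1; further psi-weights via stats::ARMAtoMA (requires lag.max >= 1)
  psi <- if (h > 1) {
    c(1, stats::ARMAtoMA(ar = ar, ma = ma, lag.max = h - 1))
  } else {
    1
  }

  se <- sqrt(sigma2 * cumsum(psi^2))          # forecasting error std. deviations
  z <- stats::qnorm(1 - (1 - alpha) / 2)      # two-sided normal quantile

  rbind(fcast = point, lower = point - z * se, upper = point + z * se)
}

# AR(1) with coefficient 0.5 and unit innovation variance
iv <- norm_interval(point = c(1, 0.5, 0.25), ar = 0.5, sigma2 = 1)
```

The result has the same $3$ by $h$ layout as described in the section Value, with point forecasts, lower bounds and upper bounds as rows.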

In order to apply the function, a smoots object that was generated as the result of a trend estimation process needs to be passed to the argument obj. The arguments p and q represent the orders of the ARMA($p,q$) model that the error term $\epsilon_t$ is assumed to follow. If both arguments are set to NULL, which is the default setting, orders will be selected according to the Bayesian Information Criterion (BIC) for all possible combinations of $p,q = 0, 1, ..., 5$. Furthermore, the forecasting horizon can be adjusted by means of the argument h, so that point forecasts and forecasting intervals will be obtained for all time points $n + 1, n + 2, ..., n + h$.
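The BIC-based order selection can be imitated with a small grid search in base R. The helper select_arma_order is hypothetical; fitting without a mean term and using stats::BIC on the fitted objects are assumptions made for this sketch and need not match the package's internal computation.

```r
# res: detrended series, e.g. y_t minus the estimated trend
select_arma_order <- function(res, max_order = 5) {
  grid <- expand.grid(p = 0:max_order, q = 0:max_order)

  grid$bic <- vapply(seq_len(nrow(grid)), function(i) {
    fit <- tryCatch(
      stats::arima(res, order = c(grid$p[i], 0, grid$q[i]),
                   include.mean = FALSE, method = "CSS-ML"),
      error = function(e) NULL            # skip combinations that fail to fit
    )
    if (is.null(fit)) NA_real_ else stats::BIC(fit)
  }, numeric(1))

  grid[which.min(grid$bic), c("p", "q")]  # orders with the smallest BIC
}

set.seed(42)
res <- stats::arima.sim(model = list(ar = 0.6), n = 300)
best <- select_arma_order(res, max_order = 2)
```

A smaller max_order is used in the example call only to keep the run time short; the default grid corresponds to $p,q = 0, 1, ..., 5$.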

The function also allows for two calculation approaches for the forecasting intervals. Via the argument method, intervals can be obtained under the assumption that the ARMA innovations are normally distributed (method = "norm"). Alternatively, bootstrapped intervals can be obtained for unknown innovation distributions that are clearly non-Gaussian (method = "boot").

Another argument is alpha. By passing a value to this argument, the ($100$alpha)-percent confidence level for the forecasting intervals can be defined. If method = "boot" is selected, the additional arguments it and n.start can be adjusted. More specifically, it regulates the number of iterations of the bootstrap, whereas n.start sets the number of 'burn-in' observations in the simulated ARMA processes within the bootstrap that are omitted.
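One way to relate simulated forecasting errors to interval bounds is via empirical quantiles, sketched below. The helper interval_from_errors is hypothetical; it assumes that each error is defined as future value minus point forecast and that percentile bounds are used, which may differ from the exact procedure in bootCast.

```r
# point: point forecasts for n + 1, ..., n + h
# err:   it by h matrix of simulated forecasting errors
interval_from_errors <- function(point, err, alpha = 0.95) {
  stopifnot(is.matrix(err), ncol(err) == length(point))

  probs <- c((1 - alpha) / 2, 1 - (1 - alpha) / 2)
  q <- apply(err, 2, stats::quantile, probs = probs)  # 2 by h matrix

  rbind(fcast = point, lower = point + q[1, ], upper = point + q[2, ])
}

# Toy error matrix with it = 3 simulated errors for h = 2 time points
err <- cbind(c(-1, 0, 1), c(-2, 0, 2))
iv_boot <- interval_from_errors(point = c(10, 11), err = err)
```

A matrix of this it by $h$ shape is what the element error contains when export.error = TRUE is combined with method = "boot".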

Since this bootstrap approach for method = "boot" generally needs a lot of computation time, especially for series with high numbers of observations and when fitting models with many parameters, parallel computation of the bootstrap iterations is enabled. With cores, the number of cores can be defined with an integer. Nonetheless, for cores = NULL, no cluster is created and therefore the parallel computation is disabled. Note that the bootstrapped results are fully reproducible for all cluster sizes. The progress of the bootstrap can be observed in the R console, where a progress bar and the estimated remaining time are displayed for pb = TRUE.

Moreover, the argument np.fcast allows setting the forecasting method for the nonparametric trend function. As previously discussed, the two options are a linear extrapolation of the trend (np.fcast = "lin") and a constant continuation of the last estimated value of the trend (np.fcast = "const").

The function also implements the option to automatically create a plot of the forecasting results for plot = TRUE. This includes the feature to pass additional arguments of the standard plot function to modelCast (see also the section 'Examples').

NOTE:

Within this function, the arima function of the stats package with its method "CSS-ML" is used throughout for the estimation of ARMA models. Furthermore, to increase the performance, C++ code via the Rcpp and RcppArmadillo packages was implemented. Also, the future and future.apply packages are considered for parallel computation of bootstrap iterations. The progress of the bootstrap is shown via the progressr package.

References

Beran, J. and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54(2), 291-311.

Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.

Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.

Feng, Y., Gries, T., Fritz, M., Letmathe, S. and Schulz, D. (2020). Diagnosing the trend and bootstrapping the forecasting intervals using a semiparametric ARMA. Discussion Paper. Paderborn University. Unpublished.

Fritz, M., Forstinger, S., Feng, Y., and Gries, T. (forthcoming). Forecasting economic growth processes for developing economies. Unpublished.

Examples

# \donttest{
X <- log(smoots::gdpUS$GDP)
NPest <- smoots::msmooth(X)
modelCast(NPest, h = 5, plot = TRUE, xlim = c(261, 295), type = "b",
 col = "deepskyblue4", lty = 3, pch = 20, main = "Exemplary title")
# }
