fitGSMAR: Estimate Gaussian or Student's t Mixture Autoregressive model

Description

fitGSMAR estimates a GMAR, StMAR, or G-StMAR model in two phases. In the first phase, a genetic algorithm is employed to find starting values for a gradient-based method. In the second phase, a gradient-based variable metric algorithm is used to converge accurately to a local maximum or a saddle point near each starting value. Multiple rounds of estimation are run in parallel.

Usage

fitGSMAR(
  data,
  p,
  M,
  model = c("GMAR", "StMAR", "G-StMAR"),
  restricted = FALSE,
  constraints = NULL,
  conditional = TRUE,
  parametrization = c("intercept", "mean"),
  ncalls = round(10 + 9 * log(sum(M))),
  ncores = min(2, ncalls, parallel::detectCores()),
  maxit = 300,
  seeds = NULL,
  printRes = TRUE,
  runTests = FALSE,
  ...
)

Arguments

data

a numeric vector or class 'ts' object containing the data. NA values are not supported.

p

a positive integer specifying the autoregressive order of the model.

M

For GMAR and StMAR models:: a positive integer specifying the number of mixture components.
For G-StMAR models:: a size (2x1) integer vector specifying the number of GMAR type components M1 in the first element and StMAR type components M2 in the second element. The total number of mixture components is M=M1+M2.

model

is a "GMAR", "StMAR", or "G-StMAR" model considered? In the G-StMAR model, the first M1 components are GMAR type and the remaining M2 components are StMAR type.

restricted

a logical argument stating whether the AR coefficients $\phi_{m,1},...,\phi_{m,p}$ are restricted to be the same for all regimes.

constraints

specifies linear constraints applied to the autoregressive parameters.

For non-restricted models:: a list of size $(p \times q_{m})$ constraint matrices $C_{m}$ of full column rank satisfying $\phi_{m} = C_{m}\psi_{m}$ for all $m=1,...,M$, where $\phi_{m} = (\phi_{m,1},...,\phi_{m,p})$ and $\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})$.
For restricted models:: a size $(p \times q)$ constraint matrix $C$ of full column rank satisfying $\phi = C\psi$, where $\phi = (\phi_{1},...,\phi_{p})$ and $\psi = (\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient. Note that regardless of any constraints, the nominal autoregressive order is always p for all regimes. Ignore or set to NULL if applying linear constraints is not desired.
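The relation between the constrained and unconstrained AR coefficients can be checked in base R. The following is only a sketch: the values in psi_list are made up for illustration, and the names C_list, psi_list and phi_list are hypothetical rather than part of the package.

```r
# Constraint matrices for a two-regime model with p = 2 (illustrative values).
# Regime 1: unconstrained, C_1 is the (2 x 2) identity matrix, so q_1 = 2.
# Regime 2: second AR coefficient fixed to zero, C_2 is (2 x 1), so q_2 = 1.
C_list <- list(diag(2), matrix(c(1, 0), ncol = 1))

# Hypothetical free parameters psi_m, one vector per regime.
psi_list <- list(c(0.5, 0.2), 0.7)

# Recover the AR coefficients phi_m = C_m %*% psi_m for each regime.
phi_list <- Map(function(C, psi) as.vector(C %*% psi), C_list, psi_list)

# Each constraint matrix must have full column rank.
full_rank <- vapply(C_list, function(C) qr(C)$rank == ncol(C), logical(1))

print(phi_list)
print(full_rank)
```

Both recovered vectors have length p = 2, which reflects the note above that the nominal autoregressive order stays at p regardless of the constraints.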

conditional

a logical argument specifying whether the conditional or exact log-likelihood function should be used.
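The difference between the two log-likelihoods can be illustrated with a toy single-regime Gaussian AR(1) model. This is a simplified sketch under that assumption, not the package's likelihood code: the conditional version treats the first p observations as fixed initial values, whereas the exact version also includes their stationary density. All parameter values are made up.

```r
# Toy single-regime Gaussian AR(1) model with illustrative parameter values.
set.seed(1)
phi0 <- 0.5; phi1 <- 0.6; sigma2 <- 1
n <- 200
mu <- phi0 / (1 - phi1)                 # stationary mean
stat_sd <- sqrt(sigma2 / (1 - phi1^2))  # stationary standard deviation

# Simulate the series, drawing the initial value from the stationary distribution.
y <- numeric(n)
y[1] <- rnorm(1, mean = mu, sd = stat_sd)
for (t in 2:n) y[t] <- phi0 + phi1 * y[t - 1] + rnorm(1, sd = sqrt(sigma2))

# Conditional log-likelihood: the first p = 1 observation is treated as fixed.
loglik_cond <- sum(dnorm(y[-1], mean = phi0 + phi1 * y[-n],
                         sd = sqrt(sigma2), log = TRUE))

# Exact log-likelihood: add the log stationary density of the initial value.
loglik_init <- dnorm(y[1], mean = mu, sd = stat_sd, log = TRUE)
loglik_exact <- loglik_init + loglik_cond

print(c(conditional = loglik_cond, exact = loglik_exact))
```

The two values differ only by the contribution of the initial observation, so the difference becomes relatively small for long series.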

parametrization

is the model parametrized with the intercepts $\phi_{m,0}$ ("intercept") or with the regimewise means $\mu_m = \phi_{m,0}/(1-\sum_{i=1}^{p}\phi_{m,i})$ ("mean")?
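The mapping between the two parametrizations is a one-line computation. The sketch below uses made-up values for a single regime with p = 2, and the variable names are hypothetical.

```r
# Convert between the intercept and mean parametrizations for one regime
# (illustrative values, not estimates from any data).
phi0 <- 0.4          # intercept phi_{m,0}
phi  <- c(0.5, 0.2)  # AR coefficients phi_{m,1}, ..., phi_{m,p}

# Regime mean: mu_m = phi_{m,0} / (1 - sum_i phi_{m,i})
mu <- phi0 / (1 - sum(phi))

# Back-transformation: phi_{m,0} = mu_m * (1 - sum_i phi_{m,i})
phi0_back <- mu * (1 - sum(phi))

print(c(mu = mu, phi0_back = phi0_back))
```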

ncalls

a positive integer specifying how many rounds of estimation should be conducted. The estimation results may vary from round to round because of multimodality of the log-likelihood function and the randomness associated with the genetic algorithm.
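As a quick check of how the default in the Usage section scales with the number of regimes, the expression can be evaluated directly. The wrapper name default_ncalls is hypothetical and used only for this sketch.

```r
# Default number of estimation rounds as given in the Usage section:
# ncalls = round(10 + 9 * log(sum(M)))
default_ncalls <- function(M) round(10 + 9 * log(sum(M)))

n1  <- default_ncalls(1)        # one regime
n2  <- default_ncalls(2)        # two regimes
n11 <- default_ncalls(c(1, 1))  # G-StMAR with M1 = 1 and M2 = 1, so sum(M) = 2
n3  <- default_ncalls(3)        # three regimes

print(c(n1, n2, n11, n3))
```

Because the default depends on sum(M), a G-StMAR model with M=c(1, 1) gets the same default as a two-regime GMAR or StMAR model.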

ncores

the number of CPU cores to be used in the estimation process.

maxit

the maximum number of iterations for the variable metric algorithm.

seeds

a length ncalls vector containing the random number generator seed for each call to the genetic algorithm, or NULL for not initializing the seed. Exists for the purpose of creating reproducible results.

printRes

should the estimation results be printed?

runTests

should quantile residual tests be performed after the estimation?

...

additional settings passed to the function GAfit employing the genetic algorithm.

Value

Returns an object of class 'gsmar' defining the estimated GMAR, StMAR or G-StMAR model. The returned object contains estimated mixing weights, some conditional and unconditional moments, quantile residuals, and quantile residual test results if the tests were performed. Note that the first p observations are taken as the initial values, so the mixing weights, conditional moments, and quantile residuals start from the p+1:th observation (interpreted as t=1). In addition, the returned object contains the estimates and log-likelihood values from all of the estimation rounds. The estimated parameter vector can be obtained as gsmar$params (and the corresponding approximate standard errors as gsmar$std_errors), and it has the following form.

For non-restricted models:

For GMAR model:: Size $((M(p+3)-1) \times 1)$ vector $\theta = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1})$, where $\upsilon_{m} = (\phi_{m,0}, \phi_{m}, \sigma_{m}^2)$ and $\phi_{m} = (\phi_{m,1},...,\phi_{m,p})$, $m=1,...,M$.
For StMAR model:: Size $((M(p+4)-1) \times 1)$ vector $(\theta, \nu) = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})$.
For G-StMAR model:: Size $((M(p+3)+M2-1) \times 1)$ vector $(\theta, \nu) = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1}, \nu_{M1+1},...,\nu_{M})$.
With linear constraints:: Replace the vectors $\phi_{m}$ with vectors $\psi_{m}$ and provide a list of constraint matrices C that satisfy $\phi_{m} = C_{m}\psi_{m}$ for all $m=1,...,M$, where $\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})$.

For restricted models:

For GMAR model:: Size $((3M+p-1) \times 1)$ vector $\theta = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1})$, where $\phi = (\phi_{1},...,\phi_{p})$.
For StMAR model:: Size $((4M+p-1) \times 1)$ vector $(\theta, \nu) = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})$.
For G-StMAR model:: Size $((3M+M2+p-1) \times 1)$ vector $(\theta, \nu) = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1}, \nu_{M1+1},...,\nu_{M})$.
With linear constraints:: Replace the vector $\phi$ with vector $\psi$ and provide a constraint matrix $C$ that satisfies $\phi = C\psi$, where $\psi = (\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient, $\sigma^2$ a variance, $\alpha$ a mixing weight, and $\nu$ a degrees of freedom parameter. If parametrization=="mean", just replace each intercept term $\phi_{m,0}$ with the regimewise mean $\mu_m = \phi_{m,0}/(1-\sum_{i=1}^{p}\phi_{m,i})$. In the G-StMAR model, the first M1 components are GMAR type and the remaining M2 components are StMAR type. Note that in the case M=1 the parameter $\alpha$ is dropped, and in the case of a StMAR or G-StMAR model the degrees of freedom parameters $\nu_{m}$ have to be larger than $2$.
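The vector sizes listed above can be tabulated with a small helper, which may be handy for sanity-checking the length of a parameter vector. This is a sketch only: n_params is a hypothetical function written for illustration, not part of the package, and it covers the case without linear constraints.

```r
# Length of the parameter vector implied by the sizes listed above
# (no linear constraints). n_params() is a hypothetical helper.
n_params <- function(p, M, model = c("GMAR", "StMAR", "G-StMAR"),
                     restricted = FALSE) {
  model <- match.arg(model)
  M_tot <- sum(M)  # total number of regimes; M = c(M1, M2) for G-StMAR
  # Intercepts, AR coefficients, variances, and M - 1 mixing weights.
  base <- if (restricted) 3 * M_tot + p - 1 else M_tot * (p + 3) - 1
  # Degrees of freedom parameters: none, one per regime, or one per StMAR type regime.
  dfs <- switch(model, "GMAR" = 0, "StMAR" = M_tot, "G-StMAR" = M[2])
  base + dfs
}

print(n_params(p = 4, M = 2, model = "StMAR"))
print(n_params(p = 4, M = 2, model = "StMAR", restricted = TRUE))
print(n_params(p = 4, M = c(1, 1), model = "G-StMAR"))
```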

S3 methods

The following S3 methods are supported for class 'gsmar' objects: print, summary, plot, logLik, residuals.

Suggested packages

For faster evaluation of the quantile residuals for StMAR and G-StMAR models, install the suggested package "gsl". Note that for large StMAR and G-StMAR models with large data, performing the quantile residual tests may take a significantly long time without the package "gsl".

Details

Because of the complexity and multimodality of the log-likelihood function, it's not guaranteed that the estimation algorithm will end up in the global maximum point. It's often expected that most of the estimation rounds will end up in some local maximum point instead, and therefore a number of estimation rounds is required for reliable results. Because of the nature of the models, the estimation may fail, particularly in cases where the number of mixture components is chosen too large. Note that the genetic algorithm is designed to avoid solutions with mixing weights of some regimes too close to zero at almost all times ('redundant regimes'), but the settings can be adjusted (see ?GAfit).

If the iteration limit for the variable metric algorithm (maxit) is reached, one can continue the estimation by iterating more with the function iterate_more.

The core of the genetic algorithm is mostly based on the description by Dorsey and Mayer (1995). It utilizes a slightly modified version of the individually adaptive crossover and mutation rates described by Patnaik and Srinivas (1994) and employs (50%) fitness inheritance discussed by Smith, Dike and Stegmann (1995). Large (in absolute value) but stationary AR parameter values are generated with the algorithm proposed by Monahan (1984).

The variable metric algorithm (or quasi-Newton method, Nash (1990, algorithm 21)) used in the second phase is implemented with the function optim from the package stats.
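The idea of running a variable metric algorithm from several starting values and keeping the best round can be sketched with optim's "BFGS" method on a toy problem. This is not the package's estimation routine: the objective f is a made-up multimodal function, and the random starting values merely stand in for the output of the genetic algorithm.

```r
# Toy multistart optimization: maximize a multimodal function from several
# starting values with optim()'s BFGS method and keep the best round.
f <- function(x) sin(3 * x) - 0.1 * (x - 1)^2  # hypothetical objective

set.seed(1)
starts <- runif(8, min = -4, max = 6)  # stand-in for genetic algorithm output

rounds <- lapply(starts, function(x0) {
  # fnscale = -1 makes optim() maximize; maxit caps the number of iterations.
  optim(par = x0, fn = f, method = "BFGS",
        control = list(fnscale = -1, maxit = 300))
})

# Compare the rounds and pick the one with the largest objective value.
values <- vapply(rounds, function(r) r$value, numeric(1))
best <- rounds[[which.max(values)]]

print(round(sort(values, decreasing = TRUE), 3))
print(best$par)
```

Different starting values typically end up in different local maxima, which is why several rounds are compared instead of relying on a single one.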

Some mixture components of the StMAR model may sometimes get very large estimates for the degrees of freedom parameters. Such parameters are weakly identified and induce various numerical problems. However, mixture components with large degrees of freedom parameters are similar to the mixture components of the GMAR model. It's hence advisable to further estimate a G-StMAR model by allowing the mixture components with large degrees of freedom parameter estimates to be GMAR type.
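A simple way to spot such regimes is to compare the degrees of freedom estimates against a cutoff. In the sketch below both the estimates in dfs and the cutoff max_df are assumptions chosen for illustration. No particular threshold is implied by the text above.

```r
# Hypothetical degrees of freedom estimates from a three-regime StMAR fit.
dfs <- c(4.2, 8531.7, 11.9)

# The cutoff for "very large" is an assumption chosen for illustration.
max_df <- 100
large <- dfs > max_df

# Regimes flagged as large would become GMAR type, which gives the
# M = c(M1, M2) argument for a follow-up G-StMAR model.
M_gstmar <- c(sum(large), sum(!large))

print(which(large))
print(M_gstmar)
```

Since the GMAR type components come first in the G-StMAR model, the flagged regimes would also need to be ordered first in any parameter vector built from the StMAR estimates.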

References

Dorsey R.E. and Mayer W.J. 1995. Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal of Business & Economic Statistics, 13, 53-66.
Kalliovirta L., Meitz M. and Saikkonen P. 2015. Gaussian Mixture Autoregressive model for univariate time series. Journal of Time Series Analysis, 36, 247-266.
Meitz M., Preve D. and Saikkonen P. 2018. A mixture autoregressive model based on Student's t-distribution. arXiv:1805.04010 [econ.EM].
Monahan J.F. 1984. A Note on Enforcing Stationarity in Autoregressive-Moving Average Models. Biometrika, 71, 403-404.
Nash J. 1990. Compact Numerical Methods for Computers. Linear algebra and Function Minimization. Adam Hilger.
Patnaik L.M. and Srinivas M. 1994. Adaptive Probabilities of Crossover and Mutation in Genetic Algorithms. Transactions on Systems, Man and Cybernetics, 24, 656-667.
Smith R.E., Dike B.A. and Stegmann S.A. 1995. Fitness inheritance in genetic algorithms. Proceedings of the 1995 ACM Symposium on Applied Computing, 345-350.
Virolainen S. 2020. A mixture autoregressive model based on Gaussian and Student's t-distribution. arXiv:2003.05221 [econ.EM].

Examples

# NOT RUN {
# These are long running examples that use parallel computing

# GMAR model
fit12 <- fitGSMAR(simudata, p=1, M=2, model="GMAR")
summary(fit12)
plot(fit12)
profile_logliks(fit12)

# StMAR model
fit42 <- fitGSMAR(data=T10Y1Y, p=4, M=2, model="StMAR")
fit42
summary(fit42)
plot(fit42)

# Restricted StMAR model: plot also the individual statistics with
# their approximate critical bounds using the given data
fit42r <- fitGSMAR(T10Y1Y, 4, 2, model="StMAR", restricted=TRUE)
fit42r
plot(fit42r)

# Non-mixture version of StMAR model
fit101t <- fitGSMAR(T10Y1Y, 10, 1, model="StMAR", ncores=1, ncalls=1)
diagnosticPlot(fit101t)

# G-StMAR model with one GMAR type and one StMAR type regime
fit42g <- fitGSMAR(T10Y1Y, 4, M=c(1, 1), model="G-StMAR")
diagnosticPlot(fit42g)

# GMAR model; seeds for reproducibility
fit43gm <- fitGSMAR(T10Y1Y, 4, M=3, model="GMAR", ncalls=16,
  seeds=1:16)
fit43gm

# Restricted GMAR model
fit43gmr <- fitGSMAR(T10Y1Y, 4, M=3, model="GMAR", ncalls=12,
  restricted=TRUE, seeds=1:12)
fit43gmr


# The following three examples demonstrate how to apply linear constraints
# to the AR parameters.

# Two-regime GMAR p=2 model with the second AR coefficient
# of the second regime constrained to zero.
constraints <- list(diag(1, ncol=2, nrow=2), as.matrix(c(1, 0)))
fit22c <- fitGSMAR(T10Y1Y, 2, 2, constraints=constraints)
fit22c

# Constrained StMAR(3, 1) model in which the second order AR coefficient
# is constrained to zero.
constraints <- list(matrix(c(1, 0, 0, 0, 0, 1), ncol=2))
fit31tc <- fitGSMAR(T10Y1Y, 3, 1, model="StMAR", constraints=constraints)
fit31tc

# StMAR(3, 2) model in which the AR coefficients are restricted to be
# the same for both regimes and the second AR coefficients are
# constrained to zero.
fit32rc <- fitGSMAR(T10Y1Y, 3, 2, model="StMAR", restricted=TRUE,
 constraints=matrix(c(1, 0, 0, 0, 0, 1), ncol=2))
fit32rc
# }
