This function simulates sets of time series data for fitting a multivariate GAM that includes shared seasonality and dependence on state-space latent dynamic factors. Random dependencies among series, i.e. correlations in their long-term trends, are included in the form of correlated loadings on the latent dynamic factors.
sim_mvgam(
T = 100,
n_series = 3,
seasonality = "shared",
use_lv = FALSE,
n_lv = 0,
trend_model = RW(),
drift = FALSE,
prop_trend = 0.2,
trend_rel,
freq = 12,
family = poisson(),
phi,
shape,
sigma,
nu,
mu,
prop_missing = 0,
prop_train = 0.85
)
A list object containing outputs needed for mvgam, including 'data_train' and 'data_test', as well as some additional information about the simulated seasonality and trend dependencies.
T: integer. Number of observations (timepoints).
n_series: integer. Number of discrete time series.
seasonality: character. Either "shared", meaning that all series share the exact same seasonal pattern, or "hierarchical", meaning that there is a global seasonality but each series' pattern can deviate slightly.
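The difference between the two seasonality options can be sketched in base R. This is only an illustration under stated assumptions: the sinusoidal global pattern, the deviation standard deviation of 0.1 and all object names are hypothetical, and the internal simulation used by sim_mvgam may differ.

```r
set.seed(1)
freq     <- 12   # seasonal frequency, as in the freq argument
n_series <- 3    # number of series, as in the n_series argument

# Assumed global seasonal pattern (one value per season)
global_season <- sin(2 * pi * seq_len(freq) / freq)

# "shared": every series gets exactly the same seasonal pattern
shared_season <- matrix(global_season, nrow = freq, ncol = n_series)

# "hierarchical": each series deviates slightly from the global pattern
# (the deviation sd of 0.1 is an arbitrary choice for illustration)
deviations <- matrix(rnorm(freq * n_series, mean = 0, sd = 0.1), nrow = freq)
hierarchical_season <- shared_season + deviations

round(head(cbind(shared_season[, 1], hierarchical_season)), 2)
```

Each column is one series. Under "shared" the columns are identical; under "hierarchical" they follow the same overall shape with small series-specific departures.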
use_lv: logical. If TRUE, use dynamic factors to simulate the series' latent trends in a reduced-dimension format. If FALSE, simulate independent latent trends for each series.
n_lv: integer. Number of latent dynamic factors for generating the series' trends. Defaults to 0, meaning that dynamics are simulated independently for each series.
trend_model: character specifying the time series dynamics for the latent trend. Options are:
None (no latent trend component; i.e. the GAM component is all that contributes to the linear predictor, and the observation process is the only source of error; similar to what is estimated by gam)
RW (random walk with possible drift)
AR1 (with possible drift)
AR2 (with possible drift)
AR3 (with possible drift)
VAR1 (contemporaneously uncorrelated VAR1)
VAR1cor (contemporaneously correlated VAR1)
GP (Gaussian Process with squared exponential kernel)
See mvgam_trends for more details.
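To make the first two dynamic options concrete, the following base R sketch simulates a random walk with drift and an AR1 process with drift. It illustrates the processes named above rather than the package's own code; the drift value, AR1 coefficient, error standard deviation and object names are all assumptions chosen for illustration.

```r
set.seed(1)
n_time <- 100   # number of timepoints (plays the role of T)
drift  <- 0.1   # hypothetical drift term
ar1    <- 0.7   # hypothetical AR1 coefficient
errors <- rnorm(n_time, mean = 0, sd = 0.5)

# Random walk with drift: x[t] = drift + x[t - 1] + error[t]
rw_trend <- cumsum(drift + errors)

# AR1 with drift: x[t] = drift + ar1 * x[t - 1] + error[t]
ar1_trend <- numeric(n_time)
ar1_trend[1] <- drift + errors[1]
for (t in 2:n_time) {
  ar1_trend[t] <- drift + ar1 * ar1_trend[t - 1] + errors[t]
}

summary(cbind(rw_trend, ar1_trend))
```

With an AR1 coefficient below 1 in absolute value, the AR1 series settles around a stable level, whereas the random walk accumulates the drift and errors without reverting.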
drift: logical. If TRUE, simulate a drift term for each trend.
prop_trend: numeric. Relative importance of the trend for each series. Should be between 0 and 1.
trend_rel: Deprecated. Use prop_trend instead.
freq: integer. The seasonal frequency of the series.
family: family specifying the exponential observation family for the series. Currently supported families are: nb(), poisson(), bernoulli(), tweedie(), gaussian(), betar(), lognormal(), student() and Gamma().
phi: vector of dispersion parameters for the series (i.e. size for nb() or phi for betar()). If length(phi) < n_series, the first element of phi will be replicated n_series times. Defaults to 5 for nb() and tweedie(); 10 for betar().
shape: vector of shape parameters for the series (i.e. shape for Gamma()). If length(shape) < n_series, the first element of shape will be replicated n_series times. Defaults to 10.
sigma: vector of scale parameters for the series (i.e. sd for gaussian() or student(), log(sd) for lognormal()). If length(sigma) < n_series, the first element of sigma will be replicated n_series times. Defaults to 0.5 for gaussian() and student(); 0.2 for lognormal().
nu: vector of degrees of freedom parameters for the series (i.e. nu for student()). If length(nu) < n_series, the first element of nu will be replicated n_series times. Defaults to 3.
mu: vector of location parameters for the series. If length(mu) < n_series, the first element of mu will be replicated n_series times. Defaults to small random values between -0.5 and 0.5 on the link scale.
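The replication rule stated for the family parameters (phi, shape, sigma, nu and mu) can be sketched in base R as follows. The helper recycle_param is hypothetical and not part of mvgam; it only mirrors the rule as documented, where a vector shorter than n_series is replaced by its first element repeated n_series times.

```r
# Hypothetical helper (not an mvgam function) mirroring the documented rule
recycle_param <- function(x, n_series) {
  if (length(x) < n_series) {
    # Too short: keep only the first element and repeat it for every series
    rep(x[1], n_series)
  } else {
    x
  }
}

recycle_param(c(5, 20), n_series = 3)      # 5 5 5 (the 20 is discarded)
recycle_param(c(5, 10, 20), n_series = 3)  # 5 10 20 (used as supplied)
```

Note that elements after the first are dropped when the vector is too short, so series-specific values only take effect when one value per series is supplied.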
prop_missing: numeric stating the proportion of observations that are missing. Should be between 0 and 0.8, inclusive.
prop_train: numeric stating the proportion of data to use for training. Should be between 0.2 and 1.
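As a rough picture of what prop_missing and prop_train control, the base R sketch below blanks out a proportion of observations and then splits a single series into training and testing sets. It assumes the split is chronological (the earliest timepoints form the training set) and that missing values are placed at random; both are assumptions made for illustration, and the toy data frame and its column names are hypothetical.

```r
set.seed(1)
n_time       <- 100
prop_missing <- 0.1
prop_train   <- 0.85

# Toy single-series data set (column names are illustrative only)
dat <- data.frame(time = seq_len(n_time), y = rpois(n_time, lambda = 5))

# Set a random subset of observations to NA
n_missing <- floor(n_time * prop_missing)
dat$y[sample(n_time, n_missing)] <- NA

# Assumed chronological split: earliest timepoints are used for training
last_train <- floor(n_time * prop_train)
data_train <- dat[dat$time <= last_train, ]
data_test  <- dat[dat$time >  last_train, ]

c(train = nrow(data_train), test = nrow(data_test), missing = sum(is.na(dat$y)))
```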
library(mvgam)

# Simulate series with observations bounded at 0 and 1 (Beta responses)
sim_data <- sim_mvgam(family = betar(), trend_model = RW(), prop_trend = 0.6)
plot_mvgam_series(data = sim_data$data_train, series = 'all')
# Now simulate series with overdispersed discrete observations
sim_data <- sim_mvgam(family = nb(), trend_model = RW(), prop_trend = 0.6, phi = 10)
plot_mvgam_series(data = sim_data$data_train, series = 'all')
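A further example, offered as a sketch rather than an official one, combines hierarchical seasonality with missing observations. It uses only arguments described above with the default poisson() family; the specific values (T = 60, n_series = 4, prop_missing = 0.2) are arbitrary choices for illustration.

```r
library(mvgam)

# Simulate four count series with hierarchical seasonality and 20% missing values
set.seed(1)
sim_data <- sim_mvgam(
  T = 60,
  n_series = 4,
  seasonality = "hierarchical",
  trend_model = RW(),
  prop_missing = 0.2
)

# The returned list includes the training and testing data
names(sim_data)

# Count missing values per column in the training data
colSums(is.na(sim_data$data_train))

plot_mvgam_series(data = sim_data$data_train, series = 'all')
```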