This function simulates sets of time series data for fitting a multivariate GAM that includes shared seasonality and dependence on state-space latent dynamic factors. Random dependencies among series, i.e. correlations in their long-term trends, are included in the form of correlated loadings on the latent dynamic factors.
sim_mvgam(
T = 100,
n_series = 3,
seasonality = "shared",
use_lv = FALSE,
n_lv = 0,
trend_model = RW(),
drift = FALSE,
prop_trend = 0.2,
trend_rel,
freq = 12,
family = poisson(),
phi,
shape,
sigma,
nu,
mu,
prop_missing = 0,
prop_train = 0.85
)
A list object containing outputs needed for mvgam, including 'data_train' and 'data_test', as well as some additional information about the simulated seasonality and trend dependencies
T: integer. Number of observations (timepoints)
n_series: integer. Number of discrete time series
seasonality: character. Either "shared", meaning that all series share the exact same seasonal pattern, or "hierarchical", meaning that there is a global seasonality but each series' pattern can deviate slightly
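The difference between the two seasonality options can be pictured with a small base R sketch. This is not the code used inside sim_mvgam: the sinusoidal shape, the deviation size (sd = 0.2) and all variable names are assumptions chosen purely for illustration.

```r
set.seed(1)
n_series <- 3
freq <- 12
season_idx <- seq_len(freq)

# One global seasonal pattern (assumed sinusoidal for this sketch)
global_season <- sin(2 * pi * season_idx / freq)

# "shared": every series (column) uses the identical pattern
shared <- matrix(rep(global_season, n_series), nrow = freq, ncol = n_series)

# "hierarchical": each series deviates slightly from the global pattern
hierarchical <- shared +
  matrix(rnorm(freq * n_series, sd = 0.2), nrow = freq, ncol = n_series)

# Shared columns are identical; hierarchical columns are not
identical(shared[, 1], shared[, 2])
identical(hierarchical[, 1], hierarchical[, 2])
```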
use_lv: logical. If TRUE, use dynamic factors to estimate series' latent trends in a reduced dimension format. If FALSE, estimate independent latent trends for each series
n_lv: integer. Number of latent dynamic factors for generating the series' trends. Defaults to 0, meaning that dynamics are estimated independently for each series
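As a rough illustration of the reduced dimension idea, the base R sketch below builds series trends as weighted sums of a smaller number of latent factors. It is a simplification under stated assumptions, not the simulation code used by sim_mvgam: the factors are plain random walks, the loadings are drawn independently from a standard normal, and all variable names are hypothetical.

```r
set.seed(2)
n_timepoints <- 100
n_series <- 3
n_lv <- 2

# Latent dynamic factors: n_lv independent random walks (n_timepoints x n_lv)
factors <- apply(
  matrix(rnorm(n_timepoints * n_lv), nrow = n_timepoints, ncol = n_lv),
  2, cumsum
)

# Loadings map factors to series (n_lv x n_series); arbitrary values here
loadings <- matrix(rnorm(n_lv * n_series), nrow = n_lv, ncol = n_series)

# Each series' trend is a weighted sum of the shared factors
trends <- factors %*% loadings

# Series that load on the same factors generally have correlated trends
round(cor(trends), 2)
```

Because three trends are generated from only two factors, the trend matrix has rank 2, which is what "reduced dimension" refers to in this sketch.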
trend_model: character specifying the time series dynamics for the latent trend. Options are:
- None (no latent trend component; i.e. the GAM component is all that contributes to the linear predictor, and the observation process is the only source of error; similar to what is estimated by gam)
- RW (random walk with possible drift)
- AR1 (with possible drift)
- AR2 (with possible drift)
- AR3 (with possible drift)
- VAR1 (contemporaneously uncorrelated VAR1)
- VAR1cor (contemporaneously correlated VAR1)
- GP (Gaussian Process with squared exponential kernel)
See mvgam_trends for more details
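To make "with possible drift" concrete, the following base R sketch simulates a random walk and an AR1 process using generic textbook recursions. The drift value, the AR1 coefficient and the unit error variance are assumptions for illustration, and the exact parameterisation used inside sim_mvgam may differ.

```r
set.seed(3)
n_timepoints <- 100
drift <- 0.1   # assumed drift value
ar1 <- 0.7     # assumed AR1 coefficient
errors <- rnorm(n_timepoints)

# Random walk with drift: x[t] = drift + x[t - 1] + error[t]
rw <- cumsum(drift + errors)

# AR1 with drift: x[t] = drift + ar1 * x[t - 1] + error[t]
ar <- numeric(n_timepoints)
ar[1] <- drift + errors[1]
for (t in 2:n_timepoints) {
  ar[t] <- drift + ar1 * ar[t - 1] + errors[t]
}
```

With |ar1| < 1 the AR1 series fluctuates around a stable level, whereas the random walk wanders without reverting.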
drift: logical. If TRUE, simulate a drift term for each trend
prop_trend: numeric. Relative importance of the trend for each series. Should be between 0 and 1
trend_rel: Deprecated. Use prop_trend instead
freq: integer. The seasonal frequency of the series
family: family specifying the exponential observation family for the series. Currently supported families are: nb(), poisson(), bernoulli(), tweedie(), gaussian(), betar(), lognormal(), student() and Gamma()
phi: vector of dispersion parameters for the series (i.e. size for nb() or phi for betar()). If length(phi) < n_series, the first element of phi will be replicated n_series times. Defaults to 5 for nb() and tweedie(); 10 for betar()
shape: vector of shape parameters for the series (i.e. shape for Gamma()). If length(shape) < n_series, the first element of shape will be replicated n_series times. Defaults to 10
sigma: vector of scale parameters for the series (i.e. sd for gaussian() or student(), log(sd) for lognormal()). If length(sigma) < n_series, the first element of sigma will be replicated n_series times. Defaults to 0.5 for gaussian() and student(); 0.2 for lognormal()
nu: vector of degrees of freedom parameters for the series (i.e. nu for student()). If length(nu) < n_series, the first element of nu will be replicated n_series times. Defaults to 3
mu: vector of location parameters for the series. If length(mu) < n_series, the first element of mu will be replicated n_series times. Defaults to small random values between -0.5 and 0.5 on the link scale
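The recycling rule shared by phi, shape, sigma, nu and mu can be sketched in base R as follows. The helper recycle_param is hypothetical and not part of mvgam. Returning the input unchanged when enough values are supplied is also an assumption, since the rule above only covers the case where too few values are given.

```r
# Hypothetical helper, not part of mvgam: when fewer values than series are
# supplied, replicate the first element n_series times
recycle_param <- function(x, n_series) {
  if (length(x) < n_series) {
    rep(x[1], n_series)
  } else {
    x
  }
}

recycle_param(c(10, 20), n_series = 3)   # 10 10 10
recycle_param(c(1, 2, 3), n_series = 3)  # 1 2 3
```

Note that under this rule only the first element is used when the vector is too short, so the 20 in the first call is discarded rather than recycled.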
prop_missing: numeric stating proportion of observations that are missing. Should be between 0 and 0.8, inclusive
prop_train: numeric stating the proportion of data to use for training. Should be between 0.2 and 1
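The effect of prop_missing and prop_train on a single series can be sketched in base R. This is an illustration under assumptions rather than the code used by sim_mvgam: the use of floor() for rounding, the uniformly random placement of missing values and all variable names are choices made for this sketch.

```r
set.seed(4)
n_timepoints <- 100
prop_missing <- 0.1
prop_train <- 0.85

# A stand-in Poisson series
y <- rpois(n_timepoints, lambda = 5)

# Set a random prop_missing share of the observations to NA
n_missing <- floor(n_timepoints * prop_missing)
y[sample(n_timepoints, n_missing)] <- NA

# Time-ordered split: the earliest timepoints train, the remainder test
n_train <- floor(n_timepoints * prop_train)
y_train <- y[seq_len(n_train)]
y_test <- y[-seq_len(n_train)]

length(y_train)  # 85
length(y_test)   # 15
sum(is.na(y))    # 10
```

The split is by time rather than at random, so the test set always holds the most recent observations, as is usual when evaluating forecasts.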
# Simulate series with observations bounded at 0 and 1 (Beta responses)
sim_data <- sim_mvgam(family = betar(), trend_model = RW(), prop_trend = 0.6)
plot_mvgam_series(data = sim_data$data_train, series = 'all')
# Now simulate series with overdispersed discrete observations
sim_data <- sim_mvgam(family = nb(), trend_model = RW(), prop_trend = 0.6, phi = 10)
plot_mvgam_series(data = sim_data$data_train, series = 'all')