compareNCestimates: Conditional Sequential Monte Carlo Examples

Description

The compareNCestimates function generates a Monte Carlo study to compare log-likelihood (normalizing constant) estimates in the standard linear Gaussian state space (LGSS) model: Kalman filter estimates, as the benchmark, are compared to the standard bootstrap particle filter and the conditional bootstrap particle filter estimates (see Details).

The simGaussianSSM function simulates data from a LGSS model (can be used manually to simulate data or runs as a default, if no data is provided, with a default parameter setup, see parameters).

The kalmanFFBS function runs a Kalman (exact) forward filter, computes a log-likelihood estimate and generates a joint smoothing state trajectory via a backward simulation pass.

Usage

compareNCestimates(dataY,
                     trueStates = NULL,
                     numParticles = 1000L,
                     simNumber = 100L,
                     modelParameters = list(stateInit = 0,
                                            phi = 0.7,
                                            varStateEvol = 1,
                                            varObs = 1),
                     plot = FALSE)
  simGaussianSSM(len = 100,
                 stateInit = 0,
                 phi = 0.7,
                 varStateEvol = 1,
                 varObs = 1)
  kalmanFFBS(dataY,
             stateInit,
             phi,
             varStateEvol,
             varObs,
             simNumber)

Value

compareNCestimates returns a named list of two

outSMC a named list of two:
- smcOut: a matrix of dimension simNum x 4 which contains single log-likelihood estimates of the standard BPF for each of the 4 resampling schemes and for each simulation run
- csmcOut: a matrix of dimension simNum x 4 which contains single log-likelihood estimates of the conditional BPF for each of the 4 resampling schemes and for each simulation run
outKalman the output of kalmanFFBS, see below

kalmanFFBS returns a named list of two:

logLikeliEstim: (exact) estimate of the log-likelihood
xBackwardSimul: a backward simulation (joint smoothing) trajectory of latent states given parameters and measurement

Arguments

dataY

A one-column matrix or dataframe or vector containing measurements (y values) from a standard linear Gaussian SSM. If not provided, defaults to a LGSS model with time series lenght len=250 and parameter setup specified with default values in the parameters argument, see simGaussianSSM or compareNCestimates.

trueStates

defaults to NULL for a real dataset as the true state values are not observed; for simulated data, these can be passed and then will alse be plotted if plot=TRUE

numParticles

An integer specifying the number of particles.

simNumber

An integer specifying the number of repeated simulation runs of each of which produces 2x4=8 normalizing constant esimtates: BPF and conditional BPF esimates under four conditional resampling schemes, as well as a ground truth Kalman forward filter estimate and a backward filter output required for the reference trajectory of the conditional sampler.

modelParameters

a named list of parameters of the LGSSM model in the following order:

phi: autoregressive parameter
stateInit: initial state value (i.e. $X_0$)
varStateEvol: state process variance
varObs: measurement/observation process variance

These parameters are used to for the Kalman forward filtering and backward simulation pass, and, if no data argument is provided, to simulate data from a linear Gaussian state space model internally via simGaussianSSM.

phi

autoregressive parameter

stateInit

initial state value (i.e. $X_0$)

varStateEvol

state process variance

varObs

measurement/observation process variance

plot

A boolean variable describing whether plot should illustrate the estimated results along with the data.

len

Length of data series to simulate.

Author

Adam M. Johansen, Dirk Eddelbuettel, Leah South and Ilya Zarubin

Details

compareNCestimates runs a simulation study that provides log-likelihood (normalizing constant) estimates; there are simNumber runs of the standard BPF and the conditional BPF under four resampling schmes:

multinomial
stratified
systematic
residual

The "ground truth" Kalman forward filter estimate of the normalizing constant is compared to the BPF normalizing constant estimates, which are unbiased for all above schemes; specifically the conditional BPF estimate is unbiased if the reference trajectory is simulated from the target distribution which is obtained here as a backward simulation run of the Kalman filter.

Box plots illustrate the unbiasedness of standard BPF and conditional BPF estimates for the normailizing constant estimate in the linear Gaussian SSM, and serve as an small example for to illustrate conditional SMC algorithms (in their most basic BPF version) with different conditonal resampling schemes as implemented within RcppSMC.

simGaussianSSM simulates from a Linear Gaussian state space model of the following form: $$x_t=\phi x_{t-1} + u_t$$ $$y_t= x_{t} + w_t$$ where $\phi$ is set via the phi argument, $u_t \sim N(0, \sigma_x^2)$, $w_t \sim N(0, \sigma_y^2)$ for which the innovation ($\sigma_x^2$) and measurement ($\sigma_y^2$) variances are set via arguments varStateEvol and varObs, respectively.

References

A. M. Johansen. SMCTC: Sequential Monte Carlo in C++. Journal of Statistical Software, 30(6):1-41, April 2009. https://www.jstatsoft.org/article/view/v030i06