jaatha (version 3.2.5)

create_jaatha_stat: Create a summary statistic for Jaatha

Description

This function creates summary statistics for Jaatha models. A summary statistic consists primarily of a function that calculates the statistic from the simulation results. Jaatha primarily supports Poisson distributed summary statistics, but can also transform summary statistics that follow a different distribution into approximately Poisson distributed statistics.

Usage

create_jaatha_stat(name, calc_func, poisson = TRUE, breaks = c(0.1, 0.5, 0.9))

Value

The summary statistic. Intended for use with create_jaatha_model.

Arguments

name

The name of the summary statistic

calc_func

The function that summarizes the simulation data. Must take two arguments. The first is the simulated data; the second holds options that can be calculated from the real data. Ignoring the second argument in the function body should be fine in most situations. The function must return a numeric vector if poisson = TRUE, and can also return a numeric matrix if poisson = FALSE.
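As an illustration of the two-argument contract described above, here is a minimal sketch of a calc_func. The names sim_data, opts, count_stat and the statistic name "counts" are hypothetical, and the simulated data is assumed to be a plain vector of counts, which will not match every simulation function. The call to create_jaatha_stat is only attempted if the jaatha package is installed.

```r
# Hypothetical simulated data: 10 Poisson counts (for illustration only).
set.seed(1)
sim_data <- rpois(10, lambda = 5)

# calc_func takes two arguments: the simulated data and an options object.
# The second argument is ignored here, which should be fine in most cases.
count_stat <- function(sim_data, opts) {
  as.numeric(sim_data)  # a numeric vector, as required when poisson = TRUE
}

# Calling the function directly shows what Jaatha would receive.
stat_values <- count_stat(sim_data, NULL)

# Create the summary statistic only if jaatha is available.
if (requireNamespace("jaatha", quietly = TRUE)) {
  stat <- jaatha::create_jaatha_stat("counts", count_stat, poisson = TRUE)
}
```

Testing the function on its own before passing it to create_jaatha_stat makes it easy to confirm that it returns a numeric vector.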

poisson

If TRUE, it is assumed that the summary statistic values are (at least approximately) independent and Poisson distributed. If it is set to FALSE, the statistic is transformed into an approximately Poisson distributed array using a binning approach. See "Transformation of non Poisson distributed statistics" for details. If any summary statistic is only approximately Poisson distributed, Jaatha is a composite-likelihood method.

breaks

The probabilities for the quantiles that are used for binning the data. See the section on non Poisson distributed summary statistics for details.

Transformation of non Poisson distributed statistics

To transform a statistic into approximately Poisson distributed values, we first calculate the empirical quantiles of the real data for the probabilities given in breaks. These are used as break points for dividing the range of the statistic into disjoint intervals. We then count how many of the values for the simulated data fall into each interval, and use these counts as the summary statistic. The counts are multinomially distributed, and should be close to the required Poisson distribution in most cases.
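The binning idea can be sketched in base R as follows. This is an illustration of the approach described above, not the package's internal implementation. The data (real_values, sim_values) is made up, and the handling of values that fall exactly on a break point is an assumption.

```r
set.seed(42)
real_values <- rnorm(200, mean = 0, sd = 1)    # hypothetical real statistic
sim_values  <- rnorm(200, mean = 0.3, sd = 1)  # hypothetical simulated statistic

breaks <- c(0.1, 0.5, 0.9)  # the default probabilities from the usage line

# Empirical quantiles of the real data serve as break points.
break_points <- quantile(real_values, probs = breaks, names = FALSE)

# findInterval returns 0 for values below the first break point and
# length(break_points) for values at or above the last one.
bin_index <- findInterval(sim_values, break_points)

# Count the simulated values per interval; tabulate keeps empty bins as 0.
bin_counts <- tabulate(bin_index + 1, nbins = length(break_points) + 1)
```

With three probabilities in breaks, the range is split into four intervals, so bin_counts has four entries that sum to the number of simulated values.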