optDesign: Find the next optimal design point for simulation-based inference

Description

optDesign takes an object of class simll and finds the next design point at which a simulation should be carried out for approximately best efficiency in metamodel-based inference. See Park (2025) for more details on this method.

Usage

# S3 method for simll
optDesign(
  simll,
  init = NULL,
  weight = 1,
  autoAdjust = TRUE,
  refgap = Inf,
  refgap_for_comp = NULL,
  ...
)

Value

A list containing the following entries.

par: a proposal for the next simulation point.
logSTV: the logarithm of the approximate scaled total variation (STV) evaluated at the proposed simulation point.
wadj_new: the adjusted weight for the newly proposed simulation point.
Wadj: the vector of all adjusted weights for the existing simulation points.
refgap: the tuned value of g for weight adjustments.
logSTV_for_comp: when refgap_for_comp is not NULL, log(STV) is evaluated using the provided value of refgap_for_comp and reported as logSTV_for_comp.

Arguments

simll: A class simll object, containing simulation log likelihoods, the parameter values at which simulations are made, and the weights for those simulations for regression (optional). See help(simll).
init: (optional) An initial parameter vector at which the search for the optimal point starts.
weight: (optional) A positive real number indicating the user-assigned weight for the new design point. The default value is 1. This value should be chosen relative to the weights in the provided simll object.
autoAdjust: logical. If TRUE, simulation points at which the third order term is statistically significant in the cubic approximation to the simulated log-likelihoods have discounted weights for metamodel fitting. The weights of points relatively far from the estimated MESLE are more heavily discounted. These weight discount factors are multiplied by the originally given weights for parameter estimation. See Park (2025) for more details. If autoAdjust is FALSE, the weight discount step is skipped. Defaults to TRUE.
refgap: A positive real number that determines the weight discount factor for the significance of the third order term in Taylor approximation. The weight of a point theta is discounted by a factor of exp(-(qa(theta)-qa(MESLEhat))/refgap), where MESLEhat is the estimated MESLE and qa is the quadratic approximation to the simulated log-likelihoods. If autoAdjust is TRUE, refgap is interpreted as the initial value for the tuning algorithm. If autoAdjust is FALSE, refgap is used for weight adjustments without further tuning. The default value is Inf.
refgap_for_comp: (optional) A value of refgap with which to compute the log(STV) to be reported at the end. A potential use for this argument is to compare log(STV) values across iterative applications of this function, as the reported logSTV value can vary significantly depending on the tuned value of refgap.
...: Other optional arguments, not currently used.

Details

This is a generic function that takes an object of class simll as its first argument. Parameter inference for implicitly defined simulation models can be carried out under a metamodel for the distribution of the log-likelihood estimator. See function ht for hypothesis testing and ci for confidence interval construction for a one-dimensional parameter. This function, optDesign, proposes the next point at which a simulation should be carried out so that the variance of the parameter estimate is reduced approximately the most. To balance efficiency and accuracy, the point is selected as far as possible from the current estimate of the parameter while ensuring that the quadratic approximation to the simulated log-likelihoods remains valid. Specifically, the weights for the existing simulation points are adjusted so that the third order term in a cubic approximation is statistically insignificant. The weight discount factor for point theta is given by exp(-(qa(theta)-qa(MESLEhat))/g), where qa is the quadratic approximation, MESLEhat is the estimated MESLE, and g is a scaling parameter (see the refgap argument). These discount factors are multiplied by the original weights given to the simulation points specified in the simll object. Moreover, to ensure that the cubic regression can be carried out without numerical issues, g is not allowed to fall below the value at which the effective sample size (ESS) would drop below (d+1)(d+2)(d+3)/6, the total number of parameters estimated in the cubic regression, where d is the parameter dimension. Here ESS is calculated as (sum of adjusted weights)^2/(sum of squared adjusted weights).
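The ESS guard can be illustrated with a small base R sketch. It is not the package's internal code: the simulation points, the weights, and the fitted quadratic qa are all made up for illustration. It also assumes that the discount factor is at most one, that is, that the exponent is minus the nonnegative gap qa(MESLEhat) - qa(theta) divided by g, so that points far from the estimated MESLE lose weight as g shrinks.

```r
# Effective sample size as defined in the text.
ess <- function(w) sum(w)^2 / sum(w^2)

# Number of coefficients in a full cubic polynomial in d variables.
n_cubic_coef <- function(d) (d + 1) * (d + 2) * (d + 3) / 6

d <- 1
theta <- seq(-2, 2, length.out = 21)   # hypothetical existing simulation points
w <- rep(1, length(theta))             # hypothetical original weights

# Hypothetical fitted quadratic qa(theta) = b*theta + c*theta^2 with c < 0.
b_hat <- 0.4
c_hat <- -1
qa <- function(th) b_hat * th + c_hat * th^2
mesle_hat <- -b_hat / (2 * c_hat)      # maximizer of qa

# Assumption: the discount uses the nonnegative gap qa(MESLEhat) - qa(theta),
# so every factor is at most one and g = Inf leaves the weights unchanged.
adjusted_weights <- function(g) {
  gap <- qa(mesle_hat) - qa(theta)
  w * exp(-gap / g)
}

# Smaller g concentrates the weight near the MESLE estimate and lowers the ESS.
for (g in c(Inf, 4, 1, 0.05)) {
  e <- ess(adjusted_weights(g))
  cat(sprintf("g = %5.2f  ESS = %6.2f  at or above floor of %g: %s\n",
              g, e, n_cubic_coef(d), e >= n_cubic_coef(d)))
}
```

In this sketch, a tuning procedure would stop decreasing g once the ESS reaches the floor, which is 4 for d = 1 and 10 for d = 2.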

The next simulation point is selected by approximately minimizing the scaled total Monte Carlo variation of the parameter estimate. The scaled total variation (STV) is defined as the trace of c_hat^{-1} V, where c_hat is the quadratic coefficient matrix of the fitted quadratic polynomial and V is an approximate Monte Carlo variance of the estimate of the MESLE. The MESLE estimate is given by -(1/2) * c_hat^{-1} b_hat, where b_hat is the linear coefficient vector of the fitted quadratic polynomial. The optimization is carried out using the BFGS algorithm via the optim function. See Park (2025) for more details.
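As a sanity check on the expression -(1/2) * c_hat^{-1} b_hat, the base R sketch below compares it with a numerical maximizer found by optim with method = "BFGS". The coefficients a_hat, b_hat and c_hat are hypothetical values for d = 2, and the quadratic is assumed to be written as a + b'theta + theta' c theta with symmetric, negative definite c. The objective here is the quadratic itself, not the STV criterion that optDesign minimizes.

```r
# Hypothetical fitted coefficients for d = 2 (not output from any package).
a_hat <- 0
b_hat <- c(1, -0.5)
c_hat <- matrix(c(-2, 0.5, 0.5, -1), nrow = 2)  # symmetric, negative definite

# Quadratic approximation qa(theta) = a + b'theta + theta' c theta.
qa <- function(th) a_hat + sum(b_hat * th) + drop(t(th) %*% c_hat %*% th)

# Closed form: setting the gradient b + 2 c theta to zero.
mesle_closed <- -0.5 * solve(c_hat, b_hat)

# Numerical check: optim minimizes, so pass the negated quadratic.
fit <- optim(par = c(0, 0), fn = function(th) -qa(th), method = "BFGS")
mesle_numeric <- fit$par

print(rbind(closed_form = mesle_closed, bfgs = mesle_numeric))
```

The two rows should agree up to the optimizer's tolerance.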

References

Park, J. (2025). Scalable simulation-based inference for implicitly defined models using a metamodel for Monte Carlo log-likelihood estimator. doi:10.48550/arxiv.2311.09446