control.stergm: Auxiliary for Controlling Separable Temporal ERGM Fitting

Description

Auxiliary function as user interface for fine-tuning 'stergm' fitting.

Usage

control.stergm(
  init.form = NULL,
  init.diss = NULL,
  init.method = NULL,
  force.main = FALSE,
  MCMC.prop.form = ~discord + sparse,
  MCMC.prop.diss = ~discord + sparse,
  MCMC.prop.weights.form = "default",
  MCMC.prop.args.form = NULL,
  MCMC.prop.weights.diss = "default",
  MCMC.prop.args.diss = NULL,
  MCMC.maxedges = Inf,
  MCMC.maxchanges = 1e+06,
  MCMC.packagenames = c(),
  CMLE.MCMC.burnin = 1024 * 16,
  CMLE.MCMC.interval = 1024,
  CMLE.ergm = NULL,
  CMLE.form.ergm = control.ergm(init = init.form, MCMC.burnin = CMLE.MCMC.burnin,
    MCMC.interval = CMLE.MCMC.interval, MCMC.prop = MCMC.prop.form, MCMC.prop.weights =
    MCMC.prop.weights.form, MCMC.prop.args = MCMC.prop.args.form, MCMC.maxedges =
    MCMC.maxedges, MCMC.packagenames = MCMC.packagenames, parallel = parallel,
    parallel.type = parallel.type, parallel.version.check = parallel.version.check,
    parallel.inherit.MT = parallel.inherit.MT, force.main = force.main),
  CMLE.diss.ergm = control.ergm(init = init.diss, MCMC.burnin = CMLE.MCMC.burnin,
    MCMC.interval = CMLE.MCMC.interval, MCMC.prop = MCMC.prop.diss, MCMC.prop.weights =
    MCMC.prop.weights.diss, MCMC.prop.args = MCMC.prop.args.diss, MCMC.maxedges =
    MCMC.maxedges, MCMC.packagenames = MCMC.packagenames, parallel = parallel,
    parallel.type = parallel.type, parallel.version.check = parallel.version.check,
    parallel.inherit.MT = parallel.inherit.MT, force.main = force.main),
  CMLE.NA.impute = c(),
  CMLE.term.check.override = FALSE,
  EGMME.main.method = c("Gradient-Descent"),
  EGMME.initialfit.control = control.ergm(),
  EGMME.MCMC.burnin.min = 1000,
  EGMME.MCMC.burnin.max = 1e+05,
  EGMME.MCMC.burnin.pval = 0.5,
  EGMME.MCMC.burnin.add = 1,
  MCMC.burnin = NULL,
  MCMC.burnin.mul = NULL,
  SAN.maxit = 4,
  SAN.nsteps.times = 8,
  SAN = control.san(term.options = term.options, SAN.maxit = SAN.maxit, SAN.prop =
    MCMC.prop.form, SAN.prop.weights = MCMC.prop.weights.form, SAN.prop.args =
    MCMC.prop.args.form, SAN.nsteps = round(sqrt(EGMME.MCMC.burnin.min *
    EGMME.MCMC.burnin.max)) * SAN.nsteps.times, SAN.packagenames = MCMC.packagenames,
    parallel = parallel, parallel.type = parallel.type, parallel.version.check =
    parallel.version.check, parallel.inherit.MT = FALSE),
  SA.restarts = 10,
  SA.burnin = 1000,
  SA.plot.progress = FALSE,
  SA.max.plot.points = 400,
  SA.plot.stats = FALSE,
  SA.init.gain = 0.1,
  SA.gain.decay = 0.5,
  SA.runlength = 25,
  SA.interval.mul = 2,
  SA.init.interval = 500,
  SA.min.interval = 20,
  SA.max.interval = 500,
  SA.phase1.minruns = 4,
  SA.phase1.tries = 20,
  SA.phase1.jitter = 0.1,
  SA.phase1.max.q = 0.1,
  SA.phase1.backoff.rat = 1.05,
  SA.phase2.levels.max = 40,
  SA.phase2.levels.min = 4,
  SA.phase2.max.mc.se = 0.001,
  SA.phase2.repeats = 400,
  SA.stepdown.maxn = 200,
  SA.stepdown.p = 0.05,
  SA.stop.p = 0.1,
  SA.stepdown.ct = 5,
  SA.phase2.backoff.rat = 1.1,
  SA.keep.oh = 0.5,
  SA.keep.min.runs = 8,
  SA.keep.min = 0,
  SA.phase2.jitter.mul = 0.2,
  SA.phase2.maxreljump = 4,
  SA.guard.mul = 4,
  SA.par.eff.pow = 1,
  SA.robust = FALSE,
  SA.oh.memory = 1e+05,
  SA.refine = c("mean", "linear", "none"),
  SA.se = TRUE,
  SA.phase3.samplesize.runs = 10,
  SA.restart.on.err = TRUE,
  term.options = NULL,
  seed = NULL,
  parallel = 0,
  parallel.type = NULL,
  parallel.version.check = TRUE,
  parallel.inherit.MT = FALSE,
  ...
)

Value

A list with arguments as components.

Arguments

init.form, init.diss

numeric or NA vector equal in length to the number of parameters in the formation/dissolution model or NULL (the default); the initial values for the estimation and coefficient offset terms. If NULL is passed, all of the initial values are computed using the method specified by control$init.method. If a numeric vector is given, the elements of the vector are interpreted as follows:

Elements corresponding to terms enclosed in offset() are used as the fixed offset coefficients. These should match the offset values given in offset.coef.form and offset.coef.diss.
Elements that do not correspond to offset terms and are not NA are used as starting values in the estimation.
Initial values for the elements that are NA are fit using the method specified by control$init.method.

Passing coefficients from a previous run can be used to "resume" an uncoverged stergm() run.

init.method

Estimation method used to acquire initial values for estimation. If NULL (the default), the initial values are computed using the edges dissolution approximation (Carnegie et al.) when appropriate; note that this relies on .extract.fd.formulae() to identify the formation and dissolution parts of the formula; the user should be aware of its behavior and limitations. If init.method is set to "zeros", the initial values are set to zeros.

force.main

Logical: If TRUE, then force MCMC-based estimation method, even if the exact MLE can be computed via maximum pseudolikelihood estimation.

MCMC.prop.form

Hints and/or constraints for selecting and initializing the proposal.

MCMC.prop.weights.form

Specifies the proposal weighting to use.

MCMC.prop.args.form

A direct way of specifying arguments to the proposal.

MCMC.prop.weights.diss, MCMC.prop.args.diss, MCMC.prop.diss

Ignored.

MCMC.maxedges

The maximum number of edges that may occur during the MCMC sampling. If this number is exceeded at any time, sampling is stopped immediately.

MCMC.maxchanges

Maximum number of changes in dynamic network simulation for which to allocate space.

MCMC.packagenames

Names of packages in which to look for change statistic functions in addition to those autodetected. This argument should not be needed outside of very strange setups.

CMLE.MCMC.burnin

Burnin used in CMLE fitting.

CMLE.MCMC.interval

Number of Metropolis-Hastings steps between successive draws when running MCMC MLE.

CMLE.ergm

A convenience argument for specifying both CMLE.form.ergm and CMLE.diss.ergm at once. See control.ergm().

CMLE.form.ergm

Control parameters used to fit the CMLE. See control.ergm().

CMLE.diss.ergm

Ignored, with the exception of initial parameter values.

CMLE.NA.impute

In STERGM CMLE, missing dyads in transitioned-to networks are accommodated using methods of Handcock and Gile (2009), but a similar approach to transitioned-from networks requires much more complex methods that are not, currently, implemented. CMLE.NA.impute controls how missing dyads in transitioned-from networks are be imputed. See argument imputers of impute.network.list() for details.

By default, no imputation is performed, and the fitting stops with an error if any transitioned-from networks have missing dyads.

CMLE.term.check.override

The method stergm() uses at this time to fit a series of more than two networks requires certain assumptions to be made about the ERGM terms being used, which are tested before a fit is attempted. This test sometimes fails despite the model being amenable to fitting, so setting this option to TRUE overrides the tests.

EGMME.main.method

Estimation method used to find the Equilibrium Generalized Method of Moments estimator. Currently only "Gradient-Descent" is implemented.

EGMME.initialfit.control

Control object for the ergm fit in tergm.EGMME.initialfit

EGMME.MCMC.burnin.min, EGMME.MCMC.burnin.max,

Number of Metropolis-Hastings steps per time step used in EGMME fitting. By default, this is determined adaptively by keeping track of increments in the Hamming distance between the transitioned-from network and the network being sampled. Once EGMME.MCMC.burnin.min steps have elapsed, the increments are tested against 0, and when their average number becomes statistically indistinguishable from 0 (with the p-value being greater than EGMME.MCMC.burnin.pval), or EGMME.MCMC.burnin.max steps are proposed, whichever comes first, the simulation is stopped after an additional EGMME.MCMC.burnin.add times the number of elapsed steps had been taken. (Stopping immediately would bias the sampling.)

To use a fixed number of steps, set EGMME.MCMC.burnin.min and EGMME.MCMC.burnin.max to the same value.

EGMME.MCMC.burnin.pval, EGMME.MCMC.burnin.add

To use a fixed number of steps, set EGMME.MCMC.burnin.min and EGMME.MCMC.burnin.max to the same value.

MCMC.burnin, MCMC.burnin.mul

No longer used. See EGMME.MCMC.burnin.min, EGMME.MCMC.burnin.max, EGMME.MCMC.burnin.pval, EGMME.MCMC.burnin.pval, EGMME.MCMC.burnin.add and CMLE.MCMC.burnin and CMLE.MCMC.interval.

SAN.maxit

When target.stats argument is passed to ergm(), the maximum number of attempts to use san() to obtain a network with statistics close to those specified.

SAN.nsteps.times

Multiplier for SAN.nsteps relative to MCMC.burnin. This lets one control the amount of SAN burn-in (arguably, the most important of SAN parameters) without overriding the other SAN defaults.

SAN

SAN control parameters. See control.san()

SA.restarts

Maximum number of times to restart a failed optimization process.

SA.burnin

Number of time steps to advance the starting network before beginning the optimization.

SA.plot.progress, SA.plot.stats

Logical: Plot information about the fit as it proceeds. If SA.plot.progress==TRUE, plot the trajectories of the parameters and target statistics as the optimization progresses. If SA.plot.stats==TRUE, plot a heatmap representing correlations of target statistics and a heatmap representing the estimated gradient.

Do NOT use these with non-interactive plotting devices like pdf(). (In fact, it will refuse to do that with a warning.)

SA.max.plot.points

If SA.plot.progress==TRUE, the maximum number of time points to be plotted. Defaults to 400. If more iterations elapse, they will be thinned to at most 400 before plotting.

SA.init.gain

Initial gain, the multiplier for the parameter update size. If the process initially goes crazy beyond recovery, lower this value.

SA.gain.decay

Gain decay factor.

SA.runlength

Number of parameter trials and updates per C run.

SA.interval.mul

The number of time steps between updates of the parameters is set to be this times the mean duration of extant ties.

SA.init.interval

Initial number of time steps between updates of the parameters.

SA.min.interval, SA.max.interval

Upper and lower bounds on the number of time steps between updates of the parameters.

SA.phase1.minruns

Number of runs during Phase 1 for estimating the gradient, before every gradient update.

SA.phase1.tries

Number of runs trying to find a reasonable parameter and network configuration.

SA.phase1.jitter

Initial jitter standard deviation of each parameter.

SA.phase1.max.q

Q-value (false discovery rate) that a gradient estimate must obtain before it is accepted (since sign is what is important).

SA.phase1.backoff.rat, SA.phase2.backoff.rat

If the run produces this relative increase in the approximate objective function, it will be backed off.

SA.phase2.levels.min, SA.phase2.levels.max

Range of gain levels (subphases) to go through.

SA.phase2.max.mc.se

Approximate precision of the estimates that must be attained before stopping.

SA.phase2.repeats, SA.stepdown.maxn,

A gain level may be repeated multiple times (up to SA.phase2.repeats) if the optimizer detects that the objective function is improving or the estimating equations are not centered around 0, so slowing down the parameters at that point is counterproductive. To detect this it looks at the the window controlled by SA.keep.oh, thinning objective function values to get SA.stepdown.maxn, and 1) fitting a GLS model for a linear trend, with AR(2) autocorrelation and 2) conductiong an approximate Hotelling's T^2 test for equality of estimating equation values to 0. If there is no significance for either at SA.stepdown.p SA.stepdown.ct runs in a row, the gain level (subphase) is allowed to end. Otherwise, the process continues at the same gain level.

SA.stepdown.p, SA.stepdown.ct

SA.stop.p

At the end of each gain level after the minimum, if the precision is sufficiently high, the relationship between the parameters and the targets is tested for evidence of local nonlinearity. This is the p-value used.

If that test fails to reject, a Phase 3 run is made with the new parameter values, and the estimating equations are tested for difference from 0. If this test fails to reject, the optimization is finished.

If either of these tests rejects, at SA.stop.p, optimization is continued for another gain level.

SA.keep.oh, SA.keep.min, SA.keep.min.runs

Parameters controlling how much of optimization history to keep for gradient and covariance estimation.

A history record will be kept if it's at least one of the following:

Among the last SA.keep.oh (a fraction) of all runs.
Among the last SA.keep.min (a count) records.
From the last SA.keep.min.runs (a count) optimization runs.

SA.phase2.jitter.mul

Jitter standard deviation of each parameter is this value times its standard deviation without jitter.

SA.phase2.maxreljump

To keep the optimization from "running away" due to, say, a poor gradient estimate building on itself, if a magnitude of change (Mahalanobis distance) in parameters over the course of a run divided by average magnitude of change for recent runs exceeds this, the change is truncated to this amount times the average for recent runs.

SA.guard.mul

The multiplier for the range of parameter and statistics values to compute the guard width.

SA.par.eff.pow

Because some parameters have much, much greater effects than others, it improves numerical conditioning and makes estimation more stable to rescale the $k$th estimating function by $s_k = (\sum_{i=1}^{q} G_{i,k}^2/V_{i,i})^{-p/2}$, where $G_{i,k}$ is the estimated gradient of the $i$th target statistics with respect to $k$th parameter. This parameter sets the value of $p$: 0 for no rescaling, 1 (default) for scaling by root-mean-square normalized gradient, and greater values for greater penalty.

SA.robust

Whether to use robust linear regression (for gradients) and covariance estimation.

SA.oh.memory

Absolute maximum number of data points per thread to store in the full optimization history.

SA.refine

Method, if any, used to refine the point estimate at the end: "linear" for linear interpolation, "mean" for average, and "none" to use the last value.

SA.se

Logical: If TRUE (the default), get an MCMC sample of statistics at the final estimate and compute the covariance matrix (and hence standard errors) of the parameters. This sample is stored and can also be used by mcmc.diagnostics() to assess convergence.

SA.phase3.samplesize.runs

This many optimization runs will be used to determine whether the optimization has converged and to estimate the standard errors.

SA.restart.on.err

Logical: if TRUE (the default) an error somewhere in the optimization process will cause it to restart with a smaller gain value. Otherwise, the process will stop. This is mainly used for debugging

term.options

A list of additional arguments to be passed to term initializers. See ? term.options.

seed

Seed value (integer) for the random number generator. See set.seed().

parallel

Number of threads in which to run the sampling. Defaults to 0 (no parallelism). See ergm-parallel for details and troubleshooting.

parallel.type

API to use for parallel processing. Defaults to using the parallel package with PSOCK clusters. See ergm-parallel.

parallel.version.check

Logical: If TRUE, check that the version of ergm running on the slave nodes is the same as that running on the master node.

parallel.inherit.MT

Logical: If TRUE, slave nodes and processes inherit the set.MT_terms() setting.

...

Additional arguments, passed to other functions This argument is helpful because it collects any control parameters that have been deprecated; a warning message is printed in case of deprecated arguments.

Details

This function is only used within a call to the stergm() function. See the Usage section in stergm() for details. Generally speaking, control.stergm is remapped to control.tergm, with dissolution controls ignored and formation controls used as controls for the overall tergm process. An exception to this rule is the initial parameter values specified via init.form, init.diss, CMLE.form.ergm$init, and CMLE.diss.ergm$init, which will be remapped jointly with the stergm() arguments offset.coef.form and offset.coef.diss to determine the initial parameter values passed to tergm.

It is recommended that new code make use of tergm and control.tergm directly; stergm wrappers are included only for backwards compatibility.

References

Boer, P., Huisman, M., Snijders, T.A.B., and Zeggelink, E.P.H. (2003), StOCNET User\'s Manual. Version 1.4.

Firth (1993), Bias Reduction in Maximum Likelihood Estimates. Biometrika, 80: 27-38.

Hunter, D. R. and M. S. Handcock (2006), Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15: 565-583.