This function is only used within a call to the ergm()
function.
See the Usage section in ergm()
for details. Also see the
Details section about some of the interactions between its
arguments.
control.ergm(
drop = TRUE,
init = NULL,
init.method = NULL,
main.method = c("MCMLE", "Stochastic-Approximation"),
force.main = FALSE,
main.hessian = TRUE,
checkpoint = NULL,
resume = NULL,
MPLE.samplesize = .Machine$integer.max,
init.MPLE.samplesize = function(d, e) max(sqrt(d), e, 40) * 8,
MPLE.type = c("glm", "penalized", "logitreg"),
MPLE.maxit = 10000,
MPLE.nonvar = c("warning", "message", "error"),
MPLE.nonident = c("warning", "message", "error"),
MPLE.nonident.tol = 1e-10,
MPLE.covariance.samplesize = 500,
MPLE.covariance.method = "invHess",
MPLE.covariance.sim.burnin = 1024,
MPLE.covariance.sim.interval = 1024,
MPLE.check = TRUE,
MPLE.constraints.ignore = FALSE,
MCMC.prop = trim_env(~sparse + .triadic),
MCMC.prop.weights = "default",
MCMC.prop.args = list(),
MCMC.interval = NULL,
MCMC.burnin = EVL(MCMC.interval * 16),
MCMC.samplesize = NULL,
MCMC.effectiveSize = NULL,
MCMC.effectiveSize.damp = 10,
MCMC.effectiveSize.maxruns = 16,
MCMC.effectiveSize.burnin.pval = 0.2,
MCMC.effectiveSize.burnin.min = 0.05,
MCMC.effectiveSize.burnin.max = 0.5,
MCMC.effectiveSize.burnin.nmin = 16,
MCMC.effectiveSize.burnin.nmax = 128,
MCMC.effectiveSize.burnin.PC = FALSE,
MCMC.effectiveSize.burnin.scl = 32,
MCMC.effectiveSize.order.max = NULL,
MCMC.return.stats = 2^12,
MCMC.runtime.traceplot = FALSE,
MCMC.maxedges = Inf,
MCMC.addto.se = TRUE,
MCMC.packagenames = c(),
SAN.maxit = 4,
SAN.nsteps.times = 8,
SAN = control.san(term.options = term.options, SAN.maxit = SAN.maxit, SAN.prop =
MCMC.prop, SAN.prop.weights = MCMC.prop.weights, SAN.prop.args = MCMC.prop.args,
SAN.nsteps = EVL(MCMC.burnin, 16384) * SAN.nsteps.times, SAN.samplesize =
EVL(MCMC.samplesize, 1024), SAN.packagenames = MCMC.packagenames, parallel =
parallel, parallel.type = parallel.type, parallel.version.check =
parallel.version.check),
MCMLE.termination = c("confidence", "Hummel", "Hotelling", "precision", "none"),
MCMLE.maxit = 60,
MCMLE.conv.min.pval = 0.5,
MCMLE.confidence = 0.99,
MCMLE.confidence.boost = 2,
MCMLE.confidence.boost.threshold = 1,
MCMLE.confidence.boost.lag = 4,
MCMLE.NR.maxit = 100,
MCMLE.NR.reltol = sqrt(.Machine$double.eps),
obs.MCMC.mul = 1/4,
obs.MCMC.samplesize.mul = sqrt(obs.MCMC.mul),
obs.MCMC.samplesize = EVL(round(MCMC.samplesize * obs.MCMC.samplesize.mul)),
obs.MCMC.effectiveSize = NVL3(MCMC.effectiveSize, . * obs.MCMC.mul),
obs.MCMC.interval.mul = sqrt(obs.MCMC.mul),
obs.MCMC.interval = EVL(round(MCMC.interval * obs.MCMC.interval.mul)),
obs.MCMC.burnin.mul = sqrt(obs.MCMC.mul),
obs.MCMC.burnin = EVL(round(MCMC.burnin * obs.MCMC.burnin.mul)),
obs.MCMC.prop = MCMC.prop,
obs.MCMC.prop.weights = MCMC.prop.weights,
obs.MCMC.prop.args = MCMC.prop.args,
obs.MCMC.impute.min_informative = function(nw) network.size(nw)/4,
obs.MCMC.impute.default_density = function(nw) 2/network.size(nw),
MCMLE.min.depfac = 2,
MCMLE.sampsize.boost.pow = 0.5,
MCMLE.MCMC.precision = if (startsWith("confidence", MCMLE.termination[1])) 0.1 else
0.005,
MCMLE.MCMC.max.ESS.frac = 0.1,
MCMLE.metric = c("lognormal", "logtaylor", "Median.Likelihood", "EF.Likelihood",
"naive"),
MCMLE.method = c("BFGS", "Nelder-Mead"),
MCMLE.dampening = FALSE,
MCMLE.dampening.min.ess = 20,
MCMLE.dampening.level = 0.1,
MCMLE.steplength.margin = 0.05,
MCMLE.steplength = NVL2(MCMLE.steplength.margin, 1, 0.5),
MCMLE.steplength.parallel = c("observational", "never"),
MCMLE.sequential = TRUE,
MCMLE.density.guard.min = 10000,
MCMLE.density.guard = exp(3),
MCMLE.effectiveSize = 64,
obs.MCMLE.effectiveSize = NULL,
MCMLE.interval = 1024,
MCMLE.burnin = MCMLE.interval * 16,
MCMLE.samplesize.per_theta = 32,
MCMLE.samplesize.min = 256,
MCMLE.samplesize = NULL,
obs.MCMLE.samplesize.per_theta = round(MCMLE.samplesize.per_theta *
obs.MCMC.samplesize.mul),
obs.MCMLE.samplesize.min = 256,
obs.MCMLE.samplesize = NULL,
obs.MCMLE.interval = round(MCMLE.interval * obs.MCMC.interval.mul),
obs.MCMLE.burnin = round(MCMLE.burnin * obs.MCMC.burnin.mul),
MCMLE.steplength.solver = c("glpk", "lpsolve"),
MCMLE.last.boost = 4,
MCMLE.steplength.esteq = TRUE,
MCMLE.steplength.miss.sample = function(x1) c(max(ncol(rbind(x1)) * 2, 30), 10),
MCMLE.steplength.min = 1e-04,
MCMLE.effectiveSize.interval_drop = 2,
MCMLE.save_intermediates = NULL,
MCMLE.nonvar = c("message", "warning", "error"),
MCMLE.nonident = c("warning", "message", "error"),
MCMLE.nonident.tol = 1e-10,
SA.phase1_n = function(q, ...) max(200, 7 + 3 * q),
SA.initial_gain = 0.1,
SA.nsubphases = 4,
SA.min_iterations = function(q, ...) (7 + q),
SA.max_iterations = function(q, ...) (207 + q),
SA.phase3_n = 1000,
SA.interval = 1024,
SA.burnin = SA.interval * 16,
SA.samplesize = 1024,
CD.samplesize.per_theta = 128,
obs.CD.samplesize.per_theta = 128,
CD.nsteps = 8,
CD.multiplicity = 1,
CD.nsteps.obs = 128,
CD.multiplicity.obs = 1,
CD.maxit = 60,
CD.conv.min.pval = 0.5,
CD.NR.maxit = 100,
CD.NR.reltol = sqrt(.Machine$double.eps),
CD.metric = c("naive", "lognormal", "logtaylor", "Median.Likelihood", "EF.Likelihood"),
CD.method = c("BFGS", "Nelder-Mead"),
CD.dampening = FALSE,
CD.dampening.min.ess = 20,
CD.dampening.level = 0.1,
CD.steplength.margin = 0.5,
CD.steplength = 1,
CD.adaptive.epsilon = 0.01,
CD.steplength.esteq = TRUE,
CD.steplength.miss.sample = function(x1) ceiling(sqrt(ncol(rbind(x1)))),
CD.steplength.min = 1e-04,
CD.steplength.parallel = c("observational", "always", "never"),
CD.steplength.solver = c("glpk", "lpsolve"),
loglik = control.logLik.ergm(),
term.options = NULL,
seed = NULL,
parallel = 0,
parallel.type = NULL,
parallel.version.check = TRUE,
parallel.inherit.MT = FALSE,
...
)
A list with arguments as components.
Logical: If TRUE, terms whose observed statistic values are at the extremes of their possible ranges are dropped from the fit and their corresponding parameter estimates are set to plus or minus infinity, as appropriate. This is done because maximum likelihood estimates cannot exist when the vector of observed statistic lies on the boundary of the convex hull of possible statistic values.
numeric or NA
vector equal in length to the number of
parameters in the model or NULL
(the default); the initial values for
the estimation and coefficient offset terms. If NULL
is passed, all
of the initial values are computed using the method specified by
control$init.method
. If a numeric vector is
given, the elements of the vector are interpreted as follows:
Elements corresponding to terms enclosed in offset()
are used as
the fixed offset coefficients. Note that offset coefficients alone can be
more conveniently specified using ergm()
argument
offset.coef
. If both offset.coef
and init
arguments are
given, values in offset.coef
will take precedence.
Elements that do not correspond to offset terms and are not NA
are used as starting values in the estimation.
Initial values for the elements that are NA
are fit using the
method specified by control$init.method
.
Passing control.ergm(init=coef(prev.fit))
can be used to
``resume'' an uncoverged ergm()
run, though checkpoint
and
`resume` would be better under most circumstances.
A chatacter vector or NULL
. The default
method depends on the reference measure used. For the binary
("Bernoulli"
) ERGMs, with dyad-independent constraints,
it's maximum pseudo-likelihood estimation (MPLE). Other valid
values include "zeros"
for a 0
vector of
appropriate length and "CD"
for contrastive divergence. If
passed explicitly, this setting overrides the reference's
limitations.
Valid initial methods for a given reference are set by the
InitErgmReference.*
function.
One of "MCMLE" (default) or
"Stochastic-Approximation". Chooses the estimation method used
to find the MLE. MCMLE
attempts to maximize an
approximation to the log-likelihood function.
Stochastic-Approximation
are both stochastic approximation
algorithms that try to solve the method of moments equation that
yields the MLE in the case of an exponential family model. The
direct use of the likelihood function has many theoretical
advantages over stochastic approximation, but the choice will
depend on the model and data being fit. See Handcock (2000) and
Hunter and Handcock (2006) for details.
Logical: If TRUE, then force MCMC-based estimation method, even if the exact MLE can be computed via maximum pseudolikelihood estimation.
Logical: If TRUE, then an approximate Hessian matrix is used in the MCMC-based estimation method.
At the start of every iteration, save the state
of the optimizer in a way that will allow it to be resumed. The
name is passed through sprintf()
with iteration number as the
second argument. (For example, checkpoint="step_%03d.RData"
will save to step_001.RData
, step_002.RData
, etc.)
If given a file name of an RData
file produced by
checkpoint
, the optimizer will attempt to resume after
restoring the state. Control parameters from the saved state will
be reused, except for those whose value passed via
control.ergm()
had change from the saved run. Note that if the
network, the model, or some critical settings differ between
runs, the results may be undefined.
These parameters control the maximum number of dyads (potential ties) that will be used by the MPLE to construct the predictor matrix for its logistic regression. In general, the algorithm visits dyads in a systematic sample that, if it does not hit one of these limits, will visit every informative dyad. If a limit is exceeded, case-control approximation to the likelihood, comprising all edges and those non-edges that have been visited by the algorithm before the limit was exceeded will be used.
MPLE.samplesize
limits the number of dyads visited, unless the
MPLE is being computed for the purpose of being the initial value
for MCMC-based estimation, in which case init.MPLE.samplesize
is used instead, All of these can be specified either as numbers or as
function(d,e)
taking the number of informative dyads and
informative edges. Specifying or returning a larger number than
the number of informative dyads is safe.
One of "glm"
, "penalized"
, or
"logitreg"
. Chooses method of calculating MPLE. "glm"
is the
usual formal logistic regression called via glm()
, whereas
"penalized"
uses the bias-reduced method of Firth (1993) as
originally implemented by Meinhard Ploner, Daniela Dunkler, Harry
Southworth, and Georg Heinze in the "logistf" package. "logitreg"
is
an "in-house" implementation that is slower and probably less stable but
supports nonlinear logistic regression. It is invoked automatically when the
model has curved terms.
Maximum number of iterations for "logitreg"
implementation of MPLE.
A rudimentary nonidentifiability/multicollinearity diagnostic. If
MPLE.nonident.tol > 0
, test the MPLE covariate matrix or the CD
statistics matrix has linearly dependent columns via QR decomposition with tolerance MPLE.nonident.tol
. This is
often (not always) indicative of a non-identifiable
(multicollinear) model. If nonidentifiable, depending on
MPLE.nonident
issue a warning, an error, or a message
specifying the potentially redundant statistics. Before the
diagnostic is performed, covariates that do not vary (i.e.,
all-zero columns) are dropped, with their handling controlled by
MPLE.nonvar
. The corresponding MCMLE.*
arguments provide a
similar diagnostic for the unconstrained MCMC sample's estimating
functions.
Controls for estimating the MPLE covariance
matrix. MPLE.covariance method
determines the method, with
invHess
(the default) returning the covariance estimate
obtained from the glm()
. Godambe
estimates the covariance
matrix using the Godambe-matrix ScHu23cergm. This
method is recommended for dyad-dependent models. Alternatively,
bootstrap
estimates standard deviations using a parametric
bootstrapping approach @see @ScDe17eergm. The
other parameters control, respectively, the number of networks to
simulate, the MCMC burn-in, and the MCMC interval for Godambe
and bootstrap
methods.
If TRUE
(the default), perform the MPLE
existence check described by ScHu23c;textualergm.
If TRUE
, MPLE will ignore all
dyad-independent constraints except for those due to attributes
missingness. This can be used to avert evaluating and storing the
rlebdm
s for very large networks except where absolutely
necessary. Note that this can be very dangerous unless you know
what you are doing.
Specifies the proposal (directly) and/or
a series of "hints" about the structure of the model being
sampled. The specification is in the form of a one-sided formula
with hints separated by +
operations. If the LHS exists and is
a string, the proposal to be used is selected directly.
A common and default "hint" is ~sparse
, indicating
that the network is sparse and that the sample should put roughly
equal weight on selecting a dyad with or without a tie as a
candidate for toggling.
Specifies the proposal
distribution used in the MCMC Metropolis-Hastings algorithm. Possible
choices depending on selected reference
and constraints
arguments of the ergm()
function, but often include "TNT"
and "random"
, and the "default"
is to use the one with the
highest priority available.
An alternative, direct way of specifying additional arguments to proposal.
Number of proposals between sampled statistics. Increasing interval will reduces the autocorrelation in the sample, and may increase the precision in estimates by reducing MCMC error, at the expense of time. Set the interval higher for larger networks.
Number of proposals before any MCMC sampling is done. It typically is set to a fairly large number.
Number of network statistics, randomly drawn from a given distribution on the set of all networks, returned by the Metropolis-Hastings algorithm. Increasing sample size may increase the precision in the estimates by reducing MCMC error, at the expense of time. Set it higher for larger networks, or when using parallel functionality.
Set MCMC.effectiveSize
to a non-NULL value to adaptively
determine the burn-in and the MCMC length needed to get the
specified effective size; 50 is a reasonable value. In the
adaptive MCMC mode, MCMC is run forward repeatedly
(MCMC.samplesize*MCMC.interval
steps, up to
MCMC.effectiveSize.maxruns
times) until the target effective
sample size is reached or exceeded.
After each run, the returned statistics are mapped to the
estimating function scale, then an exponential decay model is fit
to the scaled statistics to find that burn-in which would reduce
the difference between the initial values of statistics and their
equilibrium values by a factor of MCMC.effectiveSize.burnin.scl
of what it initially was, bounded by MCMC.effectiveSize.min
and
MCMC.effectiveSize.max
as proportions of sample size. If the
best-fitting decay exceeds MCMC.effectiveSize.max
, the
exponential model is considered to be unsuitable and
MCMC.effectiveSize.min
is used.
A Geweke diagnostic is then run, after thinning the sample to
MCMC.effectiveSize.burnin.nmax
. If this Geweke diagnostic
produces a \(p\)-value higher than
MCMC.effectiveSize.burnin.pval
, it is accepted.
If MCMC.effectiveSize.burnin.PC>0
, instead of using the full
sample for burn-in estimation, at most this many principal
components are used instead.
The effective size of the post-burn-in sample is computed via
VaFl15m;textualergm, and compared to the target
effective size. If it is not matched, the MCMC run is resumed,
with the additional draws needed linearly extrapolated but
weighted in favor of the baseline MCMC.samplesize
by the
weighting factor MCMC.effectiveSize.damp
(higher = less
damping). Lastly, if after an MCMC run, the number of samples
equals or exceeds 2*MCMC.samplesize
, the chain will be thinned
by 2 until it falls below that, while doubling
MCMC.interval
. MCMC.effectiveSize.order.max
can be used to
set the order of the AR model used to estimate the effective
sample size and the variance for the Geweke diagnostic.
Lastly, if MCMC.effectiveSize
is a matrix, say, \(W\), it
will be treated as a target precision (inverse-variance) matrix.
If \(V\) is the sample covariance matrix, the target effective
size \(n_{\text{eff}}\) will be set such that
\(V/n_{\text{eff}}\) is close to \(W\) in magnitude,
specifically that \(\operatorname{tr}((V/n_{\text{eff}})W)/p\approx 1\).
Numeric: If positive, include an
mcmc.list
(two, if observational process was involved) of
MCMC network statistics from the last iteration of network of the
estimation. They will be thinned to have length of at most
MCMC.return.stats
. They are used for MCMC diagnostics.
Logical: If TRUE
, plot traceplots of the MCMC
sample after every MCMC MLE iteration.
The maximum number of edges that may occur during the MCMC sampling. If this number is exceeded at any time, sampling is stopped immediately.
Whether to add the standard errors induced by the MCMC algorithm to the estimates' standard errors.
Names of packages in which to look for change statistic functions in addition to those autodetected. This argument should not be needed outside of very strange setups.
When target.stats
argument is passed to
ergm()
, the maximum number of attempts to use san()
to obtain a network with statistics close to those specified.
Multiplier for SAN.nsteps
relative to
MCMC.burnin
. This lets one control the amount of SAN burn-in
(arguably, the most important of SAN parameters) without overriding the
other SAN
defaults.
Control arguments to san()
. See
control.san()
for details.
The criterion used for terminating MCMLE estimation:
"Hummel"
Terminate when the Hummel step length is
1 for two consecutive iterations. For the last iteration, the sample size is
boosted by a factor of MCMLE.last.boost
. See Hummel et. al. (2012).
Note that this criterion is incompatible with MCMLE.steplength
\(\ne\) 1 or MCMLE.steplength.margin
\(=\) NULL
.
"Hotelling"
After every MCMC sample, an autocorrelation-adjusted
Hotelling's T^2 test for equality of MCMC-simulated network statistics to
observed is conducted, and if its P-value exceeds
MCMLE.conv.min.pval
, the estimation is considered to have converged
and finishes. This was the default option in ergm
version 3.1.
"precision"
Terminate when the estimated loss in estimating precision
due to using MCMC standard errors is below the precision bound specified by
MCMLE.MCMC.precision
, and the Hummel step length is 1 for two
consecutive iterations. See MCMLE.MCMC.precision
for details. This
feature is in experimental status until we verify the coverage of the
standard errors.
Note that this criterion is incompatible with \(\code{MCMLE.steplength}\ne 1\) or \(\code{MCMLE.steplength.margin}=\code{NULL}\).
"confidence"
: Performs an equivalence test to prove with level
of confidence MCMLE.confidence
that the true value of the
deviation of the simulated mean value parameter from the observed
is within an ellipsoid defined by the inverse-variance-covariance
of the sufficient statistics multiplied by a scaling factor
control$MCMLE.MCMC.precision
(which has a different default).
"none"
Stop after
MCMLE.maxit
iterations.
Maximum number of times the parameter for the MCMC should be updated by maximizing the MCMC likelihood. At each step the parameter is changed to the values that maximizes the MCMC likelihood based on the current sample.
The P-value used in the Hotelling test for early termination.
The confidence level for declaring
convergence for "confidence"
methods.
The maximum increase factor in sample
size (or target effective size, if enabled) when the
"confidence"
termination criterion is either not approaching
the tolerance region or is unable to prove convergence.
Sample size or target effective size will be increaed if the distance from the tolerance region fails to decrease more than MCMLE.confidence.boost.threshold in this many successive iterations.
The method, maximum number of
iterations and relative tolerance to use within the optim
rountine in
the MLE optimization. Note that by default, ergm uses trust
, and
falls back to optim
only when trust
fails.
Corresponding MCMC parameters and settings used for the constrained sample when
unobserved data are present in the estimation routine. By default, they are controlled by the *.mul
parameters, as fractions of the corresponding settings for the
unconstrained (standard) MCMC.
These can, in turn, be controlled by obs.MCMC.mul
, which can be
used to set the overal multiplier for the number of MCMC steps in
the constrained sample; one half of its effect applies to the
burn-in and interval and the other half to the total sample
size. For example, for obs.MCMC.mul=1/4
(the default),
obs.MCMC.samplesize
is set to \(\sqrt{1/4}=1/2\) that of
obs.MCMC.samplesize
, and obs.MCMC.burnin
and
obs.MCMC.interval
are set to \(\sqrt{1/4}=1/2\) of their
respective unconstrained sampling parameters. When
MCMC.effectiveSize
or MCMLE.effectiveSize
are given, their
corresponding obs
parameters are set to them multiplied by
obs.MCMC.mul
.
Lastly, if MCMLE.effectiveSize
is not NULL but
obs.MCMLE.effectiveSize
is, the constrained sample's target
effective size is set adaptively to achieve a similar precision
for the estimating functions as that achieved for the
unconstrained.
Controls for imputation of missing dyads for initializing MCMC
sampling. If numeric, obs.MCMC.impute.min_informative
specifies
the minimum number dyads that need to be non-missing before
sample network density is used as the imputation density. It can
also be specified as a function that returns this
value. obs.MCMC.impute.default_density
similarly controls the
imputation density when number of non-missing dyads is too low.
When using adaptive MCMC effective size, and methods that increase the MCMC sample size, use MCMLE.sampsize.boost.pow
as the power of the boost amount (relative to the boost of the target effective size), but ensure that sample size is no less than MCMLE.min.depfac
times the target effective size.
MCMLE.MCMC.precision
is a vector of upper bounds on the standard
errors induced by the MCMC algorithm, expressed as a percentage of the total
standard error. The MCMLE algorithm will terminate when the MCMC standard
errors are below the precision bound, and the Hummel step length is 1 for
two consecutive iterations. This is an experimental feature.
If effective sample size is used (see MCMC.effectiveSize
), then ergm
may increase the target ESS to reduce the MCMC standard error.
Method to calculate the loglikelihood approximation. See Hummel et al (2010) for an explanation of "lognormal" and "naive".
Deprecated. By default, ergm uses trust
, and
falls back to optim
with Nelder-Mead method when trust
fails.
(logical) Should likelihood dampening be used?
The effective sample size below which dampening is used.
The proportional distance from boundary of the convex hull move.
The extra margin required for a Hummel step
to count as being inside the convex hull of the sample. Set this to 0 if
the step length gets stuck at the same value over several iteraions. Set it
to NULL
to use fixed step length. Note that this parameter is
required to be non-NULL
for MCMLE termination using Hummel or
precision criteria.
Multiplier for step length (on the mean-value parameter scale), which may (for values less than one) make fitting more stable at the cost of computational efficiency.
If MCMLE.steplength.margin
is not NULL
, the step
length will be set using the algorithm of Hummel et
al. (2010). In that case, it will serve as the maximum step
length considered. However, setting it to anything other than 1
will preclude using Hummel or precision as termination criteria.
Whether parallel multisection
search (as opposed to a bisection search) for the Hummel step
length should be used if running in multiple threads. Possible
values (partially matched) are "never"
, and
(default) "observational"
(i.e., when missing data MLE is
used).
Logical: If TRUE, the next iteration of the fit uses
the last network sampled as the starting network. If FALSE, always use the
initially passed network. The results should be similar (stochastically),
but the TRUE option may help if the target.stats
in the
ergm()
function are far from the initial network.
A simple heuristic to
stop optimization if it finds itself in an overly dense region, which
usually indicates ERGM degeneracy: if the sampler encounters a network
configuration that has more than MCMLE.density.guard.min
edges and
whose number of edges is exceeds the observed network by more than
MCMLE.density.guard
, the optimization process will be stopped with an
error.
Sets the corresponding MCMC.*
parameters when
main.method="MCMLE"
(the default). Used because defaults may be
different for different methods. MCMLE.samplesize.per_theta
controls the MCMC sample size (not target effective size) as a
function of the number of (curved) parameters in the model, and
MCMLE.samplesize.min
sets the minimum sample size regardless of
their number.
The linear program solver to use for
MCMLE step length calculation. Can be either "glpk"
to use
Rglpk or "lpsolve"
to use lpSolveAPI.
Rglpk can be orders of magnitude faster, particularly
for models with many parameters and with large sample sizes, so
it is used where available; but it requires an external library
to install under some operating systems, so fallback to
lpSolveAPI provided.
For the Hummel termination criterion, increase the MCMC sample size of the last iteration by this factor.
For curved ERGMs, should the estimating function values be used to compute the Hummel step length? This allows the Hummel stepping algorithm converge when some sufficient statistics are at 0.
In fitting the missing data MLE, the rules for step length become more complicated. In short, it is necessary for all points in the constrained sample to be in the convex hull of the unconstrained (though they may be on the border); and it is necessary for their centroid to be in its interior. This requires checking a large number of points against whether they are in the convex hull, so to speed up the procedure, a sample is taken of the points most likely to be outside it. This parameter specifies the sample size or a function of the unconstrained sample matrix to determine the sample size. If the parameter or the return value of the function has a length of 2, the first element is used as the sample size, and the second element is used in an early-termination heuristic, only continuing the tests until this many test points in a row did not yield a change in the step length.
Stops MCMLE estimation when the step length gets stuck below this minimum value.
Every iteration, after MCMC
sampling, save the MCMC sample and some miscellaneous information
to a file with this name. This is mainly useful for diagnostics
and debugging. The name is passed through sprintf()
with
iteration number as the second argument. (For example,
MCMLE.save_intermediates="step_%03d.RData"
will save to
step_001.RData
, step_002.RData
, etc.)
A constant or a function of number of free
parameters q
, number of free canonical statistic p
, and
network size n
, giving the number of MCMC samples to draw in
Phase 1 of the stochastic approximation algorithm. Defaults to
\(\max(200, 7+3p)\). See Snijders (2002) for details.
Initial gain to Phase 2 of the stochastic approximation algorithm. Defaults to 0.1. See Snijders (2002) for details.
Number of sub-phases in Phase 2 of the
stochastic approximation algorithm. Defaults to
MCMLE.maxit
. See Snijders (2002) for details.
A constant or a function
of number of free parameters q
, number of free canonical
statistic p
, and network size n
, giving the baseline numbers
of iterations within each subphase of Phase 2 of the stochastic
approximation algorithm. Default to \(7+p\) and \(207+p\),
respectively. See Snijders (2002) for details.
Sample size for the MCMC sample in Phase 3 of the stochastic approximation algorithm. See Snijders (2002) for details.
Sets the corresponding
MCMC.*
parameters when main.method="Stochastic-Approximation"
.
Miscellaneous tuning parameters of the CD sampler and
optimizer. These have the same meaning as their MCMLE.*
and
MCMC.*
counterparts.
Note that only the Hotelling's stopping criterion is implemented for CD.
Main settings for contrastive
divergence to obtain initial values for the estimation:
respectively, the number of Metropolis--Hastings steps to take
before reverting to the starting value and the number of
tentative proposals per step. Computational experiments indicate
that increasing CD.multiplicity
improves the estimate
faster than increasing CD.nsteps
--- up to a point --- but
it also samples from the wrong distribution, in the sense that
while as CD.nsteps
\(\rightarrow\infty\), the CD estimate
approaches the MLE, this is not the case for
CD.multiplicity
.
In practice, MPLE, when available, usually outperforms CD for
even a very high CD.nsteps
(which is, in turn, not very
stable), so CD is useful primarily when MPLE is not
available. This feature is to be considered experimental and in
flux.
The default values have been set experimentally, providing a reasonably stable, if not great, starting values.
When there are missing dyads,
CD.nsteps
and CD.multiplicity
must be set to a relatively high
value, as the network passed is not necessarily a good start for CD.
Therefore, these settings are in effect if there are missing dyads in the
observed network, using a higher default number of steps.
See control.ergm.bridge()
A list of additional arguments to be passed to term initializers. See ? term.options
.
Seed value (integer) for the random number generator. See
set.seed()
.
Number of threads in which to run the sampling. Defaults to
0 (no parallelism). See ergm-parallel
for details and troubleshooting.
API to use for parallel processing. Defaults
to using the parallel package with PSOCK clusters. See
ergm-parallel
.
Logical: If TRUE, check that the version of ergm running on the slave nodes is the same as that running on the master node.
Logical: If TRUE, slave nodes and
processes inherit the set.MT_terms()
setting.
A dummy argument to catch deprecated or mistyped control parameters.
Different estimation methods or components of estimation have
different efficient tuning parameters; and we generally want to use
the estimation controls to inform the simulation controls in
control.simulate.ergm()
. To accomplish this, control.ergm()
uses
method-specific controls, with the method identified by the prefix:
CD
Contrastive Divergence estimation Kr17uergm
MPLE
Maximum Pseudo-Likelihood Estimation StIk90pergm
MCMLE
Monte-Carlo MLE HuHa06i,HuHu12iergm
SA
Stochastic Approximation via Robbins--Monro RoMo51s,Sn02mergm
SAN
Simulated Annealing used when target.stats
are specified for ergm()
obs
Missing data MLE HaGi10mergm
init
Affecting how initial parameter guesses are obtained
parallel
Affecting parallel processing
MCMC
Low-level MCMC simulation controls
Corresponding MCMC
controls will usually be overwritten by the
method-specific ones. After the estimation finishes, they will
contain the last MCMC parameters used.
Firth (1993), Bias Reduction in Maximum Likelihood Estimates. Biometrika, 80: 27-38.
Kristoffer Sahlin. Estimating convergence of Markov chain Monte Carlo simulations. Master's Thesis. Stockholm University, 2011. https://www2.math.su.se/matstat/reports/master/2011/rep2/report.pdf
ergm()
. The control.simulate()
function
performs a similar function for simulate.ergm()
;
control.gof()
performs a similar function for gof()
.