The following description is an exegesis of section 4 of Krivitsky
et al. (2022).
Let \(\mathbf{g}\) be a vector of target statistics for the
network we wish to construct. That is, we are given an arbitrary network
\(\mathbf{y}^0 \in \mathcal{Y}\), and we seek a network
\(\mathbf{y} \in \mathcal{Y}\) such that
\(\mathbf{g}(\mathbf{y}) \approx \mathbf{g}\) -- ideally equality is achieved,
but in practice we may have to settle for a close approximation. The
variant of simulated annealing used for this search is as follows.
The energy function is defined
$$E_W (\mathbf{y}) = (\mathbf{g}(\mathbf{y}) - \mathbf{g})^\mathsf{T} W (\mathbf{g}(\mathbf{y}) - \mathbf{g}),$$
with \(W\) a symmetric matrix of weights that is positive definite (barring
multicollinearity among the statistics). This function attains 0 only if the
target is reached. A good choice of this matrix yields a more efficient search.
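As a concrete illustration, here is a minimal R sketch of the energy computation; the target vector, current statistics, and weight matrix below are invented for the example rather than taken from any fitted model.

```r
# Quadratic-form energy: (g(y) - g)' W (g(y) - g).
energy <- function(g_y, g_target, W) {
  d <- g_y - g_target
  drop(t(d) %*% W %*% d)
}

g_target <- c(edges = 20, triangles = 5)  # hypothetical targets g
g_y      <- c(edges = 23, triangles = 4)  # hypothetical current statistics g(y)
W        <- diag(2) / 2                   # initial weight matrix I_p / p with p = 2

energy(g_y, g_target, W)  # strictly positive here; 0 only when g(y) matches g
```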
A standard simulated annealing loop is used, as described below, with some
modifications. In particular, we allow the user to specify a vector of
offsets \(\eta\) to bias the annealing, with \(\eta_k = 0\) denoting no
offset. Offsets can be used with SAN to forbid certain statistics from ever
increasing or decreasing. As with ergm(), offset terms are specified using
the offset() decorator and their coefficients are specified with the
offset.coef argument. By default, finite offsets are ignored by SAN, but
this can be overridden by setting the control.san() argument
SAN.ignore.finite.offsets = FALSE.
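As an illustration, the following sketch forbids the triangle count from ever increasing by giving it an infinite negative offset. The florentine data set ships with ergm, but the target value and control settings here are arbitrary choices for the example, not recommendations.

```r
library(ergm)
data(florentine)  # provides the flomarriage network

# Seek a network with roughly 30 edges while forbidding any proposal that
# increases the triangle count: with an offset coefficient of -Inf, such
# proposals have acceptance probability zero.
net <- san(flomarriage ~ edges + offset(triangle),
           target.stats = 30,
           offset.coef  = -Inf,
           control      = control.san(SAN.maxit = 4))
summary(net ~ edges + triangle)
```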
The number of simulated annealing runs is specified by the SAN.maxit control
parameter, and the initial value of the temperature \(T\) is set to SAN.tau.
The value of \(T\) decreases linearly until \(T = 0\) at the last run, which
implies that all proposals that increase \(E_W(\mathbf{y})\) are rejected.
The weight matrix \(W\) is initially set to \(I_p / p\), where \(I_p\) is the
identity matrix of appropriate dimension. For a given weight matrix \(W\) and
temperature \(T\), the simulated annealing iteration proceeds as follows:
1. Test if \(E_W(\mathbf{y}) = 0\). If so, then exit.
2. Generate a perturbed network \(\mathbf{y}^*\) from a proposal that
respects the model constraints. (This is typically the same proposal as
that used for MCMC.)
3. Store the quantity
\(\mathbf{g}(\mathbf{y}^*) - \mathbf{g}(\mathbf{y})\)
for later use.
4. Calculate the acceptance probability
$$\alpha = \exp[ - (E_W (\mathbf{y}^*) - E_W (\mathbf{y})) / T + \eta^\mathsf{T} (\mathbf{g}(\mathbf{y}^*) - \mathbf{g}(\mathbf{y}))].$$
(If \(|\eta_k| = \infty\) and \(g_k (\mathbf{y}^*) - g_k (\mathbf{y}) = 0\), their product is defined to be 0.)
5. Replace \(\mathbf{y}\) with \(\mathbf{y}^*\) with probability
\(\min(1, \alpha)\). (A numeric sketch of Steps 4 and 5 follows this list.)
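The following self-contained R sketch mimics Steps 4 and 5 for a single proposal. All statistic values, targets, weights, and the temperature are made up for the example; it is meant only to make the acceptance formula concrete.

```r
set.seed(1)
quad <- function(d, W) drop(t(d) %*% W %*% d)  # (g - target)' W (g - target)

g_target <- c(20, 5)      # targets g
g_y      <- c(23, 4)      # statistics of the current network, g(y)
g_ystar  <- c(22, 4)      # statistics of the proposed network, g(y*)
W        <- diag(2) / 2   # current weight matrix
Temp     <- 1             # current temperature T
eta      <- c(0, 0)       # offsets (none in this example)

delta <- g_ystar - g_y    # stored for the weight update (Step 3)
# A full implementation would treat Inf * 0 in the offset term as 0,
# per the convention stated above.
alpha <- exp(-(quad(g_ystar - g_target, W) - quad(g_y - g_target, W)) / Temp +
               sum(eta * delta))
accept <- runif(1) < min(1, alpha)  # Step 5: accept with probability min(1, alpha)
accept
```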
After the specified number of iterations, \(T\) is updated as described
above, and \(W\) is recalculated by first computing \(S\), the sample
covariance matrix of the proposed differences stored in Step 3 (whether or
not the corresponding proposals were accepted), and then setting
\(W = S^+ / \operatorname{tr}(S^+)\), where \(S^+\) is the
Moore–Penrose pseudoinverse of \(S\) and \(\operatorname{tr}(S^+)\) is its
trace. Because these differences closely reflect the relative variances of
and correlations among the network statistics, this choice of \(W\)
down-weights the statistics (and combinations of statistics) that the
proposals change most readily.
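A minimal sketch of this weight update, assuming the per-proposal differences from Step 3 have been collected as rows of a matrix and using MASS::ginv() for the pseudoinverse; the differences here are simulated placeholders.

```r
library(MASS)  # for ginv(), the Moore-Penrose pseudoinverse

set.seed(2)
# Stand-in for the stored differences g(y*) - g(y), one proposal per row.
diffs <- matrix(rnorm(200), ncol = 2)

S  <- cov(diffs)          # sample covariance matrix of the proposed differences
Sp <- ginv(S)             # S^+, the Moore-Penrose pseudoinverse
W  <- Sp / sum(diag(Sp))  # W = S^+ / tr(S^+)
```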
In Step 2, the wide range of available MCMC proposals provides an effective
means of speeding up the SAN algorithm's search for a viable network.
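For example, one way to influence the proposal used in Step 2 is through the constraints argument to san(). The sketch below (again using the florentine data, with an arbitrary triangle target) fixes the degree sequence so that only degree-preserving proposals are considered; the particular constraint and target are illustrative only.

```r
library(ergm)
data(florentine)

# Search for a network with a different triangle count while holding every
# actor's degree fixed; the degree-preserving proposal shrinks the search space.
net2 <- san(flomarriage ~ triangle,
            target.stats = 4,
            constraints  = ~ degrees)
summary(net2 ~ triangle)
```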