simulate.ergm: Draw from the distribution of an Exponential Family Random Graph Model

Description

simulate is used to draw from exponential family random network models. See ergm() for more information on these models.

The method for ergm objects inherits the model, the coefficients, the response attribute, the reference, the constraints, and most simulation parameters from the model fit, unless overridden by passing them explicitly. Unless overridden, the simulation is initialized with either a random draw from near the fitted model saved by ergm() or, if unavailable, the network to which the ERGM was fit.

Usage

# S3 method for formula_lhs_network
simulate(object, nsim = 1, seed = NULL, ...)
simulate_formula(object, ..., basis = eval_lhs.formula(object))
# S3 method for network
simulate_formula(
  object,
  nsim = 1,
  seed = NULL,
  coef,
  response = NULL,
  reference = ~Bernoulli,
  constraints = ~.,
  observational = FALSE,
  monitor = NULL,
  statsonly = FALSE,
  esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE,
  sequential = TRUE,
  control = control.simulate.formula(),
  verbose = FALSE,
  ...,
  basis = ergm.getnetwork(object),
  do.sim = NULL,
  return.args = NULL
)
# S3 method for ergm_state
simulate_formula(
  object,
  nsim = 1,
  seed = NULL,
  coef,
  response = NULL,
  reference = ~Bernoulli,
  constraints = ~.,
  observational = FALSE,
  monitor = NULL,
  statsonly = FALSE,
  esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE,
  sequential = TRUE,
  control = control.simulate.formula(),
  verbose = FALSE,
  ...,
  basis = ergm.getnetwork(object),
  do.sim = NULL,
  return.args = NULL
)
# S3 method for ergm_model
simulate(
  object,
  nsim = 1,
  seed = NULL,
  coef,
  reference = if (is(constraints, "ergm_proposal")) NULL else trim_env(~Bernoulli),
  constraints = trim_env(~.),
  observational = FALSE,
  monitor = NULL,
  basis = NULL,
  esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE,
  sequential = TRUE,
  control = control.simulate.formula(),
  verbose = FALSE,
  ...,
  do.sim = NULL,
  return.args = NULL
)
# S3 method for ergm_state_full
simulate(
  object,
  nsim = 1,
  seed = NULL,
  coef,
  esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE,
  sequential = TRUE,
  control = control.simulate.formula(),
  verbose = FALSE,
  ...,
  return.args = NULL
)
# S3 method for ergm
simulate(
  object,
  nsim = 1,
  seed = NULL,
  coef = coefficients(object),
  response = object$network %ergmlhs% "response",
  reference = object$reference,
  constraints = list(object$constraints, object$obs.constraints),
  observational = FALSE,
  monitor = NULL,
  basis = if (observational) object$network else NVL(object$newnetwork, object$network),
  statsonly = FALSE,
  esteq = FALSE,
  output = c("network", "stats", "edgelist", "ergm_state"),
  simplify = TRUE,
  sequential = TRUE,
  control = control.simulate.ergm(),
  verbose = FALSE,
  ...,
  return.args = NULL
)

Value

If output=="stats" an mcmc object containing the simulated network statistics. If control$parallel>0, an mcmc.list object. If simplify=TRUE (the default), these would then be "stacked" and converted to a standard matrix. A logical vector indicating whether or not the term had come from the monitor= formula is stored in attr()-style attribute "monitored".

Otherwise, a representation of the simulated network is returned, in the form specified by output. In addition to a network representation or a list thereof, they have the following attr()-style attributes:

formula: The formula used to generate the sample.
stats: An mcmc or mcmc.list object as above.
control: Control parameters used to generate the sample.
constraints: Constraints used to generate the sample.
reference: The reference measure for the sample.
monitor: The monitoring formula.
response: The edge attribute used as a response.

The following are the permitted network formats:

"network": If nsim==1, an object of class network. If nsim>1, it returns an object of class network.list (a list of networks) with the above-listed additional attributes.
"edgelist": An edgelist representation of the network, or a list thereof, depending on nsim.
"ergm_state": A semi-internal representation of a network consisting of a network object emptied of edges, with an attached edgelist matrix, or a list thereof, depending on nsim.

If simplify==FALSE, the networks are returned as a nested list, with outer list being the parallel chain (including 1 for no parallelism) and inner list being the samples within that chains (including 1, if one network per chain). If TRUE, they are concatenated, and if a total of one network had been simulated, the network itself will be returned.

Arguments

object

Either a formula or an ergm object. The formula should be of the form y ~ <model terms>, where y is a network object or a matrix that can be coerced to a network object. For the details on the possible <model terms>, see ergmTerm. To create a network object in , use the network() function, then add nodal attributes to it using the %v% operator if necessary.

nsim

Number of networks to be randomly drawn from the given distribution on the set of all networks, returned by the Metropolis-Hastings algorithm.

seed

Seed value (integer) for the random number generator. See set.seed().

...

Further arguments passed to or used by methods.

basis

a value (usually a network) to override the LHS of the formula.

coef

Vector of parameter values for the model from which the sample is to be drawn. If object is of class ergm, the default value is the vector of estimated coefficients. Can be set to NULL to bypass, but only if return.args below is used.

response

Either a character string, a formula, or NULL (the default), to specify the response attributes and whether the ERGM is binary or valued. Interpreted as follows:

NULL

Model simple presence or absence, via a binary ERGM.

character string

The name of the edge attribute whose value is to be modeled. Type of ERGM will be determined by whether the attribute is logical (TRUE/FALSE) for binary or numeric for valued.

a formula

must be of the form NAME~EXPR|TYPE (with | being literal). EXPR is evaluated in the formula's environment with the network's edge attributes accessible as variables. The optional NAME specifies the name of the edge attribute into which the results should be stored, with the default being a concise version of EXPR. Normally, the type of ERGM is determined by whether the result of evaluating EXPR is logical or numeric, but the optional TYPE can be used to override by specifying a scalar of the type involved (e.g., TRUE for binary and 1 for valued).

reference

A one-sided formula specifying the reference measure ($h(y)$) to be used. See help for ERGM reference measures implemented in the ergm package.

constraints

A formula specifying one or more constraints on the support of the distribution of the networks being modeled. Multiple constraints may be given, separated by “+” and “-” operators. See ergmConstraint for the detailed explanation of their semantics and also for an indexed list of the constraints visible to the ergm package.

The default is to have no constraints except those provided through the ergmlhs API.

Together with the model terms in the formula and the reference measure, the constraints define the distribution of networks being modeled.

It is also possible to specify a proposal function directly either by passing a string with the function's name (in which case, arguments to the proposal should be specified through the MCMC.prop.args argument to the relevant control function, or by giving it on the LHS of the hints formula to MCMC.prop argument to the control function. This will override the one chosen automatically.

Note that not all possible combinations of constraints and reference measures are supported. However, for relatively simple constraints (i.e., those that simply permit or forbid specific dyads or sets of dyads from changing), arbitrary combinations should be possible.

observational

Inherit observational constraints rather than model constraints.

monitor

A one-sided formula specifying one or more terms whose value is to be monitored. These terms are appended to the model, along with a coefficient of 0, so their statistics are returned. An ergm_model objectcan be passed as well.

statsonly

Logical: If TRUE, return only the network statistics, not the network(s) themselves. Deprecated in favor of output=.

esteq

Logical: If TRUE, compute the sample estimating equations of an ERGM: if the model is non-curved, all non-offset statistics are returned either way, but if the model is curved, the score estimating function values (3.1) by Hunter and Handcock (2006) are returned instead.

output

Normally character, one of "network" (default), "stats", "edgelist", or "ergm_state": determines the output format. Partial matching is performed.

Alternatively, a function with prototype function(ergm_state, chain, iter, ...) that is called for each returned network, and its return value, rather than the network itself, is stored. This can be used to, for example, store the simulated networks to disk without storing them in memory or compute network statistics not implemented using the ERGM API, without having to store the networks themselves.

simplify

Logical: If TRUE the output is "simplified": sampled networks are returned in a single list, statistics from multiple parallel chains are stacked, etc.. This makes it consistent with behavior prior to ergm 3.10.

sequential

Logical: If FALSE, each of the nsim simulated Markov chains begins at the initial network. If TRUE, the end of one simulation is used as the start of the next. Irrelevant when nsim=1.

control

A list of control parameters for algorithm tuning, typically constructed with control.simulate.ergm() or control.simulate.formula(), which have different defaults. Their documentation gives the the list of recognized control parameters and their meaning. The more generic utility snctrl() (StatNet ConTRoL) also provides argument completion for the available control functions and limited argument name checking.

verbose

A logical or an integer to control the amount of progress and diagnostic information to be printed. FALSE/0 produces minimal output, with higher values producing more detail. Note that very high values (5+) may significantly slow down processing.

do.sim

Logical; a deprecated interface superseded by return.args, that saves the inputs to the next level of the function.

return.args

Character; if not NULL, the simulate method for that particular class will, instead of proceeding for simulation, instead return its arguments as a list that can be passed as a second argument to do.call() or a lower-level function such as ergm_MCMC_sample(). This can be useful if, for example, one wants to run several simulations with varying coefficients and does not want to reinitialize the model and the proposal every time. Valid inputs at this time are "formula", "ergm_model", and one of the "ergm_state" classes, for the three respective stopping points.

Functions

simulate(ergm_state_full): a low-level function to simulate from an ergm_state object.

Details

A sample of networks is randomly drawn from the specified model. The model is specified by the first argument of the function. If the first argument is a formula then this defines the model. If the first argument is the output of a call to ergm() then the model used for that call is the one fit -- and unless coef is specified, the sample is from the MLE of the parameters. If neither of those are given as the first argument then a Bernoulli network is generated with the probability of ties defined by prob or coef.

Note that the first network is sampled after burnin steps, and any subsequent networks are sampled each interval steps after the first.

More information can be found by looking at the documentation of ergm().

Examples

Run this code

# \dontshow{
options(ergm.eval.loglik=FALSE)
# }
#
# Let's draw from a Bernoulli model with 16 nodes
# and density 0.5 (i.e., coef = c(0,0))
#
g.sim <- simulate(network(16) ~ edges + mutual, coef=c(0, 0))
#
# What are the statistics like?
#
summary(g.sim ~ edges + mutual)
#
# Now simulate a network with higher mutuality
#
g.sim <- simulate(network(16) ~ edges + mutual, coef=c(0,2))
#
# How do the statistics look?
#
summary(g.sim ~ edges + mutual)
#
# Let's draw from a Bernoulli model with 16 nodes
# and tie probability 0.1
#
g.use <- network(16,density=0.1,directed=FALSE)
#
# Starting from this network let's draw 3 realizations
# of a edges and 2-star network
#
g.sim <- simulate(~edges+kstar(2), nsim=3, coef=c(-1.8,0.03),
               basis=g.use, control=control.simulate(
                 MCMC.burnin=1000,
                 MCMC.interval=100))
g.sim
summary(g.sim)
#
# attach the Florentine Marriage data
#
data(florentine)
#
# fit an edges and 2-star model using the ergm function
#
gest <- ergm(flomarriage ~ edges + kstar(2))
summary(gest)
#
# Draw from the fitted model (statistics only), and observe the number
# of triangles as well.
#
g.sim <- simulate(gest, nsim=10, 
            monitor=~triangles, output="stats",
            control=control.simulate.ergm(MCMC.burnin=1000, MCMC.interval=100))
g.sim

# Custom output: store the edgecount (computed in R), iteration index, and chain index.
output.f <- function(x, iter, chain, ...){
  list(nedges = network.edgecount(as.network(x)),
       chain = chain, iter = iter)
}
g.sim <- simulate(gest, nsim=3,
            output=output.f, simplify=FALSE,
            control=control.simulate.ergm(MCMC.burnin=1000, MCMC.interval=100))
unclass(g.sim)

Run the code above in your browser using DataLab