ergmm: Fit a Latent Space Random Graph Model

Description

ergmm() is used to fit latent space and latent space cluster random network models, as described by Hoff, Raftery and Handcock (2002), Handcock, Raftery and Tantrum (2007), and Krivitsky, Handcock, Raftery, and Hoff (2009). ergmm() can return either a Bayesian model fit or the two-stage MLE.

Usage

ergmm(
  formula,
  response = NULL,
  family = "Bernoulli",
  fam.par = NULL,
  control = control.ergmm(),
  user.start = list(),
  prior = ergmm.prior(),
  tofit = c("mcmc", "mkl", "mkl.mbc", "procrustes", "klswitch"),
  Z.ref = NULL,
  Z.K.ref = NULL,
  seed = NULL,
  verbose = FALSE
)

Value

ergmm returns an object of class ergmm containing the information about the posterior.

Arguments

formula

A formula object, of the form g ~ <term 1> + <term 2> ..., where g is a network object or a matrix that can be coerced to a network object, and <term 1>, <term 2>, etc., are each terms for the model. See ergmTerm for the terms that can be fitted, though note the section on fixed effects below. To create a network object in R, use the network function, then add nodal attributes to it using set.vertex.attribute if necessary.

Note that, as in lm(), the model will include an intercept term by default. This behavior can be overridden by including a -1 or +0 term in the formula, and a 1(mean=...,var=...) term can be used to give the intercept a prior different from the default.
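A minimal sketch of the three intercept variants described above, using the samplike network from the sampson data set that also appears in the Examples section. The prior mean and variance (0 and 4) are arbitrary illustrative values, not recommendations. The formulas are only constructed here, not fitted; any of them could be passed to ergmm().

```r
library(latentnet)
data(sampson)

# Default: the intercept is included implicitly.
f.default <- samplike ~ euclidean(d = 2)

# Drop the intercept with -1 (or, equivalently, +0).
f.noint <- samplike ~ euclidean(d = 2) - 1

# Keep the intercept but set its normal prior explicitly.
# mean = 0 and var = 4 are illustrative assumptions.
f.prior <- samplike ~ euclidean(d = 2) + 1(mean = 0, var = 4)
```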

response

An optional edge attribute that serves as the response variable. By default, presence (1) or absence (0) of an edge in g is used.

family

A character vector specifying the conditional distribution of each edge value. See families.ergmm for the currently implemented families.

fam.par

For those families that require additional parameters, a list.
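The response, family, and fam.par arguments are typically used together. The sketch below assumes the tribes data set shipped with latentnet, an edge attribute named "sign.012" taking values 0, 1, or 2, a family named "binomial.logit", and a fam.par element named trials. These names are recalled from the package documentation rather than stated on this page, so check families.ergmm and the tribes help page before relying on them.

```r
library(latentnet)
data(tribes)

# Model a valued edge attribute instead of edge presence/absence.
# Assumed: "sign.012" is an edge attribute of tribes with values in 0..2,
# so a binomial family with 2 trials is a plausible choice.
tribes.fit <- ergmm(
  tribes ~ euclidean(d = 2),
  response = "sign.012",
  family   = "binomial.logit",
  fam.par  = list(trials = 2)
)
```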

control

The MCMC parameters that do not affect the posterior distribution, such as the sample size, the proposal variances, and tuning parameters, in the form of a named list. See control.ergmm for more information and defaults.
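As a hedged illustration, the control list is normally built with control.ergmm() rather than by hand. The sketch below uses the sample.size, burnin, and interval arguments of control.ergmm; the specific values are arbitrary and chosen only to keep the run short, not to guarantee convergence.

```r
library(latentnet)
data(sampson)

# Shorter chain than the defaults; values are illustrative only.
ctrl <- control.ergmm(
  sample.size = 2000,  # number of MCMC draws to keep
  burnin      = 5000,  # iterations discarded before sampling
  interval    = 5      # thinning interval between kept draws
)

samp.fit.short <- ergmm(samplike ~ euclidean(d = 2), control = ctrl)
```

Because these settings only affect the sampler, not the posterior, changing them should alter the precision and run time of the fit rather than the model being fitted.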

user.start

An optional initial configuration of parameters for MCMC in the form of a list. By default, the posterior mode conditioned on cluster assignments is used. It is permitted to supply only some of the parameters of a configuration. If this is done, the remaining parameters are fitted conditional on those supplied.

prior

The prior parameters for the model being fitted in the form of a named list. See term help for the terms to use. If given, will override those given in the formula terms, making it useful as a convenient way to store and reproduce a prior distribution. The list of prior parameters can also be extracted from an ERGMM fit object. See ergmm.prior for more information.

tofit

A character vector listing some subset of "pmode", "mcmc", "mkl", "mkl.mbc", "mle", "procrustes", and "klswitch", instructing ergmm what should be returned as a part of the ERGMM fit object. The default is the vector shown under Usage. Omitting an element can be used to skip particular steps in the fitting process. If the requested procedure or output depends on some other procedure or output not explicitly listed, the dependency will be resolved automatically.
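For instance, requesting only "mle" is one way to ask for the two-stage MLE mentioned in the Description without the MCMC-based outputs. This is a sketch under the assumption that no MCMC run is triggered as a dependency of "mle".

```r
library(latentnet)
data(sampson)

# Ask only for the maximum likelihood estimate; MCMC-based outputs
# ("mcmc", "mkl", ...) are not requested.
samp.mle <- ergmm(samplike ~ euclidean(d = 2), tofit = "mle")
```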

Z.ref

If given, used as a reference for Procrustes analysis.

Z.K.ref

If given, used as a reference for label-switching.

seed

If supplied, random number seed.

verbose

If this is TRUE (or 1), causes information to be printed out about the progress of the fitting, particularly initial value generation. Higher values lead to greater verbosity.

Specifying fixed effects

Each coefficient for a fixed effect covariate has a normal prior whose mean and variance are set by the mean and var parameters of the term. For those formula terms that add more than one covariate, a vector can be given for mean and var. If a vector is shorter than the number of covariates, it will be repeated until the needed length is reached.

ergmm can use model terms implemented for the ergm package and via the ergm.userterms API (in GitHub repository statnet/ergm.userterms). See ergmTerm for a list of available terms. If you wish to specify the prior mean and variance, you can add them to the call. E.g.,
TERMNAME(..., mean=0, var=9),
where ... are the arguments for the ergm term, will initialize TERMNAME with prior mean of 0 and prior variance of 9.
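A concrete instance of the TERMNAME(..., mean=0, var=9) pattern, as a sketch: it assumes that samplike carries a vertex attribute named "group" and uses the ergm term nodematch as the fixed effect. The prior values are the ones from the text above.

```r
library(latentnet)
data(sampson)

# Fixed effect for ties within the same group, with a N(0, 9) prior
# on its coefficient. "group" is assumed to be a vertex attribute
# of samplike.
samp.fit.fe <- ergmm(
  samplike ~ euclidean(d = 2) + nodematch("group", mean = 0, var = 9)
)
```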

Some caveats:

ergm has a binary and a valued mode. Regardless of the family used, the binary variant of the ergm term will be used in the linear predictor of the model.
ergm does not support modeling self-loops, so terms imported in this way will always have predictor x[i,i]==0. This should not affect most situations, but if you absolutely must model self-loops and non-self-edges in one term, use the deprecated terms below.
latentnet only fits models with dyadic independence. Terms that induce dyadic dependence (e.g., triangles) can be used, but then the likelihood of the model will, effectively, be replaced with pseudolikelihood. (Note that under dyadic independence, the two are equal.)

References

Mark S. Handcock, Adrian E. Raftery and Jeremy Tantrum (2007). Model-Based Clustering for Social Networks. Journal of the Royal Statistical Society: Series A, 170(2), 301-354.

Peter D. Hoff, Adrian E. Raftery and Mark S. Handcock (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460), 1090-1098.

Pavel N. Krivitsky, Mark S. Handcock, Adrian E. Raftery, and Peter D. Hoff (2009). Representing degree distributions, clustering, and homophily in social networks with latent cluster random effects models. Social Networks, 31(3), 204-213.

Pavel N. Krivitsky and Mark S. Handcock (2008). Fitting Position Latent Cluster Models for Social Networks with latentnet. Journal of Statistical Software, 24(5). doi:10.18637/jss.v024.i05

Examples

library(latentnet)
#
# List the data sets shipped with latentnet
#
data(package="latentnet")
#
# Using Sampson's Monk data, let's fit a
# simple latent position model
#
data(sampson)
samp.fit <- ergmm(samplike ~ euclidean(d=2))
#
# See if we have convergence in the MCMC
#
mcmc.diagnostics(samp.fit)
#
# Plot the fit
#
plot(samp.fit)
#
# Using Sampson's Monk data, let's fit a latent clustering random effects model
# (3 clusters in 2 dimensions, plus receiver random effects)
#
samp.fit2 <- ergmm(samplike ~ euclidean(d=2, G=3)+rreceiver)
#
# See if we have convergence in the MCMC
#
mcmc.diagnostics(samp.fit2)
#
# Plot the fit, drawing each node as a pie chart
#
plot(samp.fit2, pie=TRUE)
