hdist: Find the Hamming Distances Between Two or More Graphs

Description

hdist returns the Hamming distance between the labeled graphs g1 and g2 in set dat for dichotomous data, or else the absolute (manhattan) distance. If normalize is true, this distance is divided by its dichotomous theoretical maximum (conditional on |V(G)|).

Usage

hdist(dat, dat2=NULL, g1=NULL, g2=NULL, normalize=FALSE, 
    diag=FALSE, mode="digraph")

Arguments

dat

a stack of input graphs.

dat2

a second graph stack (optional).

a vector indicating which graphs to compare (by default, all elements of dat).

a vector indicating against which the graphs of g1 should be compared (by default, all graphs).

normalize

divide by the number of available dyads?

diag

boolean indicating whether or not the diagonal should be treated as valid data. Set this true if and only if the data can contain loops. diag is FALSE by default.

mode

string indicating the type of graph being evaluated. "digraph" indicates that edges should be interpreted as directed; "graph" indicates that edges are undirected. mode is set to "digraph" by default.

Value

A matrix of Hamming distances

Details

The Hamming distance between two labeled graphs \(G_1\) and \(G_2\) is equal to \(|\{e : (e \in E(G_1), e \not\in E(G_2)) \wedge (e \not\in E(G_1), e \in E(G_2))\}|\). In more prosaic terms, this may be thought of as the number of addition/deletion operations required to turn the edge set of \(G_1\) into that of \(G_2\). The Hamming distance is a highly general measure of structural similarity, and forms a metric on the space of graphs (simple or directed). Users should be reminded, however, that the Hamming distance is extremely sensitive to nodal labeling, and should not be employed directly when nodes are interchangeable. The structural distance (Butts and Carley (2001)), implemented in structdist, provides a natural generalization of the Hamming distance to the more general case of unlabeled graphs.

Null hypothesis testing for Hamming distances is available via cugtest, and qaptest; graphs which minimize the Hamming distances to all members of a graph set can be found by centralgraph. For an alternative means of comparing the similarity of graphs, consider gcor.

References

Banks, D., and Carley, K.M. (1994). “Metric Inference for Social Networks.” Journal of Classification, 11(1), 121-49.

Butts, C.T. and Carley, K.M. (2005). “Some Simple Algorithms for Structural Comparison.” Computational and Mathematical Organization Theory, 11(4), 291-305.

Butts, C.T., and Carley, K.M. (2001). “Multivariate Methods for Interstructural Analysis.” CASOS Working Paper, Carnegie Mellon University.

Hamming, R.W. (1950). “Error Detecting and Error Correcting Codes.” Bell System Technical Journal, 29, 147-160.

Examples

Run this code

# NOT RUN {
#Get some random graphs
g<-rgraph(5,5,tprob=runif(5,0,1))

#Find the Hamming distances
hdist(g)
# }

Run the code above in your browser using DataLab