read.paj: Read a Pajek Project or Network File and Convert to an R 'Network' Object

Description

Return a (list of) network object(s) after reading a corresponding .net or .paj file. The code accepts ragged array edgelists, but cannot currently handle 2-mode networks, multirelational networks (e.g. KEDS), or networks with entries for both edges and arcs (e.g. GD-a99m). See network, statnet, or sna for more information.

Usage

read.paj(
  file,
  verbose = FALSE,
  debug = FALSE,
  edge.name = NULL,
  simplify = FALSE,
  time.format = c("pajekTiming", "networkDynamic")
)

Value

The structure of the object returned by read.paj depends on the contents of the file it parses.

If the input file describes a single 'network' object (i.e. a .net input file), a single network object is returned, with attribute data set appropriately where possible. For .paj input, a list of networks is returned.
If the input file contains multiple sets of relations for a single network, a list of network objects ('network.series') is returned, apparently along with a formula object.
If the input .paj file contains additional information (such as partition information) or multiple *Network definitions, a two-element list is returned. The first element is a list of all the network objects created, and the second is a list of partitions, etc. How the networks and partitions are matched up is not documented.
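
Because the return shape varies with the input, calling code may want to normalize it before use. The following is a minimal sketch in base R, not part of the network package: the helper name as_network_list is hypothetical, and it assumes the two-element list carries its networks in an element named networks (as the $networks output noted in the Examples suggests). The stand-in objects only carry the class attribute so that the sketch runs without reading a file.

```r
# Hypothetical helper: always return a plain list of network objects,
# whatever shape read.paj() produced.
as_network_list <- function(x) {
  if (inherits(x, "network")) {
    return(list(x))                  # single network (.net input)
  }
  if (is.list(x) && !is.null(x[["networks"]])) {
    return(x[["networks"]])          # list with networks and partitions
  }
  x                                  # assumed to be a list of networks already
}

# Stand-in objects carrying only the class attribute, for illustration
fake_net <- structure(list(), class = "network")
single   <- as_network_list(fake_net)
project  <- as_network_list(list(networks   = list(fake_net, fake_net),
                                 partitions = list()))
length(single)
length(project)
```

With real data, the argument would be the value returned by read.paj rather than these stand-ins.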

Arguments

file: the name of the file whence the data are to be read. If it does not contain an absolute path, the file name is relative to the current working directory (as returned by getwd). file can also be a complete URL.
verbose: logical: Should longer descriptions of the reading and coercion process be printed out?
debug: logical: Should very detailed descriptions of the reading and coercion process be printed out? This is typically used to debug the reading of files that are corrupted on coercion.
edge.name: optional name for the edge variable read from the file. The default is to use the value in the project file if found.
simplify: Should the returned network be simplified as much as possible and saved? The value specifies the name of the file in which the data are to be stored. If it does not contain an absolute path, the file name is relative to the current working directory (see getwd). If simplify is TRUE, the file name is taken from file.
time.format: if the network has timing information attached to edges/vertices, how should it be processed? 'pajekTiming' will attach the timing information unchanged in an attribute named pajek.timing. 'networkDynamic' will translate it to a spell matrix format, attach it as an 'activity' attribute and add the class 'networkDynamic', formatting it for use by the networkDynamic package.

Author

Dave Schruth dschruth@u.washington.edu, Mark S. Handcock handcock@stat.washington.edu (with additional input from Alex Montgomery ahm@reed.edu), Skye Bender-deMoll skyebend@uw.edu

Details

If the *Vertices block includes the optional graphic attributes (coordinates, shape, size, etc.), they will be read and attached to the network as vertex attributes, but the values will not be interpreted (i.e. Pajek's color names will not be translated to R color names). Vertex attributes included in a *Vector block will be attached as vertex attributes.

Edge or arc weights in the *Arcs or *Edges block are included in the network as an attribute with the same name as the network. If no weight is included, a default weight of 1 is used. Optional graphic attributes or labels will be attached as edge attributes.

If the file contains an empty *Arcs block, an undirected network will be returned. Otherwise the network will be directed, with two edges (one in each direction) added for every row in the *Edges block.
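
One way to picture the directed case is that each undirected row is duplicated with its endpoints swapped. The base R sketch below only illustrates that idea on a small made-up edge matrix; the column names tail and head are assumptions, and this is not the code read.paj actually runs.

```r
# Made-up *Edges rows: vertex 1 -- vertex 2, vertex 2 -- vertex 3
edges <- matrix(c(1, 2,
                  2, 3),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("tail", "head")))

# Represent every undirected edge as two arcs, one in each direction,
# by appending the same rows with the two columns swapped
arcs <- rbind(edges, edges[, c("head", "tail")])
print(arcs)
```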

If the *Vertices, *Arcs or *Edges blocks have timing information included in their rows (indicated by ... tokens), it will be attached to the vertices with behavior determined by the time.format option. If the 'networkDynamic' format is used, times will be translated to networkDynamic's spell model under the assumption that the original Pajek representation indicated discrete time chunks. For example, "[5-10]" will become the spell [5,11], "[2-*]" will become [2,Inf] and "[7]" will become [7,8]. See the networkDynamic documentation at ?activity.attribute for details.
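
The discrete-to-spell translation in these examples can be sketched in base R as follows. The helper pajek_interval_to_spell is hypothetical and is not the parser used by read.paj; it assumes a single bracketed interval with non-negative times and does not handle comma-separated lists of intervals.

```r
# Hypothetical helper: translate one Pajek-style interval string into a
# c(onset, terminus) spell, treating times as discrete chunks.
pajek_interval_to_spell <- function(interval) {
  # Strip the surrounding square brackets and any outer whitespace
  body  <- gsub("^\\s*\\[|\\]\\s*$", "", interval)
  parts <- strsplit(body, "-", fixed = TRUE)[[1]]
  onset <- as.numeric(parts[1])
  if (length(parts) == 1) {
    # A single time point such as "[7]" covers one chunk: [7, 8]
    terminus <- onset + 1
  } else if (parts[2] == "*") {
    # "*" marks an open-ended interval: the spell never terminates
    terminus <- Inf
  } else {
    # A closed range such as "[5-10]" ends after chunk 10: [5, 11]
    terminus <- as.numeric(parts[2]) + 1
  }
  c(onset = onset, terminus = terminus)
}

# One row per interval, with onset and terminus columns
spells <- t(sapply(c("[5-10]", "[2-*]", "[7]"), pajek_interval_to_spell))
print(spells)
```

Adding 1 to the closing time reflects the assumption that a discrete chunk labelled t occupies the span from t up to t + 1.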

The *Arcslist, *Edgelist and *Events blocks are not yet supported.

As there is no known single complete specification for the file format, parsing behavior has been inferred from the references and examples below.

References

Batagelj, Vladimir and Mrvar, Andrej (2011) Pajek Reference Manual version 2.05 http://web.archive.org/web/20240906013709/http://vlado.fmf.uni-lj.si/pub/networks/pajek/doc/pajekman.pdf Section 5.3 pp 73-79

Batagelj, Vladimir (2008) "Network Analysis Description of Networks" http://web.archive.org/web/20240511173536/http://vlado.fmf.uni-lj.si/pub/networks/doc/ECPR/08/ECPR01.pdf

Pajek Datasets http://web.archive.org/web/20240411203537/http://vlado.fmf.uni-lj.si/pub/networks/data/esna

Examples

if (FALSE) {
library(network)

# arrange the plots in a 2 x 2 grid
par(mfrow = c(2, 2))

# read a single network from a .net file and plot it with its title
test.net.1 <- read.paj("http://vlado.fmf.uni-lj.si/pub/networks/data/GD/gd98/A98.net")
plot(test.net.1, main = test.net.1 %n% 'title')

test.net.2 <- read.paj("http://vlado.fmf.uni-lj.si/pub/networks/data/mix/USAir97.net")
# plot using the vertex coordinates stored in the file
plot(test.net.2, main = test.net.2 %n% 'title',
     coord = cbind(test.net.2 %v% 'x',
                   test.net.2 %v% 'y'),
     jitter = FALSE)

# read a .paj project file
# notice that the output has $networks and $partitions
read.paj('http://vlado.fmf.uni-lj.si/vlado/podstat/AO/net/Tina.paj')
}
