print.nexusDatasetInfo: Query and download from the Nexus network repository

Description

The Nexus network repository is an online collection of network data sets. These functions can be used to query it and download data from it, directly as an igraph graph.

Usage

# S3 method for nexusDatasetInfo
print(x, ...)
# S3 method for nexusDatasetInfoList
summary(object, ...)
# S3 method for nexusDatasetInfoList
print(x, ...)
nexus_list(
  tags = NULL,
  offset = 0,
  limit = 10,
  operator = c("or", "and"),
  order = c("date", "name", "popularity"),
  nexus.url = igraph_opt("nexus.url")
)
nexus_info(id, nexus.url = igraph_opt("nexus.url"))
nexus_get(
  id,
  offset = 0,
  order = c("date", "name", "popularity"),
  nexus.url = igraph_opt("nexus.url")
)
nexus_search(
  q,
  offset = 0,
  limit = 10,
  order = c("date", "name", "popularity"),
  nexus.url = igraph_opt("nexus.url")
)
# S3 method for nexusDatasetInfoList
[(x, i)

Arguments

x, object

The nexusDatasetInfo or nexusDatasetInfoList object to print or summarize.

...

Currently ignored.

Value

nexus_list and nexus_search return a list of nexusDatasetInfo objects. The list also has these attributes:

size: The number of data sets returned by the query.
totalsize: The total number of data sets found for the query.
offset: The offset parameter of the query.
limit: The limit parameter of the query.
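
As a rough illustration of how these attributes could drive paging through a long result, the sketch below builds a mock result by hand, so no server is contacted. The entries, the numbers and the `next_offset` helper are assumptions made for illustration; they are not part of igraph.

```r
# Mock of a query result: a list carrying the four attributes described above.
# The entries and the numbers are made up for illustration.
res <- structure(
  list(list(id = 18, sid = "kaptail"),
       list(id = 17, sid = "condmatcollab2003")),
  class = "nexusDatasetInfoList",
  size = 2, totalsize = 18, offset = 0, limit = 2
)

# Hypothetical helper: offset of the next page, or NULL when nothing is left.
next_offset <- function(x) {
  nxt <- attr(x, "offset") + attr(x, "size")
  if (nxt < attr(x, "totalsize")) nxt else NULL
}

next_offset(res)
```

The value returned by the helper could then be passed as the offset argument of a follow-up query.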

nexus_info returns a single nexusDatasetInfo object.

nexus_get returns an igraph graph object, or a list of graph objects if the data set consists of multiple networks.

Examples

nexus_list(tags="weighted")
nexus_list(limit=3, order="name")
nexus_list(limit=3, order="name")[[1]]
nexus_info(2)
g <- nexus_get(2)
summary(g)

## Data sets related to 'US'
nexus_search("US")

## Search for data sets that have 'network' in their name
nexus_search("name:network")

## Any word can match
nexus_search("blog or US or karate")

Details

Nexus is an online repository of networks, with an API that allows programmatic queries against it, as well as programmatic data download.

The nexus_list and nexus_info functions query the online database. They both return nexusDatasetInfo objects. nexus_info returns more information than nexus_list.

nexus_search searches Nexus, and returns a list of data sets, as nexusDatasetInfo objects. See the Examples section for some search examples.

nexus_get downloads a data set from Nexus, based on its numeric id, or based on a Nexus search string. For search strings, only the first search hit is downloaded, but see also the offset argument. (If no data sets are found, then the function signals an error.)

The nexusDatasetInfo objects returned by nexus_list have the following fields:

id: The numeric id of the dataset.
sid: The character id of the dataset.
name: Character scalar, the name of the dataset.
vertices/edges: Character, the number of vertices and edges in the graph(s). Vertices and edges are separated by a slash, and if the data set consists of multiple networks, then they are separated by spaces.
tags: Character vector, the tags of the dataset. Directed graphs have the tag ‘directed’. Undirected graphs are tagged as ‘undirected’. Other common tags are: ‘weighted’, ‘bipartite’, ‘social network’, etc.
networks: The ids and names of the networks in the data set. The numeric and character id are separated by a slash, and multiple networks are separated by spaces.
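
The vertices/edges and networks fields share the same layout: pairs joined by a slash, with one pair per network, separated by spaces. A minimal sketch of splitting such a field into a table follows. The `split_pairs` helper is hypothetical, and the field values are made up in the documented layout rather than fetched from Nexus.

```r
# Hypothetical helper: split an "a/b c/d ..." field into a two-column data frame.
split_pairs <- function(field, names) {
  pairs <- strsplit(field, " ", fixed = TRUE)[[1]]   # one element per network
  parts <- strsplit(pairs, "/", fixed = TRUE)        # split each pair at the slash
  df <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
  setNames(df, names)
}

# Made-up field values for a data set with two networks.
ve <- split_pairs("39/109 39/223", c("vertices", "edges"))
ve[] <- lapply(ve, as.integer)   # the counts arrive as character strings

nets <- split_pairs("1/KAPFTI2 2/KAPFTS2", c("id", "name"))
```

The sketch assumes that network names contain neither spaces nor slashes.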

nexusDatasetInfo objects returned by nexus_info have the following additional fields:

date

Character scalar, e.g. ‘2011-01-09’, the date when the dataset was added to the database.

formats

Character vector, the data formats in which the data set is available. The various formats are separated by semicolons.

licence

Character scalar, the licence of the dataset.

licence url

Character scalar, the URL of the licence of the dataset. Please make sure you consult this before using a dataset.

summary

Character scalar, the short description of the dataset. This is usually a single sentence.

description

Character scalar, the full description of the dataset.

citation

Character scalar, the paper(s) describing the dataset. Please cite these papers if you use the dataset in your research; the licence of most datasets requires this.

attributes

A list of lists. Each list entry describes a graph, vertex or edge attribute and has the following entries:

type: Type of the attribute, either ‘graph’, ‘vertex’ or ‘edge’.
datatype: Data type of the attribute, currently either ‘numeric’ or ‘string’.
name: Character scalar, the name of the attribute.
description: Character scalar, the description of the attribute.
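
Because every entry has the same four components, the attributes field can be flattened into a table for easier inspection. The sketch below does this on a hand-made list. The attribute names and descriptions are invented for illustration, and `attr_table` is a hypothetical name.

```r
# Made-up 'attributes' field with the four documented entries per attribute.
attrs <- list(
  list(type = "vertex", datatype = "string", name = "name",
       description = "Vertex label"),
  list(type = "edge", datatype = "numeric", name = "weight",
       description = "Edge weight")
)

# One row per attribute: each inner list becomes a one-row data frame.
attr_table <- do.call(rbind, lapply(attrs, function(a) {
  as.data.frame(a, stringsAsFactors = FALSE)
}))

# Names of the edge attributes only.
attr_table$name[attr_table$type == "edge"]
```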

The results of the Nexus queries are printed to the screen in a concise format, similar to the format of igraph graphs. A data set list (typically the result of nexus_list and nexus_search) looks like this:

NEXUS 1-5/18 -- data set list
[1] kaptail.4         39/109-223   #18 Kapferer tailor shop
[2] condmatcollab2003 31163/120029 #17 Condensed matter collaborations+
[3] condmatcollab     16726/47594  #16 Condensed matter collaborations+
[4] powergrid         4941/6594    #15 Western US power grid
[5] celegansneural    297/2359     #14 C. Elegans neural network

Each line represents one data set and gives, in order: the character id of the data set (e.g. kaptail or powergrid); the number of vertices and the number of edges of its graph, given as intervals when the data set contains multiple graphs; the numeric id of the data set; and, filling the remaining space, the name of the data set.

Summary information about an individual Nexus data set is printed as

NEXUS B--- 39 109-223 #18 kaptail -- Kapferer tailor shop
+ tags: directed; social network; undirected
+ nets: 1/KAPFTI2; 2/KAPFTS2; 3/KAPFTI1; 4/KAPFTS1

This is very similar to the header used for printing igraph graphs, with some differences. The four characters after the word NEXUS give the most important properties of the graph(s). The first is ‘U’ for undirected graphs, ‘D’ for directed graphs, and ‘B’ if the data set contains both directed and undirected graphs. The second is ‘N’ for named graphs. The third is ‘W’ for weighted graphs, and the fourth is ‘B’ if the data set contains bipartite graphs. Next come the number of vertices and the number of edges; for data sets with multiple graphs, the smallest and the largest values are given. Then come the numeric id and the string id of the data set, and the first line ends with the name of the data set. The second line lists the data set tags, and the third line lists the networks that are included in the data set.
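
The four-character code can be decoded mechanically. The following is a minimal sketch, not igraph's own implementation: `decode_nexus_flags` is a hypothetical helper, and it assumes that a dash marks a property that does not apply, as the sample header ‘B---’ suggests.

```r
# Hypothetical decoder for the four-character code, e.g. "B---" or "UNW-".
decode_nexus_flags <- function(code) {
  ch <- strsplit(code, "")[[1]]          # split into single characters
  stopifnot(length(ch) == 4)
  list(
    direction = switch(ch[1],
                       U = "undirected",
                       D = "directed",
                       B = "both",
                       NA_character_),   # anything else is treated as unknown
    named     = ch[2] == "N",
    weighted  = ch[3] == "W",
    bipartite = ch[4] == "B"
  )
}

decode_nexus_flags("B---")
```

For the kaptail header shown above, this reports a data set with both directed and undirected graphs that is not flagged as named, weighted or bipartite.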

Detailed data set information is printed similarly, but it contains more fields.