dna.network: Get network data

Description

Convert a .dna file into a sociomatrix and import it into R.

Usage

dna.network(infile, algorithm = "cooccurrence", agreement = "combined",
    start.date = "01.01.1900", stop.date = "31.12.2099",
    two.mode.type = "oc", one.mode.type = "organizations",
    via = "categories", ignore.duplicates = TRUE,
    include.isolates = FALSE, normalization = FALSE,
    window.size = 100, step.size = 1, alpha = 100, lambda = 0.1,
    ignore.agreement = FALSE, exclude.persons = c(""),
    exclude.organizations = c(""), exclude.categories = c(""),
    invert.persons = FALSE, invert.organizations = FALSE,
    invert.categories = FALSE, verbose = TRUE)

Arguments

infile

The input .dna file as a string (i.e., enclosed in quotation marks). If the file is not in the current working directory, specify the path together with the file name. Include the file suffix. Example: sample.dna.

algorithm

The algorithm which should be used to create the network. Refer to the DNA manual at http://www.philipleifeld.de for details. Possible values are: affiliation (for a two-mode network of actors and concepts), cooccurrence (for an actor or concept co-occurrence/one-mode network), timewindow (for the time window algorithm) and attenuation (for the attenuation algorithm).

agreement

The agreement pattern to be used. Must be one of the following: yes, no, combined or conflict.

start.date

Only statements after this date will be retained. The start date is a character string of the form dd.mm.yyyy, where dd is the two-digit day, mm the two-digit month and yyyy the four-digit year.

stop.date

Only statements before this date will be retained. The stop date is a character string of the form dd.mm.yyyy, where dd is the two-digit day, mm the two-digit month and yyyy the four-digit year.
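
The dd.mm.yyyy strings expected by start.date and stop.date can be built from R Date objects with base R. The following is a minimal sketch; the variable names first.day and last.day are made up for illustration and are not part of the package.

```r
# Hypothetical study period, stored as Date objects.
first.day <- as.Date("2005-01-01")
last.day  <- as.Date("2005-12-31")

# Format the dates as dd.mm.yyyy character strings.
start.date <- format(first.day, "%d.%m.%Y")
stop.date  <- format(last.day, "%d.%m.%Y")
print(c(start.date, stop.date))

# Parsing a string back with the same format is a quick validity check:
# an impossible date such as 31.02.2005 yields NA.
valid   <- !is.na(as.Date("31.12.2099", format = "%d.%m.%Y"))
invalid <- is.na(as.Date("31.02.2005", format = "%d.%m.%Y"))
```

The resulting strings could then be passed on, for example as dna.network("sample.dna", start.date = start.date, stop.date = stop.date).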

two.mode.type

If the affiliation algorithm is selected, this argument determines the vertex classes to be used. The following values are possible: oc (which stands for organizations x categories), pc (persons x categories) and po (persons x organizations).

one.mode.type

If the cooccurrence algorithm, the timewindow algorithm or the attenuation algorithm is selected, this argument specifies the vertex class to be used. The following values are possible: organizations (which stands for organizations x organizations), persons (persons x persons) or categories (categories x categories).

via

If the one.mode.type argument is active (i.e., the cooccurrence, timewindow or attenuation algorithm is used), this argument specifies via which variable a co-occurrence network is created. For example, if an organizations x organizations network is created, organizations can either be connected via their shared persons or categories. Valid values are thus organizations, persons and categories, but not the vertex type used in the one.mode.type argument.
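
To illustrate what "connected via" means, the sketch below derives one-mode co-occurrence matrices from a toy two-mode matrix by matrix multiplication. It shows the general idea only; the data are invented, and the exact computation performed by DNA (for example, how agreement patterns enter the edge weights) may differ.

```r
# Toy organizations x categories matrix (hypothetical data):
# a 1 means the organization referred to the category.
oc <- matrix(c(1, 1, 0,
               1, 0, 1,
               0, 0, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("org A", "org B", "org C"),
                             c("cat 1", "cat 2", "cat 3")))

# organizations x organizations, connected via shared categories:
# each cell counts the categories two organizations have in common.
oo <- oc %*% t(oc)
diag(oo) <- 0

# categories x categories, connected via shared organizations.
cc <- t(oc) %*% oc
diag(cc) <- 0

print(oo)
print(cc)
```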

ignore.duplicates

A logical value indicating whether duplicate statements, i.e., statements with the same actor, category, agreement pattern and date, should be counted only once during network creation. If ignore.duplicates is switched off (i.e., set to FALSE), each duplicate is counted separately. For example, if a speaker reiterates the same concept in the same way over and over again in the same article, each of these statements then increases the edge weight between this speaker and other speakers using the same argument.
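
The effect of ignoring duplicates can be pictured with a small data frame of statements. This is a sketch with invented data and column names; it is not how DNA stores or processes statements internally.

```r
# Hypothetical statements; the first two rows are duplicates.
statements <- data.frame(
  actor     = c("A", "A", "B"),
  category  = c("c1", "c1", "c1"),
  agreement = c("yes", "yes", "yes"),
  date      = c("01.01.2005", "01.01.2005", "01.01.2005"),
  stringsAsFactors = FALSE
)

# Keep only the first occurrence of each identical statement.
deduplicated <- statements[!duplicated(statements), ]
print(nrow(statements))
print(nrow(deduplicated))
```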

include.isolates

If several time slices are exported, usually the network matrices will have different dimensions. If the include.isolates argument is set to TRUE, all actors - even if they are inactive in the current time slice - are included in the matrix. This guarantees that several time slices have the same dimensions and the same order of actors.
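
The sketch below shows why equal dimensions matter, by padding two toy time slices to a common actor set by hand. The helper pad_matrix is hypothetical and only mimics the outcome described above; with include.isolates = TRUE this alignment is done by dna.network itself.

```r
# Hypothetical helper: embed a square matrix into a larger zero matrix
# whose rows and columns cover the full set of actors.
pad_matrix <- function(m, actors) {
  out <- matrix(0, nrow = length(actors), ncol = length(actors),
                dimnames = list(actors, actors))
  out[rownames(m), colnames(m)] <- m
  out
}

# Two toy time slices with different active actors (invented data).
slice1 <- matrix(c(0, 2, 2, 0), nrow = 2,
                 dimnames = list(c("A", "B"), c("A", "B")))
slice2 <- matrix(c(0, 3, 3, 0), nrow = 2,
                 dimnames = list(c("B", "C"), c("B", "C")))

actors  <- sort(union(rownames(slice1), rownames(slice2)))
padded1 <- pad_matrix(slice1, actors)
padded2 <- pad_matrix(slice2, actors)

# Both matrices now have the same dimensions and actor order.
print(padded1)
print(padded2)
```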

normalization

Some actors make statements more frequently than others, and this behavior is caused by their institutional position. These actors are likely to be at the center of the network. If normalization is set to TRUE, DNA tries to correct for institutional positions by dividing edge weights by the average total number of statements of both actors involved in an edge. For more details, please refer to the DNA manual at http://www.philipleifeld.de.
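
One reading of this normalization, sketched with invented numbers: each edge weight is divided by the mean of the two actors' total statement counts. This is an illustration of the description above under that assumption, not the DNA source code; the manual remains the authoritative reference.

```r
# Toy co-occurrence weights between two organizations (invented data).
weights <- matrix(c(0, 4,
                    4, 0),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("org A", "org B"),
                                  c("org A", "org B")))

# Hypothetical total number of statements made by each organization.
statements <- c("org A" = 10, "org B" = 2)

# Average number of statements for every pair of actors.
pair.average <- outer(statements, statements, "+") / 2

# Normalized edge weights: raw weight divided by the pair average.
normalized <- weights / pair.average
print(round(normalized, 3))
```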

window.size

If the timewindow algorithm is used, the window.size argument controls the size of the time window. Integer values are possible. Recommended values are somewhere between 10 and 2000 days, depending on the theory and the dataset.

step.size

If the timewindow algorithm is used, the step.size argument controls the rate at which the time window moves, i.e., the number of days by which the window is moved at each step. Using 1 day is recommended. For non-overlapping time windows, use the same value as in the window.size argument.
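
The interplay of window.size and step.size can be sketched by listing the days on which successive windows would start. The helper window_starts is hypothetical, and anchoring each window at its first day is an assumption; the source does not state how DNA positions the window relative to each step.

```r
# Hypothetical helper: start days of successive time windows.
window_starts <- function(first.day, last.day, step.size) {
  seq(from = first.day, to = last.day, by = step.size)
}

first.day <- as.Date("01.01.2005", format = "%d.%m.%Y")
last.day  <- as.Date("31.12.2005", format = "%d.%m.%Y")
window.size <- 100

# step.size = 1: the window moves one day at a time (overlapping windows).
daily <- window_starts(first.day, last.day, step.size = 1)

# step.size = window.size: non-overlapping windows.
disjoint <- window_starts(first.day, last.day, step.size = window.size)

# Assumed last day covered by each non-overlapping window.
disjoint.ends <- disjoint + window.size - 1

print(length(daily))
print(length(disjoint))
```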

alpha

If the timewindow algorithm is used and normalization=TRUE, the alpha argument provides a constant by which edge values are multiplied. This is useful because normalized edge weights in the time window algorithm may become quite small.

lambda

If the attenuation algorithm is used, lambda provides the decay constant for the exponential decay function. The default value of 0.1 attributes relatively high weight to statements which are made within approximately five to ten days.
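
To get a feeling for the decay constant, the sketch below evaluates a plain exponential decay for a few time gaps. It assumes the common form exp(-lambda * t) with t measured in days; the source does not spell out the exact formula, so treat the numbers as an illustration only.

```r
# Assumed decay function: weight = exp(-lambda * days).
decay_weight <- function(days, lambda = 0.1) {
  exp(-lambda * days)
}

days <- c(0, 5, 10, 30)

# With the default lambda, statements five to ten days apart still
# carry a sizeable weight, while a gap of 30 days contributes little.
default.weights <- decay_weight(days, lambda = 0.1)

# A larger lambda makes the weight drop faster.
fast.weights <- decay_weight(days, lambda = 0.5)

print(round(default.weights, 3))
print(round(fast.weights, 3))
```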

ignore.agreement

This argument is only used if algorithm="attenuation" is set. When using the attenuation algorithm, ignore.agreement=TRUE specifies that the agreement variable should be ignored completely. For example, if the initial statement is positive and another actor uses the same concept in a negative way, an edge is established nevertheless, even if agreement="combined" is set (which would normally distinguish between positive and negative relations and add them up). The partial edge value is subject to the exponential decay function using the constant lambda.

exclude.persons

Specify a character vector of persons to be excluded from the network. For example, c("person 1", "person 2"). Note that the names must appear exactly as they are used in the dataset.

exclude.organizations

Specify a character vector of organizations to be excluded from the network. For example, c("organization 1", "organization 2"). Note that the names must appear exactly as they are used in the dataset.

exclude.categories

Specify a character vector of categories to be excluded from the network. For example, c("category 1", "category 2"). Note that the category names must appear exactly as they are used in the dataset.

invert.persons

Reverse the selection of persons. If TRUE, the persons specified in the exclude.persons argument will be included, not excluded. No other persons will be included.

invert.organizations

Reverse the selection of organizations. If TRUE, the organizations specified in the exclude.organizations argument will be included, not excluded. No other organizations will be included.

invert.categories

Reverse the selection of categories. If TRUE, the categories specified in the exclude.categories argument will be included, not excluded. No other categories will be included.
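
The exclude and invert arguments work in pairs. The following sketch mimics the selection logic described above on a plain character vector; select_names is a hypothetical helper written for illustration and is not a function of the package.

```r
# Hypothetical helper: which names remain in the network?
select_names <- function(all.names, exclude = c(""), invert = FALSE) {
  listed <- all.names %in% exclude
  if (invert) {
    all.names[listed]    # keep only the listed names
  } else {
    all.names[!listed]   # drop the listed names
  }
}

categories <- c("category 1", "category 2", "category 3")

# Default behavior: the listed category is excluded.
kept <- select_names(categories, exclude = "category 1")

# Inverted selection: only the listed category is included.
only <- select_names(categories, exclude = "category 1", invert = TRUE)

print(kept)
print(only)
```

Because matching is exact, a name that is misspelled or differs in case selects nothing.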

verbose

If TRUE, details about the data import and its progress will be printed. If FALSE, this information will be suppressed.

Details

Specify an input .dna file, specify options for generating a network, and transfer the network as a matrix object into R.

Examples

# download files and initialize DNA:
download.file("http://www.philipleifeld.de/cms/upload/Downloads/dna-1.31.jar",
    destfile = "dna-1.31.jar", mode = "wb")
download.file("http://www.philipleifeld.de/cms/upload/Downloads/sample.dna", 
    destfile = "sample.dna", mode = "wb")
dna.init("dna-1.31.jar")

## Not run: 
# # plot a congruence network using the statnet package:
# congruence <- dna.network("sample.dna", exclude.categories = 
#   "There should be legislation to regulate emissions.")
# library("network")
# congruence.nw <- network(congruence)
# plot(congruence.nw, displaylabels = TRUE, label.cex = 0.6, pad = 0.8)
# ## End(Not run)

# do a hierarchical cluster analysis with an affiliation network:
affiliation.yes <- dna.network("sample.dna", algorithm = "affiliation", 
    agreement = "yes", include.isolates = TRUE)
affiliation.no <- dna.network("sample.dna", algorithm = "affiliation", 
    agreement = "no", include.isolates = TRUE)
affiliation <- cbind(affiliation.yes, affiliation.no)
affiliation <- affiliation[rowSums(affiliation) > 0, ]  # remove isolates
distances <- dist(affiliation, method = "binary")
clustering <- hclust(distances)
plot(clustering)
