RNewsflow (version 1.2.6)

create_document_network: Create a document similarity network

Description

Combines document similarity data (`d`) with document metadata (`meta`) into an igraph network/graph.

Usage

create_document_network(
  d,
  meta,
  id_var = "document_id",
  date_var = "date",
  min_similarity = NA
)

Value

A network/graph in the igraph class

Arguments

d

A data.frame with three columns that represents an edgelist with weight values. The first and second columns contain the names/ids of the 'from' and 'to' documents/vertices. The third column contains the similarity score. Column names are ignored.

meta

A data.frame where rows are documents and columns are document meta information. Should contain at least 2 columns: the document name/id and the date. The name/id column should match the document names/ids of the edgelist, and its label is specified in the `id_var` argument. The date column should be interpretable with as.POSIXct, and its label is specified in the `date_var` argument.
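A quick sanity check before building the network can catch ids that are missing from `meta` or dates that cannot be parsed. The following is a minimal sketch in base R; the helper name `check_network_input` is hypothetical and not part of RNewsflow.

```r
# Hypothetical helper (not part of RNewsflow): checks that `d` and `meta`
# have the shape described above.
check_network_input <- function(d, meta, id_var = "document_id", date_var = "date") {
  stopifnot(is.data.frame(d), ncol(d) == 3)
  stopifnot(id_var %in% colnames(meta), date_var %in% colnames(meta))

  # Every id used in the edgelist should appear in the meta id column.
  edge_ids <- unique(c(as.character(d[[1]]), as.character(d[[2]])))
  missing_ids <- setdiff(edge_ids, as.character(meta[[id_var]]))

  # The date column should be convertible with as.POSIXct.
  dates_ok <- !inherits(try(as.POSIXct(meta[[date_var]]), silent = TRUE), "try-error")

  list(missing_ids = missing_ids, dates_ok = dates_ok)
}

d <- data.frame(x = c(1, 1, 2), y = c(2, 3, 9), v = c(0.3, 0.4, 0.5))
meta <- data.frame(document_id = 1:4,
                   date = as.POSIXct("2010-01-01 12:00:00") + 3600 * 0:3)

res <- check_network_input(d, meta)
res$missing_ids  # ids that occur in the edgelist but not in meta
res$dates_ok     # whether the date column could be converted
```

In this toy input, document 9 appears in the edgelist but has no row in `meta`, so it is reported in `missing_ids`.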

id_var

The label for the document name/id column in the `meta` data.frame. Default is "document_id".

date_var

The label for the document date column in the `meta` data.frame. Default is "date".

min_similarity

For convenience, ignore all edges where the weight is below `min_similarity`. Default is NA.
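In effect, this is comparable to dropping low-weight rows from the edgelist yourself before building the network. A minimal base R sketch of that filtering, assuming the third column holds the similarity score and that edges with a weight exactly equal to the threshold are kept (which matches the "below" wording, but is not confirmed against the package source):

```r
d <- data.frame(x = c(1, 1, 1, 2, 2, 3),
                y = c(2, 3, 5, 4, 5, 6),
                v = c(0.3, 0.4, 0.7, 0.5, 0.2, 0.9))

min_similarity <- 0.4

# Keep only edges whose weight (third column) is at least the threshold.
d_filtered <- d[d[[3]] >= min_similarity, ]
nrow(d_filtered)
```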

Details

This function is mainly offered to mimic the output of the as_document_network function when using imported document similarity data. This way the functions for transforming, aggregating and visualizing the document similarity data can be used.
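As an illustration of the imported-data use case, the sketch below reads similarity scores from CSV text, which stands in for a file exported by another tool, and passes them to `create_document_network` only if RNewsflow is installed. The column names `source`, `target` and `cosine` are made up for the example.

```r
# CSV text stands in for a file exported by another tool (hypothetical columns).
csv_text <- "source,target,cosine
1,2,0.3
1,3,0.4
2,4,0.5"
d <- read.csv(text = csv_text)

meta <- data.frame(document_id = 1:4,
                   date = seq.POSIXt(from = as.POSIXct("2010-01-01 12:00:00"),
                                     by = "hour", length.out = 4))

# Column names of `d` are ignored; only the column order matters.
if (requireNamespace("RNewsflow", quietly = TRUE)) {
  g <- RNewsflow::create_document_network(d, meta)
}
```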

Examples

d = data.frame(x = c(1,1,1,2,2,3),
               y = c(2,3,5,4,5,6),
               v = c(0.3,0.4,0.7,0.5,0.2,0.9))

meta = data.frame(document_id = 1:8,
                  date = seq.POSIXt(from = as.POSIXct('2010-01-01 12:00:00'), 
                         by='hour', length.out = 8),
                  medium = c(rep('Newspapers', 4), rep('Blog', 4)))

g = create_document_network(d, meta)

igraph::as_data_frame(g, 'both')
igraph::plot.igraph(g)