polmineR (version 0.8.8)

subcorpus_bundle-class: Bundled subcorpora

Description

A subcorpus_bundle object combines a set of subcorpus objects in a list in the slot objects. The class inherits from the partition_bundle and the bundle classes. Typically, a subcorpus_bundle is generated by applying the split-method to a corpus or subcorpus.

Usage

# S4 method for subcorpus_bundle
show(object)

# S4 method for subcorpus_bundle
merge(x, name = "", verbose = FALSE)

# S4 method for subcorpus
merge(x, y, ...)

# S4 method for subcorpus
split(
  x,
  s_attribute,
  values,
  prefix = "",
  mc = getOption("polmineR.mc"),
  verbose = TRUE,
  progress = FALSE,
  type = get_type(x)
)

# S4 method for corpus
split(
  x,
  s_attribute,
  values = NULL,
  prefix = "",
  mc = getOption("polmineR.mc"),
  verbose = TRUE,
  progress = FALSE,
  type = get_type(x),
  xml = "flat"
)

# S4 method for subcorpus_bundle
split(
  x,
  s_attribute,
  prefix = "",
  progress = TRUE,
  mc = getOption("polmineR.mc")
)

Arguments

object

An object of class subcorpus_bundle.

x

A corpus, subcorpus, or subcorpus_bundle object.

name

The name of the new subcorpus object.

verbose

Logical, whether to provide progress information.

y

A subcorpus to be merged with x.

...

Further subcorpus objects to be merged with x and y.

s_attribute

The s-attribute to vary.

values

Either a character vector with values used for splitting, or a logical value: If TRUE, changes of s-attribute values will be the basis for generating subcorpora. If FALSE, a new subcorpus is generated for every struc of the s-attribute. If missing (default), TRUE/FALSE is assigned depending on whether s-attribute has values, or not.

prefix

A character vector that will be attached as a prefix to partition names.

mc

Logical, whether to use multicore parallelization.

progress

Logical, whether to show progress bar.

type

The type of partition to generate.

xml

A character value indicating the XML structure of the corpus; defaults to "flat".

Details

Applying the split-method to a subcorpus_bundle object iterates through the bundle and applies split to each subcorpus object in it, splitting each one by the s-attribute given in the argument s_attribute. The return value is a subcorpus_bundle whose names are the names of the incoming bundle concatenated with the s-attribute values used for splitting. The argument prefix can be used to obtain more descriptive names.
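The naming behaviour described above can be inspected with a short sketch. It assumes the GERMAPARLMINI demo corpus shipped with polmineR and uses only arguments shown in the Usage section; the object names by_date and by_date_speaker are illustrative and not part of the package. No output is shown because the exact names depend on the corpus data.

```r
library(polmineR)
use("polmineR") # activate the demo corpora shipped with the package

# First split: a subcorpus_bundle with subcorpora defined by the s-attribute "date"
by_date <- split(corpus("GERMAPARLMINI"), s_attribute = "date", verbose = FALSE)

# Second split: apply split to every subcorpus in the bundle, now by speaker
by_date_speaker <- split(by_date, s_attribute = "speaker", progress = FALSE)

# Compare the names before and after the second split
head(names(by_date))
head(names(by_date_speaker))
```

Comparing the two sets of names shows how the names of the incoming bundle are carried over into the names of the result.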

Examples

library(polmineR)
use("polmineR") # activate the demo corpora shipped with polmineR

# Split a corpus by the s-attribute "id" and summarize the resulting bundle
corpus("REUTERS") %>% split(s_attribute = "id") %>% summary()

# Merge multiple subcorpus objects
a <- corpus("GERMAPARLMINI") %>% subset(date == "2009-10-27")
b <- corpus("GERMAPARLMINI") %>% subset(date == "2009-10-28")
c <- corpus("GERMAPARLMINI") %>% subset(date == "2009-11-10")
y <- merge(a, b, c)
s_attributes(y, "date")

# Split a subcorpus by speaker
sc <- subset("GERMAPARLMINI", date == "2009-11-11")
b <- split(sc, s_attribute = "speaker")

# The corresponding workflow with partition objects
p <- partition("GERMAPARLMINI", date = "2009-11-11")
y <- partition_bundle(p, s_attribute = "speaker")

# Split a corpus by date
gparl <- corpus("GERMAPARLMINI")
b <- split(gparl, s_attribute = "date")

# Split up the objects in a subcorpus_bundle using the split-method for subcorpus_bundle
y <- corpus("GERMAPARLMINI") %>%
  split(s_attribute = "date") %>%
  split(s_attribute = "speaker")

summary(y)