The partition class is used to manage subcorpora. It is an S4 class with a set of methods defined for it, and it inherits from the classes count and textstat.
# S4 method for partition
p_attributes(.Object, p_attribute = NULL, decode = TRUE)

# S4 method for subcorpus
p_attributes(.Object, p_attribute = NULL, decode = TRUE)

is.partition(x)

# S4 method for partition
enrich(
  .Object,
  p_attribute = NULL,
  decode = TRUE,
  verbose = TRUE,
  mc = FALSE,
  ...
)

# S4 method for partition
as.regions(x)

# S4 method for partition
split(x, gap, ...)
.Object
A partition object.
p_attribute
A p-attribute, used for enriching, i.e. for performing the count.
decode
A logical value, whether to decode token ids into strings when performing the count.
x
A partition object.
verbose
A logical value, whether to output messages.
mc
A logical value, whether to use multiple cores; if numeric, the number of cores to use.
...
Further parameters passed into count when calling enrich, and ...
gap
An integer value specifying the minimum gap between regions for performing the split.
name
A name to identify the object (character vector with length 1); useful when multiple partition objects are combined to a partition_bundle.
corpus
The CWB indexed corpus the partition is derived from (character vector with length 1).
encoding
Encoding of the corpus (character vector with length 1).
s_attributes
A named list with the s-attributes specifying the partition.
explanation
Object of class character, an explanation of the partition.
cpos
A matrix with left and right corpus positions defining regions (two columns).
annotations
Object of class list.
size
Total size of the partition (integer vector, length 1).
stat
An (optional) data.table with counts. If present, it speeds up the computation of cooccurrences, as the count is already available.
metadata
Object of class data.frame, metadata information.
strucs
Object of class integer, the strucs defining the partition.
p_attribute
Object of class character indicating the p_attribute of the count in slot stat.
xml
Object of class character, whether the xml is flat or nested.
s_attribute_strucs
Object of class character, the base node.
key
Experimental, an s-attribute that is used as a key.
call
Object of class character, the call that generated the partition.
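As a minimal sketch of how the cpos and size slots relate, the base R snippet below builds a hypothetical two-column matrix of regions and derives a token total from it. The matrix values and column names are invented for illustration, and the sketch assumes that both corpus positions of a region are inclusive; it does not touch a real partition object.

```r
# Hypothetical regions: one row per region, columns hold the left and
# right corpus positions (assumed to be inclusive on both ends).
cpos <- matrix(
  c(0L, 9L,
    25L, 39L,
    100L, 104L),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("cpos_left", "cpos_right"))
)

# Tokens per region, and the total that a size slot would hold
# under the inclusive-position assumption.
region_sizes <- cpos[, "cpos_right"] - cpos[, "cpos_left"] + 1L
total_size <- sum(region_sizes)
```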
Andreas Blaette
Because partition objects inherit from the count and textstat classes, the methods available include view, to inspect the table in the stat slot, and name and name<-, to retrieve or set the name of an object, among others.
The is.partition function returns a logical value indicating whether x is a partition.
The enrich method adds a count of the tokens defined by argument p_attribute to slot stat of the partition object.
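To give a rough idea of what such a count amounts to, the following base R sketch tallies the tokens that fall inside a set of regions. It is an illustration under stated assumptions, not the package's implementation: the token vector, the regions, and the column names word and count are all hypothetical, and corpus positions are assumed to be zero-based and inclusive.

```r
# Hypothetical token stream; element k corresponds to corpus position k - 1.
tokens <- c("the", "house", "is", "open", "the", "session", "is", "closed")

# Hypothetical regions given as inclusive left/right corpus positions.
cpos <- matrix(c(0L, 2L, 4L, 6L), ncol = 2, byrow = TRUE)

# Expand every region into the corpus positions it covers.
positions <- unlist(
  lapply(seq_len(nrow(cpos)), function(i) cpos[i, 1L]:cpos[i, 2L])
)

# Shift by one because R vectors are 1-based.
in_regions <- tokens[positions + 1L]

# One row per distinct token together with its frequency.
counts <- as.data.frame(table(in_regions), stringsAsFactors = FALSE)
names(counts) <- c("word", "count")
```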
The split() method splits a partition object into a partition_bundle wherever the gap between strucs exceeds the minimum number of tokens specified by gap. This is useful, for example, for splitting a plenary protocol into speeches. Note: To speed things up, the returned partitions will not include frequency lists. The lists can be prepared by applying enrich on the partition_bundle object that is returned.
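The gap criterion can be sketched in base R as follows. This is only an illustration of the idea, not the package's code: the helper split_regions_by_gap and the example regions are hypothetical, and the gap is assumed to be the number of tokens lying between the end of one region and the start of the next.

```r
# Illustration only: start a new group of regions whenever the gap to the
# previous region exceeds a threshold.
split_regions_by_gap <- function(cpos, gap) {
  stopifnot(is.matrix(cpos), ncol(cpos) == 2L)
  if (nrow(cpos) == 0L) return(list())
  # Tokens between the end of one region and the start of the next.
  gaps <- cpos[-1L, 1L] - cpos[-nrow(cpos), 2L] - 1L
  # The first region always opens a group; so does every region that
  # follows a gap larger than the threshold.
  group <- cumsum(c(TRUE, gaps > gap))
  lapply(
    split(seq_len(nrow(cpos)), group),
    function(i) cpos[i, , drop = FALSE]
  )
}

# Hypothetical regions with one large gap in the middle.
regions <- matrix(
  c(0L, 9L,
    12L, 20L,
    600L, 650L,
    655L, 700L),
  ncol = 2, byrow = TRUE
)
chunks <- split_regions_by_gap(regions, gap = 500L)
```

Here the regions fall into two groups, because only the distance between the second and the third region is larger than 500 tokens.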
The partition class inherits from the textstat class; see the respective documentation to learn more.
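library(polmineR)
use("polmineR") # activate the sample corpora shipped with the package

# Subcorpus: one speaker on one day
p <- partition(
  "GERMAPARLMINI",
  date = "2009-11-11",
  speaker = "Norbert Lammert"
)
name(p) <- "Norbert Lammert"

# Split into a partition_bundle where regions are more than 500 tokens apart
pb <- split(p, gap = 500L)
summary(pb)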