Class to organize information of context analysis.
# S4 method for context
length(x)
# S4 method for context
p_attributes(.Object)
# S4 method for context
count(.Object)
# S4 method for context
sample(x, size)
# S4 method for context
enrich(
  .Object,
  s_attribute = NULL,
  p_attribute = NULL,
  decode = FALSE,
  stat = FALSE,
  verbose = TRUE,
  ...
)
# S4 method for context
as.regions(x, node = TRUE)
# S4 method for context
trim(
  .Object,
  s_attribute = NULL,
  positivelist = NULL,
  p_attribute = p_attributes(.Object),
  regex = FALSE,
  stoplist = NULL,
  fn = NULL,
  verbose = TRUE,
  progress = TRUE,
  ...
)
x
A context object.
.Object
A context object.
size
An integer indicating sample size.
s_attribute
The s-attribute(s) to add to the data.table in slot cpos.
p_attribute
The p-attribute(s) to add to the data.table in slot cpos.
decode
A logical value, whether to convert integer ids to expressive strings.
stat
A logical value, whether to generate or update slot stat from the cpos table.
verbose
A logical value, whether to be talkative.
...
To maintain backwards compatibility if argument pAttribute is still used.
node
A logical value, whether to include the node (i.e. query matches) in the region matrix generated when creating a partition from a context object.
positivelist
Tokens that are required to be present to keep a match.
regex
A logical value, whether arguments positivelist / stoplist are interpreted as regular expressions.
stoplist
Tokens that are used to exclude a match.
fn
A function that will be applied on context tables split by match_id.
progress
A logical value, whether to show a progress bar.
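The filtering arguments positivelist, stoplist and regex can be pictured with a small base R sketch. It is not the polmineR implementation: the toy table, the helpers hits_list() and filter_matches(), and the choice to test only context tokens (position != 0) are assumptions made for illustration.

```r
# Toy context table, one row per token, grouped by match_id.
# A stand-in for illustration, not the real table in slot cpos.
ctx <- data.frame(
  match_id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  position = c(-1L, 0L, 1L, -1L, 0L, 1L, -1L, 0L, 1L),
  word = c("crude", "oil", "prices", "olive", "oil", "exports",
           "crude", "oil", "imports"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: does any token hit the word list?
hits_list <- function(tokens, wordlist, regex = FALSE) {
  if (regex) {
    any(vapply(wordlist, function(p) any(grepl(p, tokens)), logical(1)))
  } else {
    any(tokens %in% wordlist)
  }
}

# Hypothetical helper mimicking the documented semantics:
# positivelist tokens are required, stoplist tokens exclude a match.
filter_matches <- function(ctx, positivelist = NULL, stoplist = NULL,
                           regex = FALSE) {
  keep <- vapply(split(ctx, ctx$match_id), function(m) {
    tokens <- m$word[m$position != 0L]  # assumption: node tokens are ignored
    ok <- TRUE
    if (!is.null(positivelist)) ok <- ok && hits_list(tokens, positivelist, regex)
    if (!is.null(stoplist)) ok <- ok && !hits_list(tokens, stoplist, regex)
    ok
  }, logical(1))
  ctx[ctx$match_id %in% as.integer(names(keep)[keep]), ]
}

# Keep only matches with "crude" in the context.
kept_positive <- filter_matches(ctx, positivelist = "crude")

# Drop matches with a context token starting with "imp".
kept_stop <- filter_matches(ctx, stoplist = "^imp", regex = TRUE)
```

In this toy data, the positivelist call keeps matches 1 and 3, and the regex stoplist call keeps matches 1 and 2.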
query
The query examined (character).
count
An integer value, the number of hits for the query.
partition
The partition the context object is based on.
size_partition
The size of the partition, a length-one integer vector.
left
A length-one integer value, the number of tokens to the left of the query match.
right
An integer value, the number of tokens to the right of the query match.
size
A length-one integer value, the number of tokens covered by the context object, i.e. the number of tokens in the right and left context of the node as well as query matches.
size_match
A length-one integer value, the number of tokens matched by the query. Identical to the value in slot count if the query is not a CQP query.
size_coi
A length-one integer value, the number of tokens in the right and left context of the node (excluding query matches).
size_ref
A length-one integer value, the number of tokens in the partition, without tokens matched and the tokens in the left and right context.
boundary
An s-attribute (character).
p_attribute
The p-attribute of the query (character).
corpus
The CWB corpus used (character).
stat
A data.table, the statistics of the analysis.
encoding
Object of class character, encoding of the corpus.
cpos
A data.table, with the columns match_id, cpos, position, word_id.
method
A character vector, statistical test used.
call
Object of class character, call that generated the object.
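Read together, the slot descriptions for size, size_match, size_coi, size_ref and size_partition imply a simple bookkeeping relation. The sketch below spells it out with made-up numbers; it is derived from the descriptions above and is not taken from the package source.

```r
# Made-up numbers for illustration only.
size_partition <- 1000L  # tokens in the partition
size_match <- 12L        # tokens matched by the query
size_coi <- 110L         # tokens in the left and right context, matches excluded

# Tokens covered by the context object: context plus query matches.
size <- size_coi + size_match

# Reference tokens: the partition without matches and without context.
size_ref <- size_partition - size

# The covered tokens and the reference tokens partition the whole.
stopifnot(size + size_ref == size_partition)
```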
Objects of the class context include a data.table in the slot cpos. The data.table will at least include the columns "match_id", "cpos" and "position".
The length-method will return the number of hits that were achieved.
The enrich()-method can be used to add additional information to the data.table in the cpos-slot of a context-object.
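To make the table layout concrete, here is a base R sketch of what such a table might look like for two hits with one token of context on each side. A plain data.frame stands in for the data.table, and the corpus positions, the 1-based token ids and the lexicon are invented for illustration (real CWB ids are looked up in the corpus).

```r
# Toy stand-in for the table in slot cpos.
cpos_tab <- data.frame(
  match_id = c(1L, 1L, 1L, 2L, 2L, 2L),
  cpos = c(14L, 15L, 16L, 41L, 42L, 43L),
  position = c(-1L, 0L, 1L, -1L, 0L, 1L)
)

# Number of hits: one per distinct match_id.
n_hits <- length(unique(cpos_tab$match_id))

# Enriching with decoding, in spirit: map integer ids to strings
# via a made-up lexicon where id i corresponds to lexicon[i].
word_id <- c(3L, 1L, 4L, 2L, 1L, 5L)
lexicon <- c("oil", "olive", "crude", "prices", "exports")
cpos_tab$word <- lexicon[word_id]
```

Rows with position 0 are the query matches; negative and positive positions are the left and right context.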
# Keep matches for 'oil' only if first position to the left is 'crude'
.fn <- function(x) if (x[position == -1L][["word"]] == "crude") x else NULL
crude_oil <- context("REUTERS", "oil") %>%
  enrich(p_attribute = "word", decode = TRUE) %>%
  trim(fn = .fn)
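For readers without the REUTERS corpus at hand, the split-and-apply idea behind the fn argument can be sketched in base R on a toy table. This only mimics the documented behaviour (fn is applied to the context table of each match_id, and a NULL result drops the match); the data, the name keep_crude and the isTRUE() guard against a missing left-hand token are assumptions, not package code.

```r
# Toy context table with a decoded "word" column.
ctx <- data.frame(
  match_id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  position = c(-1L, 0L, 1L, -1L, 0L, 1L, -1L, 0L, 1L),
  word = c("crude", "oil", "prices", "olive", "oil", "exports",
           "crude", "oil", "imports"),
  stringsAsFactors = FALSE
)

# Keep a match only if the token directly to its left is "crude".
# isTRUE() returns FALSE when there is no token at position -1.
keep_crude <- function(x) {
  if (isTRUE(x$word[x$position == -1L] == "crude")) x else NULL
}

# Apply the function per match, then bind the surviving tables.
pieces <- lapply(split(ctx, ctx$match_id), keep_crude)
kept <- do.call(rbind, Filter(Negate(is.null), pieces))
```

Here matches 1 and 3 survive, while match 2 ("olive oil") is dropped.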