select_measures: Select a subset of sentiment measures

Description

Selects the subset of sentiment measures whose names contain either all of the given selection components combined, or at least one of them. One can also restrict the measures to a subset of dates.
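The two matching modes can be illustrated in base R without the package. The sketch below is not the package's implementation: it assumes measure names of the form lexicon--feature--time (the "--" separator and the example names are assumptions made for illustration) and uses a hypothetical helper, match_measures.

```r
# Hypothetical measure names, assumed to follow "lexicon--feature--time"
measure_names <- c(
  "LM_eng--wsj--equal_weight",
  "LM_eng--wsj--linear",
  "HENRY_eng--economy--equal_weight",
  "HENRY_eng--economy--linear"
)

# Hypothetical helper (not part of the package): returns TRUE for each name
# that contains all (combine = TRUE) or at least one (combine = FALSE) of
# the selection components
match_measures <- function(nms, to_select, combine = TRUE) {
  parts <- strsplit(nms, "--", fixed = TRUE)
  check <- if (combine) all else any
  vapply(parts, function(p) check(to_select %in% p), logical(1))
}

to_select <- c("linear", "LM_eng")
sel_all <- measure_names[match_measures(measure_names, to_select, combine = TRUE)]
sel_any <- measure_names[match_measures(measure_names, to_select, combine = FALSE)]
```

With these assumed names, sel_all keeps only the measure that combines both components, while sel_any keeps every measure that mentions either one.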

Usage

select_measures(sentomeasures, toSelect = "all", do.combine = TRUE,
  dates = NA)

Arguments

sentomeasures

a sentomeasures object created using sento_measures.

toSelect

a "character" vector of the lexicon, feature and time weighting scheme names, to indicate which measures need to be selected. By default equal to "all", which means no selection of the sentiment measures is made; this may be used if one only wants to extract a subset of dates via the dates argument.

do.combine

a logical indicating whether a measure is selected only if all of the selection components occur in its name (do.combine = TRUE), or as soon as at least one of them does (do.combine = FALSE). If do.combine = TRUE, the toSelect argument can consist of at most one lexicon, one feature, and one time weighting scheme.

dates

an expression, supplied as a character vector, that evaluates to a logical vector, refers to the variable date, and has dates specified as "yyyy-mm-dd", e.g. dates = "date >= '2000-01-15'". This argument may also be a vector of class Date, in which case all dates that appear in that vector are extracted. See the examples. By default equal to NA, meaning no subsetting based on dates is done.
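The two forms of the dates argument can be mimicked in base R. This is only a sketch of the idea under stated assumptions, not the package's internal code: the yearly date vector is made up, and evaluating the expression with parse and eval is an assumed mechanism.

```r
# Hypothetical yearly dates standing in for those of a sentomeasures object
date <- seq(as.Date("1995-01-01"), as.Date("2002-01-01"), by = "year")

# Form 1: a character expression in terms of the variable `date`,
# parsed and evaluated to a logical vector
expr <- "date >= '1996-12-31' & date <= '2000-12-31'"
keep_expr <- eval(parse(text = expr), list(date = date))

# Form 2: a Date vector, keeping only the dates that appear in it
d <- as.Date(c("1997-01-01", "2001-01-01", "2010-01-01"))
keep_vec <- date %in% d

date[keep_expr]
date[keep_vec]
```

Dates in the Date vector that are absent from the object, such as 2010-01-01 here, simply match nothing in this sketch.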

Value

A modified sentomeasures object containing only the selected sentiment measures, with its information and statistics updated accordingly, but with the original sentiment scores data.table left untouched.

Examples

# NOT RUN {
data("usnews")
data("lexicons")
data("valence")

# construct a sentomeasures object to start with
corpus <- sento_corpus(corpusdf = usnews)
corpusSample <- quanteda::corpus_sample(corpus, size = 1000)
l <- setup_lexicons(lexicons[c("LM_eng", "HENRY_eng")], valence[["valence_eng"]])
ctr <- ctr_agg(howTime = c("equal_weight", "linear"), by = "year", lag = 3)
sentomeasures <- sento_measures(corpusSample, l, ctr)

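# different selections
# measures using the equal_weight time weighting scheme
sel1 <- select_measures(sentomeasures, c("equal_weight"))
# measures using at least one of the two time weighting schemes
sel2 <- select_measures(sentomeasures, c("equal_weight", "linear"), do.combine = FALSE)
# measures combining the linear scheme with the LM_eng lexicon
sel3 <- select_measures(sentomeasures, c("linear", "LM_eng"))
# measures whose name contains at least one of the four components
sel4 <- select_measures(sentomeasures, c("linear", "LM_eng", "wsj", "economy"),
                        do.combine = FALSE)
# same selection as sel3, restricted to a date range given as an expression
sel5 <- select_measures(sentomeasures, c("linear", "LM_eng"),
                        dates = "date >= '1996-12-31' & date <= '2000-12-31'")
# same selection as sel3, restricted to the dates present in a Date vector
d <- seq(as.Date("2000-01-01"), as.Date("2013-12-01"), by = "month")
sel6 <- select_measures(sentomeasures, c("linear", "LM_eng"), dates = d)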

# }
