
sentometrics (version 0.5.6)

ctr_agg: Set up control for aggregation into sentiment measures

Description

Sets up control object for aggregation of document-level textual sentiment into textual sentiment measures (indices).

Usage

ctr_agg(howWithin = "proportional", howDocs = "equal_weight",
  howTime = "equal_weight", do.ignoreZeros = TRUE, by = "day",
  lag = 1, fill = "zero", alphasExp = seq(0.1, 0.5, by = 0.1),
  ordersAlm = 1:3, do.inverseAlm = TRUE, aBeta = 1:4, bBeta = 1:4,
  weights = NULL, tokens = NULL, nCore = 1)

Arguments

howWithin

a single character vector defining how aggregation within documents will be performed. If length(howWithin) > 1, only the first element is used. For the available options, see get_hows()$words.

howDocs

a single character vector defining how aggregation across documents per date will be performed. If length(howDocs) > 1, only the first element is used. For the available options, see get_hows()$docs.

howTime

a character vector defining how aggregation across dates will be performed. More than one choice is possible. For the available options, see get_hows()$time.

do.ignoreZeros

a logical indicating whether zero sentiment values have to be ignored in the determination of the document weights while aggregating across documents. By default do.ignoreZeros = TRUE, such that documents with a raw sentiment score of zero or for which a given feature indicator is equal to zero are considered irrelevant.
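
As a rough illustration of what ignoring zeros means under equal weighting, the base R sketch below averages the document scores of one date with and without the zero-scored documents. The scores and the helper equal_weight_agg are hypothetical and are not the package's internal code.

```r
# Hypothetical document-level sentiment scores for a single date
scores <- c(0.4, 0, -0.2, 0, 0.1)

# Illustrative helper (not part of sentometrics): equal-weight average,
# optionally dropping zero scores before the weights are determined
equal_weight_agg <- function(x, ignore_zeros = TRUE) {
  if (ignore_zeros) x <- x[x != 0]
  if (length(x) == 0) return(0)
  mean(x)
}

with_zeros    <- equal_weight_agg(scores, ignore_zeros = FALSE)  # averages over 5 documents
without_zeros <- equal_weight_agg(scores, ignore_zeros = TRUE)   # averages over 3 documents
```

Dropping the zeros keeps documents without a signal from pulling the aggregate toward zero.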

by

a single character vector, either "day", "week", "month" or "year", to indicate at what level the dates should be aggregated. Dates are displayed as the first day of the period, if applicable (e.g., "2017-03-01" for March 2017).
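
The first-day-of-period labelling can be pictured with base R alone. The sketch below maps a few hypothetical dates to the start of their month and year; it only mimics the labelling convention described above and does not call sentometrics.

```r
# Hypothetical publication dates
dates <- as.Date(c("2017-03-14", "2017-03-29", "2017-11-02"))

# Label each date by the first day of its month and of its year
month_start <- as.Date(format(dates, "%Y-%m-01"))
year_start  <- as.Date(format(dates, "%Y-01-01"))

format(month_start)  # "2017-03-01" "2017-03-01" "2017-11-01"
```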

lag

a single integer vector giving the time lag, i.e., the number of periods over which to aggregate across time. By default equal to 1, meaning no aggregation across time; a time weighting scheme named "dummyTime" is used in this case.

fill

a single character vector, one of c("zero", "latest", "none"), to control how sentiment values are added for dates missing from the continuum of dates considered. This impacts the aggregation across time: the measures_fill function is applied before aggregating, except if fill = "none". By default equal to "zero", which sets the scores (and thus also the weights) of the added dates to zero in the time aggregation.
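
To make the difference between the fill options concrete, here is a minimal base R sketch on a hypothetical daily series with gaps. It imitates the idea of a "zero" and a "latest" fill and is not the measures_fill implementation.

```r
# Hypothetical daily measure observed on only three dates
obs <- data.frame(date = as.Date(c("2017-03-01", "2017-03-03", "2017-03-06")),
                  value = c(0.5, -0.2, 0.3))

# Full continuum of dates between the first and last observation
all_dates <- seq(min(obs$date), max(obs$date), by = "day")
full <- merge(data.frame(date = all_dates), obs, all.x = TRUE)

# "zero"-style fill: added dates get a score of zero
fill_zero <- ifelse(is.na(full$value), 0, full$value)

# "latest"-style fill: added dates carry the most recent observed score forward
idx <- cummax(ifelse(is.na(full$value), 0L, seq_along(full$value)))
fill_latest <- full$value[idx]
```

With fill = "none" no dates would be added, so the series would keep its three original rows.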

alphasExp

a numeric vector of all exponential smoothing factors to calculate weights for, used if "exponential" %in% howTime. Values should be between 0 and 1 (both excluded); see weights_exponential.
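
The sketch below shows how one weighting curve per smoothing factor could be generated. It uses a textbook exponential-smoothing form, which is an assumption and may differ from what weights_exponential actually computes; the helper exp_weights is hypothetical.

```r
# Illustrative helper (not the package's weights_exponential): weights over n lags,
# with the most recent lag (last position) weighted highest
exp_weights <- function(n, alpha) {
  stopifnot(alpha > 0, alpha < 1)
  w <- alpha * (1 - alpha)^((n - 1):0)
  w / sum(w)  # normalize so the weights sum to one
}

alphas <- seq(0.1, 0.5, by = 0.1)         # the default alphasExp
W <- sapply(alphas, exp_weights, n = 10)  # one column per smoothing factor
colnames(W) <- paste0("alpha_", alphas)
```

In this form, a larger smoothing factor concentrates more of the weight on the most recent lags.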

ordersAlm

a numeric vector of all Almon polynomial orders (positive) to calculate weights for, used if "almon" %in% howTime; see weights_almon.

do.inverseAlm

a logical indicating if for every Almon polynomial its inverse has to be added, used if "almon" %in% howTime; see weights_almon.

aBeta

a numeric vector of positive values as first Beta weighting decay parameter; see weights_beta.

bBeta

a numeric vector of positive values as second Beta weighting decay parameter; see weights_beta.

weights

an optional data.frame of own weighting scheme(s), one scheme per column, with the number of rows equal to the desired lag.
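
A data.frame of own weights can be put together in base R as sketched below. The lag value, the column names and the choice to normalize each column to sum to one are assumptions made for illustration, not stated requirements of the package.

```r
lag <- 5  # hypothetical lag; should match the lag passed to ctr_agg

# Two hypothetical schemes, one per column, each normalized to sum to one
raw <- data.frame(flat = rep(1, lag), rising = seq_len(lag))
my_weights <- as.data.frame(lapply(raw, function(w) w / sum(w)))

stopifnot(nrow(my_weights) == lag)  # number of rows equals the desired lag
```

Such a data.frame would then be supplied as weights = my_weights together with "own" in howTime, as done in the Examples.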

Value

A list encapsulating the control parameters.

Details

For the currently available options on how aggregation can occur (via the howWithin, howDocs and howTime arguments), call get_hows().

See Also

measures_fill, almons, compute_sentiment

Examples

# NOT RUN {
set.seed(505)

# simple control object
ctr1 <- ctr_agg(howTime = "linear", by = "year", lag = 3)

# more elaborate control object (particular attention to time weighting schemes)
ctr2 <- ctr_agg(howWithin = "proportionalPol",
                howDocs = "proportional",
                howTime = c("equal_weight", "linear", "almon", "beta", "exponential", "own"),
                do.ignoreZeros = TRUE,
                by = "day",
                lag = 20,
                ordersAlm = 1:3,
                do.inverseAlm = TRUE,
                alphasExp = c(0.20, 0.50, 0.70, 0.95),
                aBeta = c(1, 3),
                bBeta = c(1, 3, 4, 7),
                weights = data.frame(myWeights = runif(20)))

# set up control object with one linear and two chosen Almon weighting schemes
a <- weights_almon(n = 70, orders = 1:3, do.inverse = TRUE, do.normalize = TRUE)
ctr3 <- ctr_agg(howTime = c("linear", "own"), by = "year", lag = 70,
                weights = data.frame(a1 = a[, 1], a2 = a[, 3]))

# }
