peakdocs: Extract documents related to sentiment peaks

Description

This function extracts the documents with the most extreme sentiment (lowest, highest or both in absolute terms). The extracted documents are unique, even when, for example, all of the most extreme sentiment values (across sentiment calculation methods) occur in a single document.

Usage

peakdocs(sentiment, n = 10, type = "both", do.average = FALSE)

Arguments

sentiment

a sentiment object created using compute_sentiment or to_sentiment.

n

a positive numeric value to indicate the number of dates associated with sentiment peaks to extract. If n < 1, it is interpreted as a quantile (for example, 0.07 would mean the 7% most extreme dates).
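
As a rough illustration of the quantile interpretation, the base R sketch below converts a fractional n into a count. The helper name n_to_count and the rounding rule are assumptions made for this example; they are not taken from the package and may differ from what peakdocs does internally.

```r
# Hypothetical helper (not part of sentometrics): translate n into a count.
# Assumption: a fractional n is multiplied by the number of candidates and
# rounded to the nearest integer, with a minimum of one item.
n_to_count <- function(n, total) {
  stopifnot(is.numeric(n), length(n) == 1, n > 0)
  if (n < 1) {
    max(1, round(n * total))
  } else {
    min(n, total)  # never ask for more items than are available
  }
}

n_to_count(5, total = 200)     # n >= 1: an absolute count
n_to_count(0.07, total = 200)  # n < 1: a share of the 200 candidates
```

Under these assumptions, n = 0.50 on a corpus of 200 documents would correspond to 100 candidates.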

type

a character value, either "pos", "neg" or "both", to look for the n dates related to the most positive, most negative or most extreme (in absolute terms) sentiment occurrences, respectively.

do.average

a logical to indicate whether peaks should be selected based on the average sentiment value per date.
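
To make the interaction of type and do.average more concrete, here is a minimal base R sketch on a toy score matrix. The function pick_peaks, the toy data and the choice to average across sentiment methods are all hypothetical and only mirror the argument descriptions above; this is not the package's implementation.

```r
# Toy sentiment scores: rows are documents, columns are sentiment methods.
scores <- matrix(
  c( 0.4, -0.1,
    -0.6, -0.2,
     0.1,  0.9,
    -0.3,  0.2),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("lexA", "lexB"))
)

# Hypothetical helper mimicking the described options.
pick_peaks <- function(scores, n = 1, type = "both", do.average = FALSE) {
  if (do.average) {
    # Assumption: average across methods, leaving one score per document.
    scores <- matrix(rowMeans(scores), ncol = 1,
                     dimnames = list(rownames(scores), "avg"))
  }
  ranked <- switch(type,
    pos  = scores,       # most positive first
    neg  = -scores,      # most negative first
    both = abs(scores),  # most extreme in absolute terms first
    stop("type must be 'pos', 'neg' or 'both'")
  )
  # Per column, take the n top-ranked documents, then drop duplicates.
  ids <- apply(ranked, 2, function(x) {
    rownames(ranked)[order(x, decreasing = TRUE)[seq_len(n)]]
  })
  unique(as.vector(ids))
}

pick_peaks(scores, n = 1, type = "both")
pick_peaks(scores, n = 1, type = "neg")
pick_peaks(scores, n = 1, type = "both", do.average = TRUE)
```

In the "neg" call, both toy methods point to the same document, so the deduplicated result contains a single identifier. This is the situation the Description refers to when it says the extracted documents are unique.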

Value

A vector of type "character" corresponding to the n extracted document identifiers.

Examples


# NOT RUN {
set.seed(505)

data("usnews", package = "sentometrics")
data("list_lexicons", package = "sentometrics")
data("list_valence_shifters", package = "sentometrics")

l <- sento_lexicons(list_lexicons[c("LM_en", "HENRY_en")])

corpus <- sento_corpus(corpusdf = usnews)
corpusSample <- quanteda::corpus_sample(corpus, size = 200)
sent <- compute_sentiment(corpusSample, l, how = "proportionalPol")

# extract the peaks
peaksAbs <- peakdocs(sent, n = 5)
peaksAbsQuantile <- peakdocs(sent, n = 0.50)
peaksPos <- peakdocs(sent, n = 5, type = "pos")
peaksNeg <- peakdocs(sent, n = 5, type = "neg")

# }
