The sentometrics package is an integrated framework for textual sentiment time series aggregation and prediction. It addresses two intrinsic challenges: for a given text, sentiment can be computed in many different ways, and there are many ways to pool sentiment across texts and over time. This additional layer of manipulation does not exist in standard text mining and time series analysis packages. The package therefore integrates the fast quantification of sentiment from texts, the aggregation into different sentiment time series, and the optimized prediction based on these measures.
Corpus features generation: sento_corpus, add_features
Sentiment computation and aggregation into sentiment measures: ctr_agg, sento_lexicons, compute_sentiment, aggregate.sentiment, sento_measures, peakdocs, peakdates, and a series of measures_xyz, generic and extractor functions
Sparse modelling: ctr_model, sento_model
Prediction and post-modelling analysis: predict.sentomodel, attributions
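The first two groups of functions listed above can be chained roughly as follows. This is a minimal sketch, not a reference workflow: it assumes the built-in datasets usnews, list_lexicons and list_valence_shifters that ship with the package, the lexicon names and control values are chosen only for illustration, and argument names or defaults may differ between package versions.

```r
library("sentometrics")

# Built-in example data (assumed to be available in the installed version)
data("usnews", package = "sentometrics")
data("list_lexicons", package = "sentometrics")
data("list_valence_shifters", package = "sentometrics")

# 1. Corpus features generation: build a corpus from a data.frame of texts
corpus <- sento_corpus(corpusdf = usnews)

# 2. Lexicons used to score the texts, combined with English valence shifters
lexicons <- sento_lexicons(list_lexicons[c("LM_en", "HENRY_en")],
                           list_valence_shifters[["en"]])

# 3. Aggregation control: how to pool within documents, across documents,
#    and across time (illustrative choices only)
ctr <- ctr_agg(howWithin = "proportional",
               howDocs = "proportional",
               howTime = c("equal_weight", "linear"),
               by = "month",
               lag = 6)

# 4. Compute sentiment and aggregate it into sentiment time series in one call
measures <- sento_measures(corpus, lexicons, ctr)

# The aggregated series are stored as a table with one row per date
head(measures$measures[, 1:4])
```

Each combination of lexicon, corpus feature and time weighting scheme yields its own sentiment series, which is why even a small control specification like this one produces several measures.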
The development version of the package resides at https://github.com/sborms/sentometrics.
Ardia, Bluteau and Boudt (2018). "Questioning the news about economic growth: Sparse forecasting using thousands of news-based sentiment values". International Journal of Forecasting, forthcoming, https://doi.org/10.2139/ssrn.2976084.
Ardia, Bluteau, Borms and Boudt (2018). "The R package sentometrics to compute, aggregate and predict with textual sentiment". Working paper, https://doi.org/10.2139/ssrn.3067734.