A dataset containing an emoji identifier key and sentiment value. This data
comes from Novak, Smailovic, Sluban, & Mozetic's (2015) emoji sentiment data.
The authors used Twitter data and 83 coders to rate each emoji
use as negative, neutral, or positive, forming a probability distribution
(\(p_{-}, p_{0}, p_{+}\)) for each emoji
(http://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0144296&type=printable).
The sentiment score is calculated via the authors' formula:
\(\frac{-1 \cdot p_{-} + 0 \cdot p_{0} + 1 \cdot p_{+}}{p_{-} + p_{0} + p_{+}}\).
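As a rough illustration of the formula above, the score can be computed in base R as sketched below. The function name `sentiment_score` and the probability values are hypothetical and chosen only for the example; they are not taken from the data set.

```r
# Sentiment score from a (negative, neutral, positive) probability triple.
# `sentiment_score` is a hypothetical helper, not part of the lexicon package.
sentiment_score <- function(p_neg, p_neu, p_pos) {
  (-1 * p_neg + 0 * p_neu + 1 * p_pos) / (p_neg + p_neu + p_pos)
}

# Made-up probabilities for illustration only.
score <- sentiment_score(p_neg = 0.25, p_neu = 0.25, p_pos = 0.5)
score
```

With these made-up values the negative and positive weights partly cancel, so the score comes out mildly positive. A triple concentrated entirely on the negative class would give -1, and one concentrated on the positive class would give 1.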
This polarity lookup table differs from the others included in the
lexicon package in that the first column contains identifiers rather than
words. These identifiers are found in the emojis_sentiment data set. The
typical use case is to use the textclean or sentimentr packages'
replace_emoji to swap out emojis for a more computer-friendly identifier.
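The workflow described above might look roughly like the following sketch. It assumes the lexicon and textclean packages are installed, and that textclean's `replace_emoji_identifier()` is the function that produces identifiers matching the first column of this table; the input text is hypothetical.

```r
library(lexicon)
library(textclean)

# The lookup table: column x holds the identifiers, column y the sentiment.
head(hash_sentiment_emojis)

# Hypothetical input text containing one emoji (grinning face).
txt <- "Great news \U0001F600"

# Assumption: this call swaps each emoji for an identifier found in column x.
ids <- replace_emoji_identifier(txt)
ids
```

The resulting text can then be scored against a polarity table that includes these identifiers.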
data(hash_sentiment_emojis)
A data frame with 734 rows and 2 variables
2015 - Department of Knowledge Technologies
x. Identifiers (emoji identifier keys, not words)
y. Sentiment
Novak, P. K., Smailovic, J., Sluban, B., and Mozetic, I. (2015). Sentiment of emojis. PLoS ONE, 10(12). doi:10.1371/journal.pone.0144296
http://kt.ijs.si/data/Emoji_sentiment_ranking/index.html
https://creativecommons.org/licenses/by-sa/4.0/