
quanteda.textmodels (version 0.9.9)

data_corpus_moviereviews: Movie reviews with polarity from Pang and Lee (2004)

Description

A corpus object containing 2,000 movie reviews classified by positive or negative sentiment.

Usage

data_corpus_moviereviews

Format

The corpus includes the following document variables:

sentiment

Factor indicating whether a review was manually classified as positive (pos) or negative (neg).

id1

Character value giving the position of the review in the corpus.

id2

Random number for each review.
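
As a minimal sketch of how these document variables might be inspected, the following uses docvars(), corpus_subset(), and ndoc() from quanteda. It assumes that both quanteda and quanteda.textmodels are installed; the object name corp_pos is hypothetical and used only for illustration.

```r
# Assumption: the quanteda and quanteda.textmodels packages are installed.
library("quanteda")
library("quanteda.textmodels")

# list the document variables attached to the corpus
names(docvars(data_corpus_moviereviews))

# show the document variables for the first few reviews
head(docvars(data_corpus_moviereviews))

# keep only the reviews labelled as positive (corp_pos is a hypothetical name)
corp_pos <- corpus_subset(data_corpus_moviereviews, sentiment == "pos")
ndoc(corp_pos)
```

Subsetting on sentiment in this way is one option for preparing separate positive and negative sets before further processing.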

Details

For more information, see cat(meta(data_corpus_moviereviews, "readme")).

References

Pang, B. and Lee, L. (2004). "A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts." Proceedings of the ACL.

Examples

# check polarities
table(data_corpus_moviereviews$sentiment)

# segment the reviews into sentences; each line of a review holds one sentence
data_corpus_moviereviewsents <-
    quanteda::corpus_segment(data_corpus_moviereviews, pattern = "\n", extract_pattern = FALSE)
print(data_corpus_moviereviewsents, max_ndoc = 3)
