quanteda (version 0.8.2-1)

bigrams: Create bigrams

Description

Create bigrams (pairs of adjacent words) from a character vector of texts.

Usage

bigrams(text, window = 1, concatenator = "_", include.unigrams = FALSE,
  ignoredFeatures = NULL, skipGrams = FALSE, ...)

Arguments

text
character vector containing the texts from which bigrams will be constructed
window
how many words apart two words may be and still count as adjacent. The default is 1, which pairs only immediately neighbouring words. This argument is available only for bigrams, not for ngrams.
concatenator
character used to join the two words of a bigram; the default is the underscore (_)
include.unigrams
if TRUE, return unigrams as well
ignoredFeatures
a character vector of features to ignore
skipGrams
If FALSE (default), remove any bigram containing a feature listed in ignoredFeatures. If TRUE, first remove the features in ignoredFeatures and then create bigrams. This means that some "bigrams" will actually not be adjacent in the original text.
...
additional arguments passed to tokenize
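
The difference between the two skipGrams settings can be illustrated with a short base R sketch. It does not call quanteda and is not the package's implementation: make_bigrams is a hypothetical helper, the tokens are assumed to be already lower-cased and split, and ignored is a small stand-in for a stopword list.

```r
# Hypothetical helper (not part of quanteda): join each token to its right-hand neighbour.
make_bigrams <- function(tokens, concatenator = "_") {
  if (length(tokens) < 2) return(character(0))
  paste(head(tokens, -1), tail(tokens, -1), sep = concatenator)
}

# Assumed, pre-tokenized input and a stand-in list of features to ignore.
tokens  <- c("i", "went", "to", "tea", "with", "her", "majesty", "queen", "victoria")
ignored <- c("i", "to", "with", "her")

# Behaviour described for skipGrams = FALSE:
# build every bigram, then drop those containing an ignored feature.
left  <- head(tokens, -1)
right <- tail(tokens, -1)
drop_result <- make_bigrams(tokens)[!(left %in% ignored | right %in% ignored)]
# "majesty_queen" "queen_victoria"

# Behaviour described for skipGrams = TRUE:
# drop the ignored features first, then build bigrams from what remains.
skip_result <- make_bigrams(tokens[!(tokens %in% ignored)])
# "went_tea" "tea_majesty" "majesty_queen" "queen_victoria"
```

In the second result, went_tea and tea_majesty join words that were separated by ignored features in the original text.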

Value

  • a character vector of bigrams

Examples

bigrams("The quick brown fox jumped over the lazy dog.")
bigrams(c("The quick brown fox", "jumped over the lazy dog."))
bigrams(c("The quick brown fox", "jumped over the lazy dog."), window=2)
bigrams(c("I went to tea with her majesty Queen Victoria.", "Does tea have extra caffeine?"))
bigrams(c("I went to tea with her majesty Queen Victoria.", "Does tea have extra caffeine?"),
        ignoredFeatures=stopwords("english"))
bigrams(c("I went to tea with her majesty Queen Victoria.", "Does tea have extra caffeine?"),
        ignoredFeatures=stopwords("english"), skipGrams=TRUE)
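
As a rough illustration of the window argument, the following base R sketch pairs each word with the words up to window positions to its right. It reflects one reading of the argument description rather than quanteda's own code: window_bigrams is a hypothetical helper, the input is assumed to be already tokenized, and the order of the returned bigrams may differ from what bigrams() produces.

```r
# Hypothetical helper (not quanteda's implementation): pair each token with
# the tokens that follow it at distances 1, 2, ..., window.
window_bigrams <- function(tokens, window = 1, concatenator = "_") {
  n <- length(tokens)
  out <- character(0)
  for (gap in seq_len(window)) {
    if (n > gap) {
      # tokens 1..(n - gap) joined with tokens (gap + 1)..n
      out <- c(out, paste(tokens[seq_len(n - gap)], tokens[(gap + 1):n],
                          sep = concatenator))
    }
  }
  out
}

tokens <- c("the", "quick", "brown", "fox")  # assumed, pre-tokenized input

w1 <- window_bigrams(tokens)
# "the_quick" "quick_brown" "brown_fox"

w2 <- window_bigrams(tokens, window = 2)
# "the_quick" "quick_brown" "brown_fox" "the_brown" "quick_fox"
```

With window = 2 the result keeps the immediately adjacent pairs and adds pairs that skip one word.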