bigrams: Create bigrams

Description

Create bigrams (pairs of adjacent words) from a character vector of texts.

Usage

bigrams(text, window = 1, concatenator = "_", include.unigrams = FALSE,
  ignoredFeatures = NULL, skipGrams = FALSE, ...)

Arguments

text

character vector containing the texts from which bigrams will be constructed

window

how many words apart two words may be and still count as adjacent. Default is 1, which pairs only immediately neighbouring words. This is only available for bigrams, not for ngrams.

concatenator

character used to join the two words of a bigram; default is "_" (underscore)

include.unigrams

if TRUE, return unigrams as well

ignoredFeatures

a character vector of features to ignore

skipGrams

If FALSE (default), remove any bigram containing a feature listed in ignoredFeatures; otherwise, first remove the features in ignoredFeatures and then create bigrams. This means that some "bigrams" will actually not be adjacent in the original text.

...

additional arguments passed to tokenize
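
The difference between the two skipGrams settings can be sketched in base R. This is an illustration only, not the package's implementation: sketch_bigrams is a hypothetical helper, it takes tokens that are already split and lowercased, and the ignored words are a hand-picked stand-in for a stopword list.

```r
# Hypothetical helper (not part of the package): illustrates how
# ignoredFeatures and skipGrams are described as interacting.
sketch_bigrams <- function(tokens, ignored = character(0), skip = FALSE,
                           concatenator = "_") {
  if (skip) {
    # skipGrams = TRUE: drop ignored tokens first, then pair what is left
    tokens <- tokens[!(tokens %in% ignored)]
  }
  if (length(tokens) < 2) return(character(0))
  left  <- tokens[-length(tokens)]
  right <- tokens[-1]
  if (!skip) {
    # skipGrams = FALSE: pair first, then drop pairs that contain an ignored token
    keep  <- !(left %in% ignored) & !(right %in% ignored)
    left  <- left[keep]
    right <- right[keep]
  }
  paste(left, right, sep = concatenator)
}

tokens  <- c("went", "to", "tea", "with", "queen", "victoria")
ignored <- c("to", "with")  # assumed stand-in for a stopword list

sketch_bigrams(tokens, ignored, skip = FALSE)
# "queen_victoria"
sketch_bigrams(tokens, ignored, skip = TRUE)
# "went_tea" "tea_queen" "queen_victoria"
```

With skip = TRUE the result contains pairs such as went_tea whose words were separated by an ignored word in the input, which is the behaviour the skipGrams description warns about.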

Value

a character vector of bigrams
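
One plausible reading of the window argument is that every pair of words at most window positions apart is joined with the concatenator. The base R sketch below follows that reading; sketch_window_bigrams is a hypothetical name, and the order of the returned pairs (grouped by distance) is an assumption rather than documented behaviour.

```r
# Hypothetical helper (not part of the package): pairs each token with the
# tokens 1..window positions to its right.
sketch_window_bigrams <- function(tokens, window = 1, concatenator = "_") {
  n <- length(tokens)
  out <- character(0)
  for (w in seq_len(window)) {
    if (n > w) {
      # pair token i with token i + w
      out <- c(out, paste(tokens[seq_len(n - w)], tokens[(w + 1):n],
                          sep = concatenator))
    }
  }
  out
}

sketch_window_bigrams(c("the", "quick", "brown", "fox"), window = 2)
# "the_quick" "quick_brown" "brown_fox" "the_brown" "quick_fox"
```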

Examples

bigrams("The quick brown fox jumped over the lazy dog.")
bigrams(c("The quick brown fox", "jumped over the lazy dog."))
bigrams(c("The quick brown fox", "jumped over the lazy dog."), window=2)
bigrams(c("I went to tea with her majesty Queen Victoria.", "Does tea have extra caffeine?"))
bigrams(c("I went to tea with her majesty Queen Victoria.", "Does tea have extra caffeine?"),
        ignoredFeatures=stopwords("english"))
bigrams(c("I went to tea with her majesty Queen Victoria.", "Does tea have extra caffeine?"),
        ignoredFeatures=stopwords("english"), skipGrams=TRUE)
