syuzhet (version 1.0.6)

get_tokens: Word Tokenization

Description

Parses a string into a vector of word tokens.

Usage

get_tokens(text_of_file, pattern = "\\W", lowercase = TRUE)

Value

A character vector of word tokens.

Arguments

text_of_file

A text string to tokenize.

pattern

A regular expression that matches the characters on which the text is split into tokens. Defaults to "\\W" (any non-word character).

lowercase

Logical. Should tokens be converted to lowercase? Defaults to TRUE.
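
Examples

The following is a minimal sketch of how the three arguments interact. It uses only base R so that it runs without the package installed. The helper tokenize_sketch() and the sample sentence are hypothetical stand-ins written for illustration, assuming the function lowercases the text, splits on the pattern, and drops empty strings. It is not taken from the syuzhet source, and the actual implementation may differ in detail.

```r
# Hypothetical base-R stand-in for illustration; not the syuzhet source.
tokenize_sketch <- function(text_of_file, pattern = "\\W", lowercase = TRUE) {
  if (lowercase) {
    text_of_file <- tolower(text_of_file)
  }
  # Split on every match of the pattern and flatten to a single vector.
  tokens <- unlist(strsplit(text_of_file, pattern))
  # Adjacent delimiters (e.g. ". ") produce empty strings; drop them.
  tokens[tokens != ""]
}

sample_text <- "Call me Ishmael. Some years ago, never mind how long precisely."

# Defaults: split on non-word characters, convert to lowercase.
default_tokens <- tokenize_sketch(sample_text)

# Keep the original capitalization.
cased_tokens <- tokenize_sketch(sample_text, lowercase = FALSE)

# Split on whitespace only, so punctuation stays attached to the words.
space_tokens <- tokenize_sketch(sample_text, pattern = "\\s+")

print(default_tokens)
print(cased_tokens)
print(space_tokens)
```

With syuzhet loaded, the corresponding calls would use the signature shown under Usage, for example get_tokens(sample_text) or get_tokens(sample_text, pattern = "\\s+", lowercase = FALSE).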