get_tokens: Word Tokenization
Description
Parses a string into a vector of word tokens.
Usage
get_tokens(text_of_file, pattern = "\\W", lowercase = TRUE)
Value
A character vector of word tokens.
Arguments
- text_of_file
A text string to tokenize.
- pattern
A regular expression that marks where the text is split into tokens. The default, "\\W", matches any non-word character.
- lowercase
Logical. Should tokens be converted to lowercase? Defaults to TRUE.
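Examples
The sketch below illustrates how the three arguments interact. It uses only base R, and tokenize_sketch is a hypothetical stand-in written for this example, not the package's own code. It assumes that lowercasing is applied before splitting and that empty strings produced by adjacent separators are dropped; the real function may differ on both points.

```r
# Hypothetical base-R stand-in for illustration; not the package's implementation.
tokenize_sketch <- function(text_of_file, pattern = "\\W", lowercase = TRUE) {
  if (lowercase) {
    text_of_file <- tolower(text_of_file)
  }
  # Split wherever the regular expression matches.
  tokens <- unlist(strsplit(text_of_file, pattern))
  # Assumption: drop the empty strings left between adjacent separators.
  tokens[nzchar(tokens)]
}

# Default pattern: split on every non-word character, lowercase the result.
tokens <- tokenize_sketch("The quick, brown fox.")
# "the" "quick" "brown" "fox"

# Custom pattern: split on whitespace only, keep the original case.
spaced <- tokenize_sketch("Don't stop", pattern = "\\s+", lowercase = FALSE)
# "Don't" "stop"
```

With the default pattern an apostrophe counts as a separator, so a contraction such as "Don't" would be split into two tokens. Supplying a whitespace pattern, as in the second call, keeps it whole.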