One-hot encode a text into a list of word indexes in a vocabulary of size n.
text_one_hot(text, n, filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n",
  lower = TRUE, split = " ")
text     Input text (string).
n        Size of vocabulary (integer).
filters  Sequence of characters to filter out.
lower    Whether to convert the input to lowercase.
split    Sentence split marker (string).
List of integers in [1, n]. Each integer encodes a word (uniqueness is not guaranteed: two different words may receive the same integer).
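Two different words can share an index because indexes come from hashing each word into a fixed range rather than from a lookup table. The sketch below illustrates that idea in base R only. sketch_one_hot and toy_hash are hypothetical names, and the polynomial hash is an assumption chosen for readability, so the indexes it produces will not match those returned by text_one_hot.

```r
# Hypothetical helper: a base-R sketch of hash-based word indexing.
# It mirrors the documented arguments but is NOT the package implementation.
sketch_one_hot <- function(text, n,
                           filters = "!\"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n",
                           lower = TRUE, split = " ") {
  if (lower) text <- tolower(text)

  # Replace every filtered character with the split marker.
  filter_chars <- strsplit(filters, "", fixed = TRUE)[[1]]
  chars <- strsplit(text, "", fixed = TRUE)[[1]]
  chars[chars %in% filter_chars] <- split

  # Split into words and drop the empty strings left by repeated markers.
  words <- strsplit(paste(chars, collapse = ""), split, fixed = TRUE)[[1]]
  words <- words[nzchar(words)]

  # Toy polynomial hash (an assumption), reduced modulo n and shifted into [1, n].
  toy_hash <- function(word) {
    h <- 0
    for (code in utf8ToInt(word)) h <- (h * 31 + code) %% n
    as.integer(h) + 1L
  }

  vapply(words, toy_hash, integer(1), USE.NAMES = FALSE)
}

idx <- sketch_one_hot("The cat sat on the mat.", n = 50)
print(idx)
```

With lower = TRUE, "The" and "the" hash to the same index, as expected. With a small n, unrelated words will also start to collide, which is why the mapping cannot be inverted reliably. Choosing n well above the number of distinct words reduces collisions but does not eliminate them.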
Other text preprocessing: make_sampling_table, pad_sequences, skipgrams, text_hashing_trick, text_to_word_sequence