dials (version 0.1.0)

vocabulary_size: Number of tokens in vocabulary

Description

A tuning parameter for the number of tokens in a tokenizer's vocabulary. Used in textrecipes::step_tokenize_sentencepiece() and textrecipes::step_tokenize_bpe().

Usage

vocabulary_size(range = c(1000L, 32000L), trans = NULL)

Arguments

range

A two-element vector holding the defaults for the smallest and largest possible values, respectively.

trans

A trans object from the scales package, such as scales::log10_trans() or scales::reciprocal_trans(). If NULL (the default), no transformation is applied. When a transformation is supplied, range must be given in the transformed units.
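
The interaction between range and trans can be easier to see in code. The following is a minimal sketch, assuming the dials and scales packages are installed; the specific ranges (2000 to 8000 tokens, and 3 to 4 in log10 units) are illustrative choices, not defaults from the package.

```r
library(dials)

# No transformation: range is given directly in tokens.
vs_plain <- vocabulary_size(range = c(2000L, 8000L))
range_get(vs_plain)

# With a log10 transformation, range is given in log10 units,
# so c(3L, 4L) corresponds to 10^3 = 1000 up to 10^4 = 10000 tokens.
vs_log <- vocabulary_size(range = c(3L, 4L), trans = scales::log10_trans())

# value_seq() returns values on the original (token) scale by default.
log_values <- value_seq(vs_log, n = 3)
log_values
```

On a log scale the candidate values are spaced multiplicatively, which may suit vocabulary sizes that span an order of magnitude.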

Examples

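# Requires the dials package.
library(dials)

# Create the parameter object with its default range of 1000 to 32000 tokens.
vocabulary_size()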
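
As a further sketch, the parameter object can be passed to the general-purpose dials helpers to generate candidate values for tuning. This assumes the dials package is installed; the seed, the number of samples, and the number of grid levels are arbitrary choices for illustration.

```r
library(dials)

vs <- vocabulary_size()

# Draw random candidate values from within the default range.
set.seed(123)
sampled <- value_sample(vs, n = 5)
sampled

# Build a regular grid of candidate values; the result is a tibble
# with one column named after the parameter.
grid <- grid_regular(vs, levels = 4)
grid
```

A grid like this is typically what gets handed to a tuning routine when vocabulary_size is marked for tuning in a preprocessing step.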