This function is a stripped-down version of lex.div. It does not analyze text, but takes the numbers of tokens and types directly to calculate the measures for which this information is sufficient:
"TTR": The classic Type-Token Ratio
"C": Herdan's C
"R": Guiraud's Root TTR
"CTTR": Carroll's Corrected TTR
"U": Dugast's Uber Index
"S": Summer's index
"Maas": Maas' (\(a^2\))
See lex.div for further details on the formulae.
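As a rough illustration of why token and type counts are sufficient for these measures, the following base R sketch computes them from the formulae as they are commonly given in the literature cited below. The helper name `lexdiv_sketch` is hypothetical and not part of the package, and the formulae are an assumption about what lex.div implements; its documentation remains the authoritative source.

```r
# Hypothetical helper, NOT part of koRpus: computes the listed measures
# from raw counts, using the formulae as commonly stated in the literature.
lexdiv_sketch <- function(num.tokens, num.types, log.base = 10) {
  N <- num.tokens
  V <- num.types
  lgN <- log(N, base = log.base)
  lgV <- log(V, base = log.base)
  c(
    TTR  = V / N,                   # type-token ratio
    C    = lgV / lgN,               # Herdan's C
    R    = V / sqrt(N),             # Guiraud's Root TTR
    CTTR = V / sqrt(2 * N),         # Carroll's Corrected TTR
    U    = lgN^2 / (lgN - lgV),     # Dugast's Uber Index
    S    = log(lgV, base = log.base) / log(lgN, base = log.base),  # Summer's index
    Maas = (lgN - lgV) / lgN^2      # Maas' a^2 (assumed to be the squared form)
  )
}

# Same counts as in the example at the end of this page
res <- lexdiv_sketch(104, 43)
print(round(res, 3))
```

Under these formulae, U and Maas' \(a^2\) are reciprocals of each other, which gives a quick sanity check on any hand calculation.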
lex.div.num(
num.tokens,
num.types,
measure = c("TTR", "C", "R", "CTTR", "U", "S", "Maas"),
log.base = 10,
quiet = FALSE
)
num.tokens: Numeric, the number of tokens.
num.types: Numeric, the number of types.
measure: A character vector defining the measures to calculate.
log.base: A numeric value defining the base of the logarithm. See log for details.
quiet: Logical. If FALSE, short status messages will be shown. TRUE will also suppress all potential warnings regarding the validation status of measures.
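The choice of logarithm base matters for some measures and not for others. The base R sketch below assumes the commonly cited formulae for Herdan's C and Dugast's U (it does not call the package, and the helper names `herdan_c` and `uber_u` are hypothetical): a ratio of two logarithms is unaffected by the base, while U scales with it.

```r
# Hypothetical helpers, NOT part of koRpus; formulae are assumptions
# taken from the general literature on lexical diversity.
herdan_c <- function(N, V, base) log(V, base = base) / log(N, base = base)
uber_u   <- function(N, V, base) {
  log(N, base = base)^2 / (log(N, base = base) - log(V, base = base))
}

N <- 104
V <- 43

# Herdan's C: the base cancels out of the ratio
c_base10 <- herdan_c(N, V, base = 10)
c_basee  <- herdan_c(N, V, base = exp(1))
print(all.equal(c_base10, c_basee))

# Dugast's U: switching from base 10 to base e multiplies U by log(10)
u_base10 <- uber_u(N, V, base = 10)
u_basee  <- uber_u(N, V, base = exp(1))
print(all.equal(u_basee / u_base10, log(10)))
```

Values of base-dependent measures are therefore only comparable when they were computed with the same log.base.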
An object of class kRp.TTR.
Maas, H.-D. (1972). Über den Zusammenhang zwischen Wortschatzumfang und Länge eines Textes. Zeitschrift für Literaturwissenschaft und Linguistik, 2(8), 73–96.
Tweedie, F.J. & Baayen, R.H. (1998). How Variable May a Constant Be? Measures of Lexical Richness in Perspective. Computers and the Humanities, 32(5), 323–352.
# NOT RUN {
lex.div.num(104, 43)
# }