This function is a stripped-down version of lex.div. It does not analyze text, but takes the numbers of tokens and types directly to calculate the measures for which this information is sufficient:
"TTR": The classic Type-Token Ratio
"C": Herdan's C
"R": Guiraud's Root TTR
"CTTR": Carroll's Corrected TTR
"U": Dugast's Uber Index
"S": Summer's index
"Maas": Maas' \(a^2\)
See lex.div for further details on the formulae.
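As a rough illustration of why token and type counts alone are sufficient for these measures, the following base R sketch computes each one from the commonly cited textbook formulas. The helper name `lexdiv_sketch` is hypothetical and not part of koRpus, and the way the package itself computes or reports these values (in particular for Maas) may differ; treat lex.div as the authoritative source.

```r
# Hypothetical helper, NOT part of koRpus: a sketch of the commonly cited
# formulas, with N = number of tokens and V = number of types.
lexdiv_sketch <- function(num.tokens, num.types, log.base = 10) {
  N <- num.tokens
  V <- num.types
  lgN <- log(N, base = log.base)
  lgV <- log(V, base = log.base)
  c(
    TTR  = V / N,                  # Type-Token Ratio
    C    = lgV / lgN,              # Herdan's C
    R    = V / sqrt(N),            # Guiraud's Root TTR
    CTTR = V / sqrt(2 * N),        # Carroll's Corrected TTR
    U    = lgN^2 / (lgN - lgV),    # Dugast's Uber Index
    S    = log(lgV, base = log.base) /
           log(lgN, base = log.base),  # Summer's index
    Maas = (lgN - lgV) / lgN^2     # Maas' a^2 (the reciprocal of U)
  )
}

res <- lexdiv_sketch(104, 43)
print(round(res, 4))
```

Note that U is undefined (division by zero) when the number of types equals the number of tokens, and S requires more than one type; the sketch does not guard against either case.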
lex.div.num(num.tokens, num.types, measure = c("TTR", "C", "R", "CTTR", "U",
"S", "Maas"), log.base = 10, quiet = FALSE)
num.tokens: Numeric, the number of tokens.
num.types: Numeric, the number of types.
measure: A character vector defining the measures to calculate.
log.base: A numeric value defining the base of the logarithm. See log for details.
quiet: Logical. If FALSE, short status messages will be shown. TRUE will also suppress all potential warnings regarding the validation status of measures.
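The choice of logarithm base matters for some measures but not for others. The base R sketch below, using hypothetical helper names and the textbook formulas rather than koRpus code, suggests that Herdan's C (a ratio of two logarithms) is unaffected by the base, whereas Dugast's U scales with it.

```r
# Hypothetical helpers, NOT part of koRpus: textbook formulas only.
herdan_c <- function(N, V, base) log(V, base) / log(N, base)
uber_u   <- function(N, V, base) log(N, base)^2 / (log(N, base) - log(V, base))

N <- 104
V <- 43

c10 <- herdan_c(N, V, 10)       # base 10
ce  <- herdan_c(N, V, exp(1))   # natural logarithm
u10 <- uber_u(N, V, 10)
ue  <- uber_u(N, V, exp(1))

# C is identical for both bases; U differs by a factor of log(10).
print(c(C_base10 = c10, C_natural = ce))
print(c(U_base10 = u10, U_natural = ue))
```

When comparing U or Maas values across studies, it is therefore worth checking that the same base was used.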
An object of class kRp.TTR-class.
Maas, H.-D. (1972). Über den Zusammenhang zwischen Wortschatzumfang und Länge eines Textes. Zeitschrift für Literaturwissenschaft und Linguistik, 2(8), 73--96.
Tweedie, F.J. & Baayen, R.H. (1998). How Variable May a Constant Be? Measures of Lexical Richness in Perspective. Computers and the Humanities, 32(5), 323--352.
lex.div.num(104, 43)