This function calculates, for an increasing sequence of text sizes, the observed number of types, hapax legomena, dis legomena, tris legomena, and selected measures of lexical richness.
growth.fnc(text = languageR::alice, size = 646, nchunks = 40, chunks = 0)
text
A vector of strings representing a text.
size
An integer giving the size of a text chunk when the text is to be split into a series of equally-sized text chunks.
nchunks
An integer denoting the number of desired equally-sized text chunks.
chunks
An integer vector denoting the token sizes for which growth measures are required. When chunks is specified, size and nchunks are ignored.
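The interplay of the three chunking arguments can be sketched in base R. This is only an illustration: chunk_sizes is a hypothetical helper that is not part of languageR, and it assumes that the default chunks = 0 means "not specified" and that equally-sized chunks end at the token counts size, 2 * size, and so on up to nchunks * size.

```r
# Hypothetical helper (not part of languageR): the token sizes at which
# growth measures would be computed under the assumptions stated above.
chunk_sizes <- function(size = 646, nchunks = 40, chunks = 0) {
  if (length(chunks) == 1 && chunks[1] == 0) {
    # chunks left at its default: nchunks equally-sized chunks of `size` tokens
    seq(from = size, by = size, length.out = nchunks)
  } else {
    # explicit token sizes take precedence; size and nchunks are ignored
    sort(as.integer(chunks))
  }
}

default_sizes <- chunk_sizes()                               # 40 cumulative sizes
custom_sizes  <- chunk_sizes(chunks = c(5000, 1000, 20000))  # user-chosen sizes
```

With the defaults shown in the usage line, the last of these sizes is 646 * 40 = 25840 tokens.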
A growth object with methods for plotting and printing. Because running this function on large texts may take some time, a period is printed on the output device for each completed chunk to indicate progress.
The data frame with the actual measures, which can be extracted with object.name@data$data, has the following columns.
Chunk
a numeric vector with chunk numbers.
Tokens
a numeric vector with the number of tokens up to and including the current chunk.
Types
a numeric vector with the number of types up to and including the current chunk.
HapaxLegomena
a numeric vector with the corresponding count of hapax legomena.
DisLegomena
a numeric vector with the corresponding count of dis legomena.
TrisLegomena
a numeric vector with the corresponding count of tris legomena.
Yule
a numeric vector with Yule's K.
Zipf
a numeric vector with the slope of Zipf's rank-frequency curve in the double-logarithmic plane.
TypeTokenRatio
a numeric vector with the ratio of types to tokens.
Herdan
a numeric vector with Herdan's C.
Guiraud
a numeric vector with Guiraud's R.
Sichel
a numeric vector with Sichel's S.
Lognormal
a numeric vector with mean log frequency.
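To make several of these columns concrete, the following base R sketch computes them for a toy text at two cumulative text sizes. It is not the languageR implementation: lexical_measures is a hypothetical helper, and the formulas are the textbook definitions (Yule's K as 10^4 times (sum of squared type frequencies minus N) over N^2, Herdan's C as log V / log N, Guiraud's R as V / sqrt(N), Sichel's S as the proportion of dis legomena among types), which may differ in detail from what growth.fnc computes. The Zipf and Lognormal columns are left out.

```r
# Hypothetical helper (not part of languageR): lexical measures for one
# text sample, using textbook definitions as an assumption.
lexical_measures <- function(tokens) {
  freq <- table(tokens)   # frequency of each type
  N <- length(tokens)     # number of tokens
  V <- length(freq)       # number of types
  data.frame(
    Tokens         = N,
    Types          = V,
    HapaxLegomena  = sum(freq == 1),
    DisLegomena    = sum(freq == 2),
    TrisLegomena   = sum(freq == 3),
    Yule           = 1e4 * (sum(as.numeric(freq)^2) - N) / N^2,
    TypeTokenRatio = V / N,
    Herdan         = log(V) / log(N),
    Guiraud        = V / sqrt(N),
    Sichel         = sum(freq == 2) / V
  )
}

toy <- c("the", "cat", "sat", "on", "the", "mat", "the", "cat")

# Measures "up to and including the current chunk" for chunks ending
# at token 4 and token 8.
growth_sketch <- do.call(
  rbind,
  lapply(c(4, 8), function(n) lexical_measures(toy[1:n]))
)
```

For the full eight-token toy text this yields 5 types, 3 hapax legomena, and a Yule's K of 1250.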
Baayen, R. H. (2001) Word Frequency Distributions, Dordrecht: Kluwer Academic Publishers.
Tweedie, F. J. & Baayen, R. H. (1998) How variable may a constant be? Measures of lexical richness in perspective, Computers and the Humanities, 32, 323-352.
See also plot.growth and the zipfR package.
# NOT RUN {
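library(languageR)
# load the alice text (a vector of words) shipped with languageR
data(alice)
# compute growth measures with the default size and nchunks
alice.growth = growth.fnc(alice)
# inspect the first rows of the data frame with the measures
head(alice.growth@data$data)
# plot the growth curves
plot(alice.growth)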
# }