EvertLuedeling2001: Samples of German Word Formation Affixes (zipfR)
Description
Corpus data for measuring the productivity of German word formation affixes
-bar, -lich, -sam, -ös, -tum,
Klein-, -chen and -lein (Evert & Lüdeling 2001).
Data were extracted from two volumes of the German daily newspaper
Stuttgarter Zeitung, then manually cleaned and normalized.
Usage
EvertLuedeling2001
Format
A list of 8 character vectors for the different affixes, with names
klein (Klein-), bar (-bar),
chen (-chen), lein (-lein),
lich (-lich), oes (-ös),
sam (-sam), tum (-tum).
Each vector contains all relevant tokens from the corpus in their
original (chronological) ordering, so vocabulary growth curves can
be determined from the vectors in addition to type frequency lists
and frequency spectra.
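To illustrate why the token order matters, here is a minimal sketch in base R of how a vocabulary growth curve, a type frequency list and a frequency spectrum can all be derived from one token vector. The vector `tokens` is hypothetical toy data, not taken from the data set, and the variable names are illustrative only. The zipfR functions vec2vgc(), vec2tfl() and vec2spc() are the package's own tools for these conversions and should be preferred for real analyses.

```r
# Toy token vector in corpus order (hypothetical data, not from EvertLuedeling2001)
tokens <- c("lesbar", "machbar", "lesbar", "essbar",
            "machbar", "lesbar", "trinkbar")

# Vocabulary growth curve: number of distinct types V after the first N tokens.
# This depends on the original ordering of the tokens.
N <- seq_along(tokens)
V <- cumsum(!duplicated(tokens))
vgc <- data.frame(N = N, V = V)

# Type frequency list: how often each type occurs (order-independent)
tfl <- sort(table(tokens), decreasing = TRUE)

# Frequency spectrum: number of types that occur exactly m times
spc <- table(as.vector(tfl))

print(vgc)
print(tfl)
print(spc)
```

Only the growth curve uses the ordering; the frequency list and the spectrum could be computed from a shuffled vector as well, which is why keeping the tokens in their original sequence adds information.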
References
Evert, Stefan and Lüdeling, Anke (2001).
Measuring morphological productivity: Is automatic preprocessing sufficient?
In Proceedings of the Corpus Linguistics 2001 Conference, pages 167–175, Lancaster, UK.
Examples
# NOT RUN {
library(zipfR)  # provides vec2tfl(), N() and V()
str(EvertLuedeling2001)
# token (N) and type (V) counts for the different affixes
sapply(EvertLuedeling2001, function (x) {
  y <- vec2tfl(x)
  c(N=N(y), V=V(y))
})
# }