Frequency spectra included as examples in Baayen (2001).
Baayen2001
A list of 23 frequency spectra, i.e. objects of class spc.
List elements are named according to the original files, but without the extension .spc.
See Baayen (2001, pp. 249-277) for details.
In particular, the following spectra are included:
alice: Lewis Carroll, Alice's Adventures in Wonderland
through: Lewis Carroll, Through the Looking-Glass and What Alice Found There
war: H. G. Wells, The War of the Worlds
hound: Arthur Conan Doyle, The Hound of the Baskervilles
havelaar: E. Douwes Dekker, Max Havelaar
turkish: An archeology text (Turkish)
estonian: A. H. Tammsaare, Truth and Justice (Estonian)
bnc: The context-governed subcorpus of the British National Corpus (BNC)
in1: Sample of 1 million tokens from The Independent
in8: Sample of 8 million tokens from The Independent
heid: Nouns in -heid in the CELEX database (Dutch)
iteit: Nouns in -iteit in the CELEX database (Dutch)
ster: Nouns in -ster in the CELEX database (Dutch)
in: Nouns in -in in the CELEX database (Dutch)
nouns: Simplex nouns in the CELEX database (Dutch)
sing: Singular nouns in M. Innes, The Bloody Wood
plur: Plural nouns in M. Innes, The Bloody Wood
nessw: Nouns in -ness in the written subcorpus of the BNC
nesscg: Nouns in -ness in the context-governed subcorpus of the BNC
nessd: Nouns in -ness in the demographic subcorpus of the BNC
filarial: Counts of filarial worms in mites on rats
cv: Context-vowel patterns in the TIMIT speech database
pairs: Word pairs in E. Douwes Dekker, Max Havelaar
Baayen, R. Harald (2001). Word Frequency Distributions. Kluwer, Dordrecht.
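library(zipfR)    # attach the package that provides the Baayen2001 data set
Baayen2001$alice  # frequency spectrum for Alice's Adventures in Wonderland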
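Beyond printing a single spectrum, the list can be explored as a whole. The following is a minimal sketch, assuming the zipfR package is installed and that the data set is available once the package is attached. N(), V() and Vm() are the zipfR accessors for sample size, vocabulary size and the number of types occurring m times. The variable name alice is chosen here for illustration only.

```r
library(zipfR)  # assumed to be installed; provides Baayen2001 and the spc class

# Names of the list elements (original file names without the .spc extension)
names(Baayen2001)

# Pick one spectrum and inspect it
alice <- Baayen2001$alice
class(alice)    # should include "spc"
N(alice)        # sample size (number of tokens)
V(alice)        # vocabulary size (number of types)
Vm(alice, 1)    # number of hapax legomena (types occurring exactly once)

# Sample size and vocabulary size for all spectra at once
sapply(Baayen2001, N)
sapply(Baayen2001, V)
```

Because every element is an spc object, the same accessors apply to any of the spectra listed above, e.g. Baayen2001$havelaar or Baayen2001$nessw.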