Data sets used mainly for internal purposes by the quanteda package.
data_int_syllables
data_char_wordlists
An object of class integer
of length 133245.
data_int_syllables
provides an English-language syllables dictionary;
it is an integer vector whose element names correspond to English words.
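Because the syllable counts are stored as a named integer vector, a lookup is ordinary indexing by name. The sketch below uses a small toy vector in place of the packaged data; `toy_syllables` and `lookup_syllables()` are hypothetical names introduced only for illustration, and lower-casing the input is an assumption about how the names are stored.

```r
# Toy stand-in for data_int_syllables: a named integer vector.
# The three entries are illustrative, not copied from the packaged data.
toy_syllables <- c(the = 1L, formula = 3L, readability = 5L)

# Hypothetical helper: index the vector by (lower-cased) word.
# Words that are not element names come back as NA.
lookup_syllables <- function(words, dict) {
  unname(dict[tolower(words)])
}

counts <- lookup_syllables(
  c("The", "readability", "formula", "quanteda"),
  toy_syllables
)
print(counts)
```

A word missing from the vector yields NA, so a caller would need some fallback rule for out-of-dictionary words.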
data_char_wordlists
provides word lists used in some readability indexes;
it is a named list of character vectors where each list element
corresponds to a different readability index.
These are:
DaleChall
The long Dale-Chall list of 3,000 familiar (English) words needed to compute the Dale-Chall Readability Formula.
Spache
The revised Spache word list (see Klare 1975, 73) needed to compute the Spache Revised Formula of readability (Spache 1974).
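As a rough illustration of how a named list of word lists might be used, the sketch below computes the share of tokens that do not appear on a chosen list, the kind of quantity that list-based readability formulas build on. `toy_wordlists` and `share_unfamiliar()` are hypothetical, the words are placeholders rather than the actual list contents, and this is not necessarily how quanteda computes its indexes.

```r
# Toy stand-in for data_char_wordlists: a named list of character vectors,
# one element per readability index. The words are placeholders only.
toy_wordlists <- list(
  DaleChall = c("a", "about", "dog", "house"),
  Spache    = c("a", "dog", "run")
)

# Hypothetical helper: proportion of tokens not found in the chosen list.
share_unfamiliar <- function(tokens, wordlists, index) {
  mean(!(tolower(tokens) %in% wordlists[[index]]))
}

tokens <- c("A", "dog", "ran", "about", "the", "house")
dc <- share_unfamiliar(tokens, toy_wordlists, "DaleChall")
sp <- share_unfamiliar(tokens, toy_wordlists, "Spache")
print(c(DaleChall = dc, Spache = sp))
```

Selecting the list element by name mirrors the structure described above, where each element corresponds to a different readability index.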
Chall, J.S., & Dale, E. (1995). Readability Revisited: The New Dale-Chall Readability Formula. Brookline Books.
Dale, E. & Chall, J.S. (1948). A Formula for Predicting Readability. Educational Research Bulletin, 27(1), 11--20.
Dale, E. & Chall, J.S. (1948). A Formula for Predicting Readability: Instructions. Educational Research Bulletin, 27(2), 37--54.
Klare, G.R. (1975). Assessing Readability. Reading Research Quarterly, 10(1), 62--102.
Spache, G. (1953). A New Readability Formula for Primary-Grade Reading Materials. The Elementary School Journal, 53, 410--413.
Tränkle, U. & Bailer, H. (1984). Kreuzvalidierung und Neuberechnung von Lesbarkeitsformeln für die deutsche Sprache. Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 16(3), 231--244.
Wheeler, L.R. & Smith, E.H. (1954). A Practical Readability Formula for the Classroom Teacher in the Primary Grades. Elementary English, 31, 397--399.