corpora (version 0.6)

BrownStats: Basic statistics of texts in the Brown corpus

Description

This data set provides some basic quantitative measures for all texts in the Brown corpus of written American English (Francis & Kucera 1964).

Usage

BrownStats

Format

A data frame with 500 rows and the following columns:

ty:

number of distinct types

to:

number of tokens (including punctuation)

se:

number of sentences

towl:

mean word length in characters, averaged over tokens

tywl:

mean word length in characters, averaged over types
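
The columns above can be combined into derived per-text measures. The following is a minimal sketch, not part of the package: the helper name add_derived_measures and the new column names ttr and sl are assumptions made for illustration, and the toy rows contain made-up values that only show the expected shape of the input.

```r
# Hypothetical helper (not part of corpora): adds two derived measures
# to a data frame that has the columns ty, to and se described above.
add_derived_measures <- function(stats) {
  stopifnot(all(c("ty", "to", "se") %in% names(stats)))
  stats$ttr <- stats$ty / stats$to  # type-token ratio
  stats$sl  <- stats$to / stats$se  # mean sentence length in tokens
  stats
}

# Toy rows with made-up values, used only to illustrate the input shape
toy <- data.frame(ty = c(800, 650), to = c(2400, 2300), se = c(100, 115))
toy <- add_derived_measures(toy)

# If the corpora package is installed, apply the same helper to the real data
if (requireNamespace("corpora", quietly = TRUE)) {
  data(BrownStats, package = "corpora")
  brown <- add_derived_measures(BrownStats)
  print(summary(brown[, c("ttr", "sl")]))
}
```

Because to counts punctuation as tokens, both ratios are affected by how heavily a text is punctuated, so they are best compared between texts tokenised in the same way.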

Author

Marco Baroni <baroni@sslmit.unibo.it>

References

Francis, W. N. and Kucera, H. (1964). Manual of information to accompany a standard sample of present-day edited American English, for use with digital computers. Technical report, Department of Linguistics, Brown University, Providence, RI.

See Also

LOBStats