corpora (version 0.6)

LOBStats: Basic statistics of texts in the LOB corpus

Description

This data set provides some basic quantitative measures for all texts in the LOB corpus of written British English (Johansson et al. 1978).

Usage

LOBStats

Format

A data frame with 500 rows and the following columns:

ty:

number of distinct types

to:

number of tokens (including punctuation)

se:

number of sentences

towl:

mean word length in characters, averaged over tokens

tywl:

mean word length in characters, averaged over types
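
Examples

The columns above can be combined into common per-text measures. The following is a minimal sketch, not part of the package documentation: the derived column names `ttr` and `sent_len` are illustrative assumptions, and the fallback data frame holds made-up values that only stand in for the real data when the corpora package is not installed.

```r
# Sketch: derive per-text measures from the documented columns.
# Assumes the corpora package is installed; otherwise a tiny made-up
# stand-in with the same column names is used so the code still runs.
if (requireNamespace("corpora", quietly = TRUE)) {
  data("LOBStats", package = "corpora")
  stats <- LOBStats
} else {
  # Hypothetical toy values, NOT taken from the LOB corpus
  stats <- data.frame(
    ty   = c(800, 750),
    to   = c(2300, 2250),
    se   = c(100, 90),
    towl = c(4.2, 4.4),
    tywl = c(6.1, 6.3)
  )
}

# Type-token ratio: distinct types per token (hypothetical column name)
stats$ttr <- stats$ty / stats$to

# Mean sentence length in tokens; punctuation is counted, as in `to`
# (hypothetical column name)
stats$sent_len <- stats$to / stats$se

# Overview of the derived and the provided length measures
summary(stats[, c("ttr", "sent_len", "towl", "tywl")])
```

Because `to` includes punctuation, `sent_len` measures tokens per sentence rather than words per sentence.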

Author

Marco Baroni <baroni@sslmit.unibo.it>

References

Johansson, Stig; Leech, Geoffrey; Goodluck, Helen (1978). Manual of information to accompany the Lancaster-Oslo/Bergen corpus of British English, for use with digital computers. Technical report, Department of English, University of Oslo, Oslo.

See Also

BrownStats