writtenVariationLijk: Variation in written Dutch in the use of the suffix -lijk

Description

This dataset documents variation in the use of the 80 most frequent words ending in the suffix -lijk in written Dutch.

Usage

data(writtenVariationLijk)

Format

A data frame with 560 observations on the following 5 variables.

Corpus: a factor whose levels are the sampled newspapers: belang (Het Belang van Limburg), gazet (De Gazet van Antwerpen), laatnieu (Het Laatste Nieuws), limburg (De Limburger), nrc (NRC Handelsblad), stand (De Standaard), and tele (De Telegraaf).
Word: a factor whose levels are the 80 most frequent words ending in -lijk.
Count: a numeric vector with token counts in the CONDIV corpus.
Country: a factor with levels Flanders and Netherlands.
Register: a factor with levels National, Quality and Regional coding the type of newspaper.
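
A minimal sketch for checking the structure described above before fitting any model. It assumes the dataset is distributed with the languageR package; that package name is an assumption, since it is not stated on this page. The column names are those listed above.

```r
# Assumption: the dataset ships with the languageR package.
if (requireNamespace("languageR", quietly = TRUE)) {
  data(writtenVariationLijk, package = "languageR")

  # Column types and the first few values of each variable
  str(writtenVariationLijk)

  # Number of rows per newspaper (each word should occur once per newspaper)
  print(table(writtenVariationLijk$Corpus))

  # Total token counts cross-classified by Country and Register
  print(xtabs(Count ~ Country + Register, data = writtenVariationLijk))
}
```

The cross-tabulation shows which combinations of Country and Register are actually present, which is worth knowing before interpreting a Country * Register interaction.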

References

Keune, K., Ernestus, M., Van Hout, R. and Baayen, R.H. (2005) Social, geographical, and register variation in Dutch: From written 'mogelijk' to spoken 'mok', Corpus Linguistics and Linguistic Theory, 1, 183-223.

Examples

if (FALSE) {
data(writtenVariationLijk)

require(lme4)      # glmer() and glmerControl()
require(lmerTest)
require(optimx)    # required by optimizer = "optimx" below

# Poisson mixed model with by-word random intercepts
writtenVariationLijk.lmer = glmer(Count ~ Country * Register + (1|Word), 
  control=glmerControl(optimizer="optimx",optCtrl=list(method="nlminb")),
  data = writtenVariationLijk, family = "poisson")

# Same fixed effects, adding by-word random slopes for Country
writtenVariationLijk.lmerA = glmer(Count ~ Country * Register + (Country|Word), 
  control=glmerControl(optimizer="optimx",optCtrl=list(method="nlminb")),
  data = writtenVariationLijk, family = "poisson")

# Likelihood ratio test: do the by-word slopes for Country improve the fit?
anova(writtenVariationLijk.lmer, writtenVariationLijk.lmerA)

summary(writtenVariationLijk.lmerA)
}