languageR (version 1.5.0)

lexicalMeasures: Lexical measures for 2233 English monomorphemic words

Description

Lexical distributional measures for 2233 English monomorphemic words. This dataset provides a subset of the data available in the dataset english.

Usage

data(lexicalMeasures)

Format

A data frame with 2233 observations on the following 24 variables.

Word

a factor with 2233 words.

CelS

numeric vector with log-transformed lemma frequency in the CELEX lexical database.

Fdif

numeric vector with the logged ratio of written frequency (CELEX) to spoken frequency (British National Corpus).

Vf

numeric vector with log morphological family size.

Dent

numeric vector with derivational entropy.

Ient

numeric vector with inflectional entropy.

NsyS

numeric vector with the log-transformed count of synonym sets in WordNet in which the word is listed.

NsyC

numeric vector with the log-transformed count of synonym sets in WordNet in which the word is listed as part of a compound.

Len

numeric vector with length of the word in letters.

Ncou

numeric vector with orthographic neighborhood density.

Bigr

numeric vector with mean log bigram frequency.

InBi

numeric vector with log frequency of initial diphone.

spelV

numeric vector with type count of orthographic neighbors.

spelN

numeric vector with token count of orthographic neighbors.

phonV

numeric vector with type count of phonological neighbors.

phonN

numeric vector with token count of phonological neighbors.

friendsV

numeric vector with type counts of consistent words.

friendsN

numeric vector with token counts of consistent words.

ffV

numeric vector with type count of forward inconsistent words.

ffN

numeric vector with token count of forward inconsistent words.

fbV

numeric vector with type count of backward inconsistent words.

fbN

numeric vector with token count of backward inconsistent words.

ffNonzero

a numeric vector with the count of forward inconsistent words with nonzero frequency.

NVratio

a numeric vector with the logarithmically transformed ratio of the noun and verb frequencies.
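The format described above can be checked directly once the data are loaded. The following is a minimal sketch, assuming the languageR package is installed; the guard around the call and the choice of columns to summarise are illustrative and not part of the package documentation.

```r
# Inspect the data frame only if languageR is available on this machine
if (requireNamespace("languageR", quietly = TRUE)) {
  data(lexicalMeasures, package = "languageR")

  # Number of observations and variables
  print(dim(lexicalMeasures))

  # Word is stored as a factor; the remaining columns are numeric
  str(lexicalMeasures$Word)

  # Quick look at three of the measures (an arbitrary selection)
  print(summary(lexicalMeasures[, c("CelS", "Len", "NVratio")]))
} else {
  message("Package 'languageR' is not installed; skipping.")
}
```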

References

Baayen, R.H., Feldman, L. and Schreuder, R. (2006) Morphological influences on the recognition of monosyllabic monomorphemic words, Journal of Memory and Language, 53, 496-512.

Examples

# NOT RUN {
library(languageR)
data(lexicalMeasures)
data(lexicalMeasuresClasses)

library(rms)
library(cluster)  # diana(), pltree()

# Variable clustering of the 23 numeric measures (column 1 is Word)
plot(varclus(as.matrix(lexicalMeasures[,-1])))

# Divisive clustering on distances between squared Spearman correlations
lexicalMeasures.cor = cor(lexicalMeasures[,-1], method = "spearman")^2
lexicalMeasures.dist = dist(lexicalMeasures.cor)
pltree(diana(lexicalMeasures.dist))

# Cut the tree into 5 clusters and list each measure with its class
x = data.frame(measure = rownames(lexicalMeasures.cor),
  cluster = cutree(diana(lexicalMeasures.dist), 5),
  class = lexicalMeasuresClasses$Class)
x = x[order(x$cluster), ]
x
# }
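
The same sequence of steps (squared Spearman correlations, distances between the rows of the correlation matrix, a cluster tree, and a cut into groups) can be tried without any add-on packages. The sketch below is an illustration under stated assumptions: the data are simulated, the column names freqA, freqB, lenA and lenB are invented, and base R's agglomerative hclust() stands in for the divisive diana() used in the example above.

```r
# Simulated stand-in for lexicalMeasures[, -1]; the column names and the
# correlation structure are invented for illustration only.
set.seed(1)
n <- 200
freq <- rnorm(n)  # latent "frequency-like" variable
len  <- rnorm(n)  # latent "length-like" variable
sim <- data.frame(
  freqA = freq + rnorm(n, sd = 0.3),
  freqB = freq + rnorm(n, sd = 0.3),
  lenA  = len  + rnorm(n, sd = 0.3),
  lenB  = len  + rnorm(n, sd = 0.3)
)

# Squared Spearman correlations between the simulated measures
sim.cor <- cor(sim, method = "spearman")^2

# Euclidean distances between the rows of the correlation matrix
sim.dist <- dist(sim.cor)

# Agglomerative clustering with hclust(); this replaces diana(),
# which performs divisive clustering
sim.hc <- hclust(sim.dist)

# Cut the tree into two groups and list the measures by cluster
groups <- cutree(sim.hc, k = 2)
x <- data.frame(measure = names(groups), cluster = groups)
x <- x[order(x$cluster), ]
print(x)
```

Because hclust() builds the tree bottom-up while diana() splits top-down, the two methods need not produce the same grouping on real data; the sketch only shows the shape of the workflow.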
