languageR (version 1.5.0)

english: English visual lexical decision and naming latencies

Description

This data set gives mean visual lexical decision latencies and word naming latencies to 2284 monomorphemic English nouns and verbs, averaged for old and young subjects, with various predictor variables.

Usage

data(english)

Format

A data frame with 4568 observations (2284 words, each with one row per age group) on the following variables.

RTlexdec

numeric vector of log RT in visual lexical decision.

RTnaming

numeric vector of log RT in word naming.

Familiarity

numeric vector of subjective familiarity ratings.

Word

a factor with 2284 words.

AgeSubject

a factor giving the age group of the subjects, with levels young and old.

WordCategory

a factor giving the word category, with levels N (noun) and V (verb).

WrittenFrequency

numeric vector with log frequency in the CELEX lexical database.

WrittenSpokenFrequencyRatio

numeric vector with the logged ratio of written frequency (CELEX) to spoken frequency (British National Corpus).

FamilySize

numeric vector with log morphological family size.

DerivationalEntropy

numeric vector with derivational entropy.

InflectionalEntropy

numeric vector with inflectional entropy.
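
Both entropy measures summarize how a word's frequency mass is spread over related forms. As a rough sketch, assuming the standard Shannon entropy in bits over relative frequencies (the exact counts and log base behind these two columns are not stated on this page), the computation looks like the following. The function name `shannon_entropy` and the frequencies are made up for illustration.

```r
# Shannon entropy (in bits) of a vector of non-negative frequencies.
# `shannon_entropy` is an illustrative helper, not a languageR function.
shannon_entropy <- function(freq) {
  p <- freq[freq > 0] / sum(freq)   # zero counts contribute nothing
  -sum(p * log2(p))
}

# Hypothetical token frequencies of the inflected forms of one lemma
inflected <- c(walk = 400, walks = 100, walked = 300, walking = 200)
print(shannon_entropy(inflected))

# A uniform distribution over four forms gives the maximum, log2(4) bits
print(shannon_entropy(c(1, 1, 1, 1)))
```

Entropy is highest when the forms are equally frequent and drops toward zero as one form dominates.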

NumberSimplexSynsets

numeric vector with the log-transformed count of synonym sets in WordNet in which the word is listed.

NumberComplexSynsets

numeric vector with the log-transformed count of synonym sets in WordNet in which the word is listed as part of a compound.

LengthInLetters

numeric vector with length of the word in letters.

Ncount

numeric vector with orthographic neighborhood density, defined as the number of lemmas in CELEX with the same length (in letters) at Hamming distance 1.
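
To make the neighborhood-density definition concrete, the base-R sketch below counts, for each word in a small made-up lexicon, the other words of the same length that differ in exactly one letter position. The names `hamming_neighbors` and `toy_lexicon` are hypothetical; the values in `Ncount` were computed over CELEX lemmas, not with this code.

```r
# Count orthographic neighbors: same length, Hamming distance exactly 1.
# `hamming_neighbors` and `toy_lexicon` are illustrative, not part of languageR.
hamming_neighbors <- function(word, lexicon) {
  # only other words of equal length can be at a defined Hamming distance
  candidates <- setdiff(lexicon[nchar(lexicon) == nchar(word)], word)
  if (length(candidates) == 0) return(0L)
  target <- strsplit(word, "")[[1]]
  # number of mismatching letter positions for each candidate
  dists <- vapply(strsplit(candidates, ""),
                  function(chars) sum(chars != target),
                  integer(1))
  sum(dists == 1L)
}

toy_lexicon <- c("cat", "cot", "cut", "hat", "dog", "cart")
ncount <- vapply(toy_lexicon, hamming_neighbors, integer(1),
                 lexicon = toy_lexicon)
print(ncount)
```

In this toy lexicon, "cat" has three neighbors ("cot", "cut", "hat"), while "dog" and "cart" have none.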

MeanBigramFrequency

numeric vector with mean log bigram frequency.

FrequencyInitialDiphone

numeric vector with log frequency of initial diphone.

ConspelV

numeric vector with type count of orthographic neighbors.

ConspelN

numeric vector with token count of orthographic neighbors.

ConphonV

numeric vector with type count of phonological neighbors.

ConphonN

numeric vector with token count of phonological neighbors.

ConfriendsV

numeric vector with type counts of consistent words.

ConfriendsN

numeric vector with token counts of consistent words.

ConffV

numeric vector with type count of forward inconsistent words.

ConffN

numeric vector with token count of forward inconsistent words.

ConfbV

numeric vector with type count of backward inconsistent words.

ConfbN

numeric vector with token count of backward inconsistent words.

NounFrequency

numeric vector with the frequency of the word used as noun.

VerbFrequency

numeric vector with the frequency of the word used as verb.

CV

factor specifying whether the initial phoneme of the word is a consonant (C) or a vowel (V).

Obstruent

factor specifying whether the initial phoneme of the word is a continuant (cont) or an obstruent (obst).

Frication

factor specifying whether the initial phoneme has a burst (burst) or frication (frication) for consonant-initial words, and for vowel-initial words whether the vowel is long or short.

Voice

factor indicating whether the initial phoneme is voiced or voiceless.

FrequencyInitialDiphoneWord

numeric vector with the log-transformed frequency of the initial diphone given that it is word-initial.

FrequencyInitialDiphoneSyllable

numeric vector with the log-transformed frequency of the initial diphone given that it is syllable-initial.

CorrectLexdec

numeric vector with the proportion of subjects that accepted the item as a word in lexical decision.

References

Balota, D., Cortese, M., Sergent-Marshall, S., Spieler, D. and Yap, M. (2004) Visual word recognition for single-syllable words, Journal of Experimental Psychology: General, 133, 283-316.

Baayen, R.H., Feldman, L. and Schreuder, R. (2006) Morphological influences on the recognition of monosyllabic monomorphemic words, Journal of Memory and Language, 55, 290-313.

Examples

# NOT RUN {
data(english)

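# ---- orthogonalize orthographic consistency measures

# Item-level predictors are the same for both age groups, so the PCA is
# run on the young-subject rows only (one row per word).
items = english[english$AgeSubject == "young",]
# Columns 18:27 are the ten consistency measures, ConspelV through ConfbN,
# in the column order listed under Format.
items.pca = prcomp(items[ , c(18:27)], center = TRUE, scale. = TRUE)
# Loadings of the measures on the first four components, kept for inspection.
x = as.data.frame(items.pca$rotation[,1:4])
items$PC1 = items.pca$x[,1]
items$PC2 = items.pca$x[,2]
items$PC3 = items.pca$x[,3]
items$PC4 = items.pca$x[,4]
# Copy the same scores to the old-subject rows; this assumes the words
# appear in the same order in both subsets.
items2 = english[english$AgeSubject != "young", ]
items2$PC1 = items.pca$x[,1]
items2$PC2 = items.pca$x[,2]
items2$PC3 = items.pca$x[,3]
items2$PC4 = items.pca$x[,4]
english = rbind(items, items2)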

# ---- add Noun-Verb frequency ratio

english$NVratio = log(english$NounFrequency+1)-log(english$VerbFrequency+1)

# ---- build model with ols() from rms

library(rms)
english.dd = datadist(english)
options(datadist = 'english.dd')

english.ols = ols(RTlexdec ~ Voice + PC1 + MeanBigramFrequency + 
   rcs(WrittenFrequency, 5) + rcs(WrittenSpokenFrequencyRatio, 3) + 
   NVratio + WordCategory + AgeSubject +
   rcs(FamilySize, 3) + InflectionalEntropy + 
   NumberComplexSynsets + rcs(WrittenFrequency, 5) : AgeSubject,
   data = english, x = TRUE, y = TRUE)

# ---- plot partial effects

plot(Predict(english.ols))

# ---- validate the model

validate(english.ols, bw = TRUE, B = 200)

# }
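
The example above copies the principal component scores from the young-subject rows to the old-subject rows by position, which relies on both subsets listing the words in the same order. A minimal sketch of a key-based alternative is shown below on a small made-up data frame rather than on `english` itself; the data and the names `toy`, `young`, `measures`, and `scores` are hypothetical. It looks up each row's scores by `Word` with `match()`, so row order no longer matters.

```r
set.seed(1)
words <- c("ant", "bee", "cow", "doe", "elk", "fox")

# Item-level measures: one value per word, shared by both age groups.
measures <- data.frame(Word = words,
                       m1 = rnorm(6), m2 = rnorm(6), m3 = rnorm(6))

# Two rows per word; the "old" rows are deliberately shuffled.
toy <- rbind(cbind(measures, AgeSubject = "young"),
             cbind(measures[sample(6), ], AgeSubject = "old"))

# PCA on one row per word, as in the example above.
young <- toy[toy$AgeSubject == "young", ]
pca <- prcomp(young[, c("m1", "m2", "m3")], center = TRUE, scale. = TRUE)

# Keep the scores together with their key, then look them up by Word.
scores <- data.frame(Word = young$Word, PC1 = pca$x[, 1], PC2 = pca$x[, 2])
idx <- match(toy$Word, scores$Word)
toy$PC1 <- scores$PC1[idx]
toy$PC2 <- scores$PC2[idx]

print(toy[order(toy$Word), c("Word", "AgeSubject", "PC1", "PC2")])
```

With this approach each word receives identical component scores in both age groups even though the old-subject rows are in a different order.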
