corpora (version 0.6)

chisq: Pearson's chi-squared statistic for frequency comparisons (corpora)

Description

This function computes Pearson's chi-squared statistic (often written as \(X^2\)) for frequency comparison data, with or without Yates' continuity correction. The implementation is based on the formula given by Evert (2004, 82).

Usage

chisq(k1, n1, k2, n2, correct = TRUE, one.sided=FALSE)

Value

The chi-squared statistic \(X^2\) corresponding to the specified data (or a vector of \(X^2\) values). This statistic has a chi-squared distribution with \(df=1\) under the null hypothesis of equal proportions.

Arguments

k1

frequency of a type in the first corpus (or an integer vector of type frequencies)

n1

the sample size of the first corpus (or an integer vector specifying the sizes of different samples)

k2

frequency of the type in the second corpus (or an integer vector of type frequencies, in parallel to k1)

n2

the sample size of the second corpus (or an integer vector specifying the sizes of different samples, in parallel to n1)

correct

if TRUE, apply Yates' continuity correction (default)

one.sided

if TRUE, compute the signed square root of \(X^2\) as a statistic for a one-sided test (see details below; the default value is FALSE)

Author

Stephanie Evert (https://purl.org/stephanie.evert)

Details

The \(X^2\) values returned by this function are identical to those computed by chisq.test. Unlike the latter, chisq accepts vector arguments so that a large number of frequency comparisons can be carried out with a single function call.
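The vectorised computation can be sketched in base R. The function below is a hypothetical helper named chisq_sketch, not the source code of the corpora package. It assumes the standard 2x2 form of Pearson's statistic with Yates' correction floored at zero, which is what chisq.test applies to a 2x2 table, and it checks each result against chisq.test. The frequencies are made up for illustration.

```r
# Hypothetical helper (NOT the corpora source): vectorised Pearson X^2
# for 2x2 frequency comparisons, with optional Yates' continuity correction.
chisq_sketch <- function(k1, n1, k2, n2, correct = TRUE) {
  # Convert to double so the products below cannot overflow integer range
  k1 <- as.numeric(k1); n1 <- as.numeric(n1)
  k2 <- as.numeric(k2); n2 <- as.numeric(n2)
  N  <- n1 + n2    # total sample size
  R1 <- k1 + k2    # row sum: occurrences of the type in both corpora
  R2 <- N - R1     # row sum: all other tokens
  # |O11 * O22 - O12 * O21| for the 2x2 contingency table
  term <- abs(k1 * (n2 - k2) - k2 * (n1 - k1))
  if (correct) term <- pmax(term - N / 2, 0)  # Yates' correction, floored at 0
  N * term^2 / (R1 * R2 * n1 * n2)
}

# One call handles several comparisons (scalar sample sizes are recycled)
k1 <- c(99, 10, 250)
k2 <- c(36, 12, 180)
x2 <- chisq_sketch(k1, 1000, k2, 1000)

# Cross-check each value against chisq.test on the matching 2x2 table
ref <- mapply(function(a, b) {
  tbl <- matrix(c(a, 1000 - a, b, 1000 - b), nrow = 2)
  unname(chisq.test(tbl)$statistic)
}, k1, k2)
all.equal(x2, ref)
```

With chisq.test alone, the same comparison needs one call per 2x2 table, which is why the cross-check above has to loop with mapply.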

The one-sided test statistic (for one.sided=TRUE) is the signed square root of \(X^2\). It is positive for \(k_1/n_1 > k_2/n_2\) and negative for \(k_1/n_1 < k_2/n_2\). Note that this statistic has a standard normal distribution rather than a chi-squared distribution under the null hypothesis of equal proportions.
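As a rough illustration of the signed square root, the base R sketch below takes the uncorrected \(X^2\) from chisq.test, attaches the sign of \(k_1/n_1 - k_2/n_2\), and compares the result with the familiar pooled two-proportion z statistic. The variable names are chosen here for illustration. The equality with the pooled z statistic holds only without continuity correction; with Yates' correction the signed root would be somewhat smaller in magnitude.

```r
# Frequencies taken from the Examples section
k1 <- 99; n1 <- 1000
k2 <- 36; n2 <- 1000

# Uncorrected Pearson X^2 from base R
tbl <- matrix(c(k1, n1 - k1, k2, n2 - k2), nrow = 2)
x2  <- unname(chisq.test(tbl, correct = FALSE)$statistic)

# Signed square root: positive when k1/n1 > k2/n2, negative otherwise
z <- sign(k1 / n1 - k2 / n2) * sqrt(x2)

# Pooled two-proportion z statistic, computed directly for comparison
p        <- (k1 + k2) / (n1 + n2)
z_pooled <- (k1 / n1 - k2 / n2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
all.equal(z, z_pooled)

# One-sided p-value for the alternative k1/n1 > k2/n2,
# using the standard normal distribution
p_greater <- pnorm(z, lower.tail = FALSE)
```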

References

Evert, Stefan (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Ph.D. thesis, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Published in 2005, URN urn:nbn:de:bsz:93-opus-23714. Available from http://www.collocations.de/phd.html.

See Also

chisq.pval, chisq.test, cont.table

Examples

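library(corpora)
chisq.test(cont.table(99, 1000, 36, 1000))  # reference test on the 2x2 contingency table
chisq(99, 1000, 36, 1000)                   # same X^2 statistic, computed directly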