This function computes Pearson's chi-squared statistic (often written as \(X^2\)) for frequency comparison data, with or without Yates' continuity correction. The implementation is based on the formula given by Evert (2004, 82).
chisq(k1, n1, k2, n2, correct = TRUE, one.sided = FALSE)
Returns the chi-squared statistic \(X^2\) for the specified data (or a vector of \(X^2\) values if vector arguments are given). Under the null hypothesis of equal proportions, this statistic has a chi-squared distribution with \(df=1\).
k1: frequency of a type in the first corpus (or an integer vector of type frequencies)
n1: the sample size of the first corpus (or an integer vector specifying the sizes of different samples)
k2: frequency of the type in the second corpus (or an integer vector of type frequencies, in parallel to k1)
n2: the sample size of the second corpus (or an integer vector specifying the sizes of different samples, in parallel to n1)
correct: if TRUE, apply Yates' continuity correction (default)
one.sided: if TRUE, compute the signed square root of \(X^2\) as a statistic for a one-sided test (see details below; the default value is FALSE)
Stephanie Evert (https://purl.org/stephanie.evert)
The \(X^2\) values returned by this function are identical to those computed by chisq.test. Unlike the latter, chisq accepts vector arguments so that a large number of frequency comparisons can be carried out with a single function call.
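To make the computation concrete, write the data as a 2x2 contingency table with \(O_{11} = k_1\), \(O_{12} = k_2\), \(O_{21} = n_1 - k_1\), \(O_{22} = n_2 - k_2\), total \(N = n_1 + n_2\) and row sums \(R_1 = k_1 + k_2\), \(R_2 = N - R_1\). The textbook form of the Yates-corrected statistic, which is assumed here to correspond to the cited formula, is

\[ X^2 = \frac{N \left( |O_{11} O_{22} - O_{12} O_{21}| - N/2 \right)^2}{R_1 R_2 \, n_1 n_2} \]

The following base-R sketch implements this form with vector arguments. It is not the package source: the helper name chisq_sketch is hypothetical, and clamping the corrected difference at zero is an assumption made to match the behaviour of chisq.test.

```r
# Hypothetical helper (not the corpora source): Yates-corrected X^2 for
# comparing k1 out of n1 against k2 out of n2; all arguments may be vectors.
chisq_sketch <- function(k1, n1, k2, n2, correct = TRUE) {
  # cells of the 2x2 contingency table (columns = samples), as doubles
  O11 <- as.numeric(k1); O12 <- as.numeric(k2)
  O21 <- n1 - O11;       O22 <- n2 - O12
  N  <- O11 + O12 + O21 + O22   # total sample size, n1 + n2
  R1 <- O11 + O12               # row sums
  R2 <- O21 + O22
  D  <- abs(O11 * O22 - O12 * O21)
  if (correct) D <- pmax(D - N / 2, 0)  # Yates' correction, clamped at zero (assumption)
  N * D^2 / (R1 * R2 * n1 * n2)
}

# two frequency comparisons in a single vectorised call
x2 <- chisq_sketch(c(99, 50), c(1000, 1000), c(36, 45), c(1000, 1000))

# reference value from base R for the first comparison
ref <- unname(chisq.test(matrix(c(99, 901, 36, 964), nrow = 2))$statistic)
```

Comparing x2[1] with ref (for instance via all.equal) is a quick way to check the sketch against chisq.test for a single comparison.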
The one-sided test statistic (for one.sided=TRUE) is the signed square root of \(X^2\). It is positive for \(k_1/n_1 > k_2/n_2\) and negative for \(k_1/n_1 < k_2/n_2\). Note that this statistic has a standard normal distribution rather than a chi-squared distribution under the null hypothesis of equal proportions.
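The relationship between the one-sided and two-sided statistics can be illustrated in base R. This is a sketch under stated assumptions rather than the package's own code: the \(X^2\) value is taken from chisq.test, and the variable names z, p_one and p_two are chosen here for illustration.

```r
# example counts: 99 out of 1000 vs. 36 out of 1000
k1 <- 99; n1 <- 1000; k2 <- 36; n2 <- 1000

# X^2 with Yates' correction, taken from base R's chisq.test()
tbl <- matrix(c(k1, n1 - k1, k2, n2 - k2), nrow = 2)
x2 <- unname(chisq.test(tbl)$statistic)

# signed square root: positive if k1/n1 > k2/n2, negative if k1/n1 < k2/n2
z <- sign(k1 / n1 - k2 / n2) * sqrt(x2)

# one-sided p-value for the alternative k1/n1 > k2/n2 (standard normal)
p_one <- pnorm(z, lower.tail = FALSE)

# two-sided p-value from the chi-squared distribution with df = 1
p_two <- pchisq(x2, df = 1, lower.tail = FALSE)
```

When z is positive, as in this example, p_one is half of p_two, because the square of a standard normal variable follows a chi-squared distribution with one degree of freedom.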
Evert, Stefan (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Ph.D. thesis, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Published in 2005, URN urn:nbn:de:bsz:93-opus-23714. Available from http://www.collocations.de/phd.html.
See also: chisq.pval, chisq.test, cont.table
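library(corpora)  # provides chisq() and cont.table()
chisq.test(cont.table(99, 1000, 36, 1000))  # base R test on the 2x2 contingency table
chisq(99, 1000, 36, 1000)  # same X^2, computed directly from the counts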