fisher.pvalues.support: Computing Discrete P-Values and Their Supports for Fisher's Exact Test

Description

Computes discrete raw p-values and their support for Fisher's exact test applied to 2x2 contingency tables summarizing counts coming from two categorical measurements.

Note: This function is deprecated and will be removed in a future version. Please use generate.pvalues() with test.fun = DiscreteTests::fisher.test.pv and, optionally, preprocess.fun = DiscreteDatasets::reconstruct_two or preprocess.fun = DiscreteDatasets::reconstruct_four instead. Alternatively, use a pipeline like
data |>
DiscreteDatasets::reconstruct_*(<args>) |>
DiscreteTests::fisher.test.pv(<args>)

Usage

fisher.pvalues.support(counts, alternative = "greater", input = "noassoc")

Value

A list of two elements:

raw: raw discrete p-values.
support: a list of the supports of the CDFs of the p-values. Each support is represented by a vector in increasing order.

Arguments

counts: a data frame of two or four columns and any number of rows; each row represents a 2x2 contingency table to test. The number of columns and what they must contain depend on the value of the input argument, see Details.
alternative: same argument as in stats::fisher.test(). The three possible values are "greater" (default), "two.sided" or "less" and you can specify just the initial letter.
input: the format of the input data frame, see Details. The three possible values are "noassoc" (default), "marginal" or "HG2011" and you can specify just the initial letter.

Details

Assume that each contingency table compares two variables and summarizes the counts of association or non-association with a condition. This can be summarized in the following table:

	Association	No association	Total
Variable 1	\(X_1\)	\(Y_1\)	\(N_1\)
Variable 2	\(X_2\)	\(Y_2\)	\(N_2\)
Total	\(X_1 + X_2\)	\(Y_1 + Y_2\)	\(N_1 + N_2\)

If input="noassoc", counts has four columns which respectively contain \(X_1\), \(Y_1\), \(X_2\) and \(Y_2\). If input="marginal", counts has four columns which respectively contain \(X_1\), \(N_1\), \(X_2\) and \(N_2\).
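
The two four-column layouts carry the same information, because \(N_i = X_i + Y_i\) in the table above. The following base R sketch converts between them; the counts and the data frame names (counts_noassoc, counts_marginal, counts_back) are made up for illustration and are not part of the package.

```r
# Illustrative counts (hypothetical values, chosen only for this sketch)
X1 <- c(4, 2, 14)
Y1 <- c(144, 146, 134)
X2 <- c(0, 1, 3)
Y2 <- c(132, 131, 129)

# Column layout described for input = "noassoc": X1, Y1, X2, Y2
counts_noassoc <- data.frame(X1, Y1, X2, Y2)

# Column layout described for input = "marginal": X1, N1, X2, N2,
# where each row total is N = X + Y
counts_marginal <- data.frame(
  X1 = X1,
  N1 = X1 + Y1,
  X2 = X2,
  N2 = X2 + Y2
)

# Going back: Y = N - X recovers the "noassoc" layout
counts_back <- with(
  counts_marginal,
  data.frame(X1, Y1 = N1 - X1, X2, Y2 = N2 - X2)
)
```

Either data frame describes the same set of 2x2 tables, so the choice of input only reflects how the counts happen to be stored.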

If input="HG2011", the setting is that of the amnesia data set in Heller & Gur (2011, see References). Each contingency table is obtained from one variable which is compared to all other variables of the study. That is, the counts for the "second variable" are replaced by the sum of the counts of the other variables:

	Association	No association	Total
Variable \(j\)	\(X_j\)	\(Y_j\)	\(N_j\)
Variables \(\neq j\)	\(\sum_{i \neq j} X_i\)	\(\sum_{i \neq j} Y_i\)	\(\sum_{i \neq j} N_i\)
Total	\(\sum X_i\)	\(\sum Y_i\)	\(\sum N_i\)

Hence counts needs to have only two columns which respectively contain \(X_j\) and \(Y_j\).
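
As a rough illustration of the table above, the second row of each 2x2 table can be derived from the two columns alone by subtracting each variable's counts from the column totals. This is a base R sketch with made-up counts; the names counts_hg, X_rest, Y_rest and counts_four are hypothetical, and whether the function builds exactly this intermediate layout internally is an assumption.

```r
# Illustrative per-variable counts (hypothetical values)
X <- c(4, 2, 14, 6)          # association counts X_j
Y <- c(144, 146, 134, 142)   # no-association counts Y_j

# Two-column layout described for input = "HG2011"
counts_hg <- data.frame(X, Y)

# Second row of each 2x2 table: totals over all other variables
X_rest <- sum(X) - X
Y_rest <- sum(Y) - Y

# Equivalent four-column layout: X_j, Y_j, sum of other X, sum of other Y
counts_four <- data.frame(X1 = X, Y1 = Y, X2 = X_rest, Y2 = Y_rest)
counts_four
```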

The code for the computation of the p-values of Fisher's exact test is inspired by the example in the help page of p.discrete.adjust of package discreteMTP, which is no longer available on CRAN.

See the Wikipedia article about Fisher's exact test, paragraph Example, for a good depiction of what the code does for each possible value of alternative.
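
For a single table, the computation can be sketched in base R with the hypergeometric distribution. This is not the package's code: it is a minimal illustration for alternative = "greater" only, it assumes that the support is the set of all p-values attainable once the margins are fixed, and the variable names are hypothetical.

```r
# One 2x2 table, laid out as in the Details table (illustrative counts)
tab <- matrix(c(4, 144,
                0, 132), nrow = 2, byrow = TRUE)

m <- sum(tab[, 1])   # total "association" count, X1 + X2
n <- sum(tab[, 2])   # total "no association" count, Y1 + Y2
k <- sum(tab[1, ])   # row total N1
x_obs <- tab[1, 1]   # observed X1

# Values X1 can take once all margins are fixed
x_all <- max(0, k - n):min(k, m)

# alternative = "greater": the p-value is P(X1 >= x) under the
# hypergeometric distribution
p_all <- phyper(x_all - 1, m, n, k, lower.tail = FALSE)

p_obs <- p_all[x_all == x_obs]     # raw p-value of the observed table
p_support <- sort(unique(p_all))   # attainable p-values, in increasing order

# Cross-check the raw p-value against stats::fisher.test()
p_ref <- fisher.test(tab, alternative = "greater")$p.value
```

For alternative = "less" the lower tail P(X1 <= x) would be used instead; the two-sided case sums the probabilities of all tables that are no more likely than the observed one.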

References

R. Heller and H. Gur (2011). False discovery rate controlling procedures for discrete tests. arXiv preprint. arXiv:1112.4627v2.

"Fisher's exact test", Wikipedia, The Free Encyclopedia, accessed 2024-12-14, link.

Examples

X1 <- c(4, 2, 2, 14, 6, 9, 4, 0, 1)
X2 <- c(0, 0, 1, 3, 2, 1, 2, 2, 2)
N1 <- rep(148, 9)
N2 <- rep(132, 9)
Y1 <- N1 - X1
Y2 <- N2 - X2
df <- data.frame(X1, Y1, X2, Y2)
df

# Compute the p-values of Fisher's exact test and their supports
df.formatted <- fisher.pvalues.support(counts = df, input = "noassoc")
raw.pvalues <- df.formatted$raw
pCDFlist <- df.formatted$support
