ContingencyTests: Independence in Three-Way Contingency Tables

Description

Tests of the independence of two possibly ordered factors, optionally stratified by a third factor.

Usage

## S3 method for class 'formula':
cmh_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table':
cmh_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'IndependenceProblem':
cmh_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'formula':
chisq_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table':
chisq_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'IndependenceProblem':
chisq_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'formula':
lbl_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table':
lbl_test(object, distribution = c("asymptotic", "approximate"), ...)
## S3 method for class 'IndependenceProblem':
lbl_test(object, distribution = c("asymptotic", "approximate"), ...)

Arguments

formula

a formula of the form y ~ x | block where y and x are factors (possibly ordered) and block is an optional factor for stratification.

data

an optional data frame containing the variables in the model formula.

subset

an optional vector specifying a subset of observations to be used.

weights

an optional formula of the form ~ w defining integer valued weights for the observations.

object

an object inheriting from class "IndependenceProblem" or an object of class table.

distribution

a character string: the null distribution of the test statistic can be approximated by its asymptotic distribution ("asymptotic") or via Monte-Carlo resampling ("approximate"). Alternatively, a function can be supplied, as in distribution = approximate(B = 9999) in the Examples.

...

further arguments to be passed to or from methods.
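The two main ways of supplying data can be sketched as follows. This is a minimal illustration, not taken from the package: the data frame toy and its columns y, x, block and w are hypothetical, and the sketch assumes that coin is installed and that the third dimension of a table is used as the stratum, as the jobsatisfaction examples below suggest.

```r
library(coin)

# Hypothetical toy data: one row per cell of a 2 x 2 x 2 table,
# with the cell count stored in the integer column w.
toy <- data.frame(
  y     = factor(c("a", "b", "a", "b", "a", "b", "a", "b")),
  x     = factor(c("u", "u", "v", "v", "u", "u", "v", "v")),
  block = factor(rep(c("s1", "s2"), each = 4)),
  w     = c(12L, 5L, 7L, 9L, 10L, 6L, 4L, 11L)
)

# Formula interface: block after "|" stratifies, weights given as ~ w.
res_formula <- cmh_test(y ~ x | block, data = toy, weights = ~ w)

# Table interface: the same counts cross-classified into a three-way table.
tab <- xtabs(w ~ y + x + block, data = toy)
res_table <- cmh_test(tab)

res_formula
res_table
```

Which interface to use is largely a matter of how the data are stored: raw or weighted observations fit the formula method, already tabulated counts fit the table method.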

Value

An object inheriting from class IndependenceTest, with methods show, statistic, expectation, covariance and pvalue. The null distribution can be inspected with the pperm, dperm, qperm and support methods.
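As a brief sketch of how the returned object might be queried, the accessor methods listed above can be applied to a fitted test. The object name res is illustrative only, and the sketch assumes coin is installed so that the jobsatisfaction data are available.

```r
library(coin)

# Pearson's chi-squared test for the female stratum of jobsatisfaction.
res <- chisq_test(as.table(jobsatisfaction[, , "Female"]))

statistic(res)    # observed test statistic
pvalue(res)       # p-value under the (here asymptotic) null distribution
expectation(res)  # expectation of the linear statistic under the null
covariance(res)   # covariance of the linear statistic under the null

# Null distribution function evaluated at the observed statistic.
pperm(res, q = statistic(res))
```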

Details

The null hypothesis of the independence of y and x is tested; block defines an optional factor for stratification. chisq_test implements Pearson's chi-squared test, cmh_test the Cochran-Mantel-Haenszel test, and lbl_test the linear-by-linear association test for ordered data.

If either x or y is an ordered factor, all of the procedures perform the corresponding linear-by-linear association test. lbl_test always coerces factors to class ordered. The default scores are 1:nlevels(x) and 1:nlevels(y), respectively. They can be changed via the scores argument (see independence_test); for example, scores = list(y = 1:3, x = c(1, 4, 6)) first coerces both variables to class ordered and then attaches the list elements as scores to the corresponding factors. The length of a score vector must equal the number of levels of the corresponding factor.
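To illustrate what scores do, the following base R sketch computes the textbook linear-by-linear statistic M^2 = (n - 1) r^2, where r is the correlation between the row and column scores over all observations. It is an illustration under stated assumptions, not the package's implementation: the counts are copied from the smoking and HDL example in the Examples section, the names u, v, M2 and p are hypothetical, and the form in which lbl_test reports its statistic may differ.

```r
# Counts from the smoking / HDL example (rows: smoking, columns: HDL).
counts <- matrix(c(15,  8, 11,  5,
                    3,  4,  6,  1,
                    6,  7, 15, 11,
                    1,  2,  3,  5), ncol = 4)

u <- c(0, 2.5, 7.5, 15)  # interval mid-points as scores for smoking
v <- 1:4                 # default scores 1:nlevels for HDL

# A score vector must have one entry per level of its factor.
stopifnot(length(u) == nrow(counts), length(v) == ncol(counts))

# Expand the table into one pair of scores per observation.
x <- rep(u[row(counts)], counts)
y <- rep(v[col(counts)], counts)

n  <- sum(counts)
r  <- cor(x, y)
M2 <- (n - 1) * r^2                          # linear-by-linear statistic
p  <- pchisq(M2, df = 1, lower.tail = FALSE) # asymptotic chi-squared, 1 df

c(statistic = M2, p.value = p)
```

Changing u or v changes r and hence the statistic, which is why the choice of scores matters when the spacing between categories is not uniform.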

The authoritative source for details on the documented test procedures is Agresti (2002).

References

Alan Agresti (2002), Categorical Data Analysis. Hoboken, New Jersey: John Wiley & Sons.

Examples


library(coin)  # provides the test functions and the jobsatisfaction data
set.seed(290875)

  ### for females only
  chisq_test(as.table(jobsatisfaction[,,"Female"]), 
      distribution = approximate(B = 9999))

  ### both Income and Job.Satisfaction unordered
  cmh_test(jobsatisfaction)

  ### both Income and Job.Satisfaction ordered, default scores
  lbl_test(jobsatisfaction)

  ### both Income and Job.Satisfaction ordered, alternative scores
  lbl_test(jobsatisfaction, scores = list(Job.Satisfaction = c(1, 3, 4, 5),
                                          Income = c(3, 10, 20, 35)))

  ### the same, null distribution approximated
  cmh_test(jobsatisfaction, scores = list(Job.Satisfaction = c(1, 3, 4, 5),
                                        Income = c(3, 10, 20, 35)),
           distribution = approximate(B = 10000))

  ### Smoking and HDL cholesterol status
  ### (from Jeong, Jhun and Kim, 2005, CSDA 48, 623-631, Table 2)
  smokingHDL <- as.table(
      matrix(c(15,  8, 11,  5, 
                3,  4,  6,  1, 
                6,  7, 15, 11, 
                1,  2,  3,  5), ncol = 4,
             dimnames = list(smoking = c("none", "< 5", "< 10", ">=10"), 
                             HDL = c("normal", "low", "borderline", "abnormal"))
  ))
  ### use interval mid-points as scores for smoking
  lbl_test(smokingHDL, scores = list(smoking = c(0, 2.5, 7.5, 15)))

  ### Cochran-Armitage trend test for proportions
  ### Lung tumors in female mice exposed to 1,2-dichloroethane
  ### Encyclopedia of Biostatistics (Armitage & Colton, 1998), 
  ### Chapter Trend Test for Counts and Proportions, page 4578, Table 2
  lungtumor <- data.frame(dose = rep(c(0, 1, 2), c(40, 50, 48)),
                          tumor = c(rep(c(0, 1), c(38, 2)),
                                    rep(c(0, 1), c(43, 7)),
                                    rep(c(0, 1), c(33, 15))))
  table(lungtumor$dose, lungtumor$tumor)

  ### Cochran-Armitage test (permutation equivalent to correlation 
  ### between dose and tumor), cf. Table 2 for results
  independence_test(tumor ~ dose, data = lungtumor, teststat = "quad")

  ### linear-by-linear association test with scores 0, 1, 2
  ### is identical with Cochran-Armitage test
  lungtumor$dose <- ordered(lungtumor$dose)
  independence_test(tumor ~ dose, data = lungtumor, teststat = "quad",
                    scores = list(dose = c(0, 1, 2)))
