cramerVFit: Cramer's V for chi-square goodness-of-fit tests

Description

Calculates Cramer's V for a vector of counts and expected counts; confidence intervals by bootstrap.

Usage

cramerVFit(
  x,
  p = rep(1/length(x), length(x)),
  ci = FALSE,
  conf = 0.95,
  type = "perc",
  R = 1000,
  histogram = FALSE,
  digits = 4,
  reportIncomplete = FALSE,
  verbose = FALSE,
  ...
)

Arguments

A vector of observed counts.

A vector of expected or default probabilities.

If TRUE, returns confidence intervals by bootstrap. May be slow.

conf

The level for the confidence interval.

type

The type of confidence interval to use. Can be any of "norm", "basic", "perc", or "bca". Passed to boot.ci.

The number of replications to use for bootstrap.

histogram

If TRUE, produces a histogram of bootstrapped values.

digits

The number of significant digits in the output.

reportIncomplete

If FALSE (the default), NA will be reported in cases where there are instances of the calculation of the statistic failing during the bootstrap procedure.

verbose

If TRUE, prints additional statistics.

...

Additional arguments passed to chisq.test.

Value

A single statistic, Cramer's V. Or a small data frame consisting of Cramer's V, and the lower and upper confidence limits.

Details

This modification of Cramer's V could be used to indicate an effect size in cases where a chi-square goodness-of-fit test might be used. It indicates the degree of deviation of observed counts from the expected probabilities.

In the case of equally-distributed expected frequencies, Cramer's V will be equal to 1 when all counts are in one category, and it will be equal to 0 when the counts are equally distributed across categories. This does not hold if the expected frequencies are not equally-distributed.

Because V is always positive, if type="perc", the confidence interval will never cross zero, and should not be used for statistical inference. However, if type="norm", the confidence interval may cross zero.

When V is close to 0 or 1, or with small counts, the confidence intervals determined by this method may not be reliable, or the procedure may fail.

In addition, the function will not return a confidence interval if there are zeros in any cell.

References

http://rcompanion.org/handbook/H_03.html

Examples

Run this code

# NOT RUN {
### Equal probabilities example
### From http://rcompanion.org/handbook/H_03.html
nail.color = c("Red", "None", "White", "Green", "Purple", "Blue")
observed   = c( 19,    3,      1,       1,       2,        2    )
expected   = c( 1/6,   1/6,    1/6,     1/6,     1/6,      1/6  )
chisq.test(x = observed, p = expected)
cramerVFit(x = observed, p = expected)

### Unequal probabilities example
### From http://rcompanion.org/handbook/H_03.html
race = c("White", "Black", "American Indian", "Asian", "Pacific Islander",
          "Two or more races")
observed = c(20, 9, 9, 1, 1, 1)
expected = c(0.775, 0.132, 0.012, 0.054, 0.002, 0.025)
chisq.test(x = observed, p = expected)
cramerVFit(x = observed, p = expected)

### Examples of perfect and zero fits
cramerVFit(c(100, 0, 0, 0, 0))
cramerVFit(c(10, 10, 10, 10, 10))

# }

Run the code above in your browser using DataLab