SciencesPo (version 1.3.9)

calc.UC: The Uncertainty Coefficient

Description

The uncertainty coefficient U(C|R) measures the proportion of uncertainty (entropy) in the column variable Y that is explained by the row variable X.

Usage

calc.UC(x, y = NULL, direction = c("symmetric", "row", "column"),
  conf.level = NA, p.zero.correction = 1/sum(x)^2, ...)

## S3 method for class 'default'
calc.UC(x, y = NULL, direction = c("symmetric", "row", "column"),
  conf.level = NA, p.zero.correction = 1/sum(x)^2, ...)

Arguments

x
A numeric vector, a factor, matrix or data frame.
y
A vector that is ignored if x is a matrix and required if x is a vector.
direction
The direction of the calculation, either "symmetric" (default), "row", or "column". "row" calculates uncertainty(R|C) (column dependent relationship).
conf.level
The confidence level of the interval. If set to NA (which is the default) no confidence interval will be calculated.
p.zero.correction
A small value used to nudge zero cells so that their logarithm can be calculated. Defaults to 1/sum(x)^2.
...
Further arguments passed to the function table, e.g. to set useNA. These apply only to the vector interface.

Details

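The uncertainty coefficient is computed as $$U(C|R) = \frac{H(X) + H(Y) - H(XY)}{H(Y)}$$ where H(X) and H(Y) are the entropies of the row and column variables and H(XY) is their joint entropy. It takes values in [0, 1].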
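
The formula for U(C|R) can be checked by hand with a few lines of base R. The following is only a minimal sketch: the helpers `entropy` and `uc_column` are hypothetical names and are not part of SciencesPo. It also assumes the convention 0 log 0 = 0 (zero cells are dropped) rather than the `p.zero.correction` nudge, so results may differ slightly from calc.UC on tables that contain zeros.

```r
# Shannon entropy (natural log) of a probability vector.
# Assumption: zero probabilities are dropped, i.e. 0 * log(0) is treated as 0.
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical helper (not a SciencesPo function): U(C|R) from a table of counts.
uc_column <- function(tab) {
  p   <- tab / sum(tab)          # joint cell probabilities
  hx  <- entropy(rowSums(p))     # H(X): entropy of the row variable
  hy  <- entropy(colSums(p))     # H(Y): entropy of the column variable
  hxy <- entropy(as.vector(p))   # H(XY): joint entropy
  (hx + hy - hxy) / hy
}

# Perfect association: the row determines the column.
m <- matrix(c(10, 0, 0, 10), nrow = 2)
uc_column(m)      # 1

# Independence: every cell equals the product of its margins.
indep <- outer(c(1, 3), c(2, 2))
uc_column(indep)  # approximately 0, up to floating-point error
```

The two toy tables mark the ends of the range: when X fully determines Y, all of H(Y) is explained, and when X and Y are independent, H(XY) = H(X) + H(Y) and the numerator vanishes.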

References

Theil, H. (1972), Statistical Decomposition Analysis, Amsterdam: North-Holland Publishing Company.

Examples

if (interactive()) {
# example from Goodman and Kruskal (1954)
m <- as.table(cbind(c(1768, 946, 115), c(807, 1387, 438),
                    c(189, 746, 288), c(47, 53, 16)))
dimnames(m) <- list(paste("A", 1:3), paste("B", 1:4))
print(m)

calc.UC(m)  # default is direction = "symmetric"

calc.UC(m, conf.level = 0.95)  # direction "symmetric"

calc.UC(m, direction = "column")
}