
pcalg (version 2.7-12)

binCItest: G square Test for (Conditional) Independence of Binary Variables

Description

\(G^2\) test for (conditional) independence of binary variables \(X\) and \(Y\) given the (possibly empty) set of binary variables \(S\).

binCItest() is a wrapper around gSquareBin() with the argument signature expected by skeleton, pc and fci.

Usage

gSquareBin(x, y, S, dm, adaptDF = FALSE, n.min = 10*df, verbose = FALSE)
binCItest (x, y, S, suffStat)

Value

The p-value of the test.

Arguments

x,y

(integer) position of variable \(X\) and \(Y\), respectively, in the adjacency matrix.

S

(integer) positions of zero or more conditioning variables in the adjacency matrix.

dm

data matrix (with \(\{0,1\}\) entries).

adaptDF

logical specifying if the degrees of freedom should be lowered by one for each zero count. The value for the degrees of freedom cannot go below 1.

n.min

the smallest \(n\) (number of observations, nrow(dm)) for which the \(G^2\) test is computed; for smaller \(n\), independence is assumed (\(G^2 := 1\)) with a warning. The default is \(10 m\), where \(m = 2^{|S|}\) is the degrees of freedom assuming no structural zeros (df in the usage line).

verbose

logical or integer indicating that increased diagnostic output is to be provided.

suffStat

a list with two elements, "dm" and "adaptDF", corresponding to the arguments of the same names of gSquareBin().
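
The default for n.min grows exponentially with the size of the conditioning set, so larger sets need many more observations before the test is computed at all. The following is a minimal sketch of that rule only; the helper name default_n_min is hypothetical and not part of pcalg.

```r
## Hypothetical helper (not part of pcalg): restates the documented
## default n.min = 10 * df, with df = 2^|S|.
default_n_min <- function(S) {
  df <- 2^length(S)   # degrees of freedom assuming no structural zeros
  10 * df
}

default_n_min(NULL)        # empty conditioning set:        10
default_n_min(2)           # one conditioning variable:     20
default_n_min(c(2, 4, 5))  # three conditioning variables:  80
```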

Author

Nicoletta Andri and Markus Kalisch (kalisch@stat.math.ethz.ch)

Details

The \(G^2\) statistic is used to test for (conditional) independence of X and Y given a set S (which may be NULL). This function is a specialized version of gSquareDis, which handles discrete variables with more than two levels.
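
To make the statistic concrete, here is a minimal base-R sketch of a \(G^2\) test for binary data. It is not the pcalg implementation: the function name g2_binary_sketch is hypothetical, it assumes the usual definition \(G^2 = 2 \sum O \log(O/E)\) summed over the strata of S, it always uses \(2^{|S|}\) degrees of freedom (no adaptDF adjustment), and it has no n.min check.

```r
## Hypothetical helper (not pcalg code): G^2 test of X independent of Y
## given S, for a {0,1} data matrix 'dm', using base R only.
g2_binary_sketch <- function(x, y, S, dm) {
  ## One stratum per observed configuration of the conditioning variables
  strata <- if (length(S) == 0) {
    rep("all", nrow(dm))
  } else {
    apply(dm[, S, drop = FALSE], 1, paste, collapse = "")
  }
  G2 <- 0
  for (s in unique(strata)) {
    idx <- strata == s
    ## 2 x 2 matrix of observed counts within this stratum
    obs <- unclass(table(factor(dm[idx, x], levels = 0:1),
                         factor(dm[idx, y], levels = 0:1)))
    ## Expected counts under independence within the stratum
    expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
    pos <- obs > 0                  # cells with a zero count contribute 0
    G2 <- G2 + 2 * sum(obs[pos] * log(obs[pos] / expd[pos]))
  }
  df <- 2^length(S)                 # assumption: no zero-count adjustment
  pchisq(G2, df = df, lower.tail = FALSE)   # p-value
}

## Small chain x1 -> x2 -> x3, simulated for illustration only
set.seed(1)
n  <- 2000
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, plogis(x1))
x3 <- rbinom(n, 1, plogis(x2))
dm <- cbind(x1, x2, x3)

p_marginal    <- g2_binary_sketch(1, 3, NULL, dm)  # x1 vs x3
p_conditional <- g2_binary_sketch(1, 3, 2,    dm)  # x1 vs x3 given x2
```

With adaptDF = FALSE, gSquareBin() would be expected to give similar p-values on the same data, but that correspondence is an assumption of this sketch rather than something verified here.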

References

R.E. Neapolitan (2004). Learning Bayesian Networks. Prentice Hall Series in Artificial Intelligence. Chapter 10.3.1

See Also

gSquareDis for a (conditional) independence test for discrete variables with more than two levels.

dsepTest, gaussCItest and disCItest for similar functions for a d-separation oracle, a conditional independence test for Gaussian variables and a conditional independence test for discrete variables, respectively.

skeleton, pc or fci which need a testing function such as binCItest.
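
As a sketch of how binCItest is typically handed to one of these functions, the snippet below passes it to pc() together with a matching suffStat list. The simulated data, the seed and the choice alpha = 0.01 are illustrative assumptions; the call is guarded so that it does nothing when pcalg is not installed.

```r
## Guarded: skipped when pcalg is not installed.
have_pcalg <- requireNamespace("pcalg", quietly = TRUE)
if (have_pcalg) {
  set.seed(42)
  n  <- 5000
  x1 <- rbinom(n, 1, 0.5)
  x2 <- rbinom(n, 1, plogis(x1))
  x3 <- rbinom(n, 1, plogis(x2))
  dat <- cbind(x1, x2, x3)

  ## suffStat carries the data matrix and the adaptDF flag to the test
  suffStat <- list(dm = dat, adaptDF = FALSE)
  fit <- pcalg::pc(suffStat, indepTest = pcalg::binCItest,
                   alpha = 0.01, labels = colnames(dat))
  print(fit)
}
```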

Examples

n <- 100
set.seed(123)
## Simulate *independent* data of {0,1}-variables:
x <- rbinom(n, 1, prob = 1/2)
y <- rbinom(n, 1, prob = 1/2)
z <- rbinom(n, 1, prob = 1/2)
dat <- cbind(x,y,z)

binCItest(1,3,2, list(dm = dat, adaptDF = FALSE)) # 0.36, not signif.
binCItest(1,3,2, list(dm = dat, adaptDF = TRUE )) # the same, here

## Simulate data from a chain of 3 variables: x1 -> x2 -> x3
set.seed(12)
b0 <- 0
b1 <- 1
b2 <- 1
n <- 10000
x1 <- rbinom(n, size=1, prob=1/2) ## = sample(c(0,1), n, replace=TRUE)

## NB:  plogis(u) := "expit(u)" := exp(u) / (1 + exp(u))
p2 <- plogis(b0 + b1*x1) ; x2 <- rbinom(n, 1, prob = p2) # {0,1}
p3 <- plogis(b0 + b2*x2) ; x3 <- rbinom(n, 1, prob = p3) # {0,1}

ftable(xtabs(~ x1+x2+x3))
dat <- cbind(x1,x2,x3)

## Test marginal and conditional independencies
gSquareBin(3,1,NULL,dat, verbose=TRUE)
gSquareBin(3,1, 2,  dat)
gSquareBin(1,3, 2,  dat) # the same
gSquareBin(1,3, 2,  dat, adaptDF=TRUE, verbose = 2)

stopifnot(all.equal(gSquareBin(3,1, 2, dat),
                    gSquareBin(1,3, 2, dat)),
          all.equal(gSquareBin(3,1, 2, dat, adaptDF=TRUE),
                    gSquareBin(1,3, 2, dat, adaptDF=TRUE)))
