optAdjSet: Compute the optimal adjustment set

Description

optAdjSet computes the optimal valid adjustment set relative to the variables (X,Y) in the given graph.

Usage

optAdjSet(graphEst,x.pos,y.pos)

Value

A vector with the positions of the nodes of the optimal set O(X,Y,G).

Arguments

graphEst: graphNEL object or adjacency matrix of type amat.cpdag.
x.pos, x: Positions of variables X in the covariance matrix.
y.pos, y: Positions of variables Y in the covariance matrix.

Author

Leonard Henckel

Details

Suppose we have data from a linear SEM compatible with a known causal graph G and our aim is to estimate the total joint effect of X on Y. Here the total joint effect of \(X = (X_1,X_2)\) on Y is defined via Pearl's do-calculus as the vector \((E[Y|do(X_1=x_1+1,X_2=x_2)]-E[Y|do(X_1=x_1,X_2=x_2)], E[Y|do(X_1=x_1,X_2=x_2+1)]-E[Y|do(X_1=x_1,X_2=x_2)])\), with a similar definition for more than two variables. These values are equal to the partial derivatives (evaluated at \(x_1,x_2\)) of \(E[Y|do(X_1=x_1',X_2=x_2')]\) with respect to \(x_1'\) and \(x_2'\). Moreover, under the linearity assumption, these partial derivatives do not depend on the values at which they are evaluated.
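
As a small worked illustration of the last point, for a single outcome Y: in a linear SEM the interventional mean is an affine function of the intervention values. The intercept \(c\) and the coefficients \(\theta_1, \theta_2\) below are notation introduced here for illustration only; they do not appear in the package.

\[
E[Y|do(X_1=x_1,X_2=x_2)] = c + \theta_1 x_1 + \theta_2 x_2
\]

Under this form, both the unit differences in the definition above and the partial derivatives equal \((\theta_1, \theta_2)\), whatever the values of \(x_1\) and \(x_2\).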

It is possible to estimate the total joint effect of X on Y with a linear regression of the form lm(Y ~ X + Z) if and only if the covariate set Z is a valid adjustment set (see Perković et al. (2018)). Often, however, there are multiple valid adjustment sets, and they provide total effect estimates of varying accuracy. If there exists a valid adjustment set relative to (X,Y) in the causal graph G and each node in Y is a descendant of X, then there exists a valid adjustment set that provides the total effect estimate with the optimal asymptotic variance; we refer to it as O(X,Y,G) (Henckel et al., 2019). This function returns this optimal valid adjustment set O(X,Y,G).

The restriction that each node in Y be a descendant of the node set X is not a real limitation, as the total effect of the node set X on a non-descendant is always 0. If provided with a node set Y that does not fulfill this condition, this function computes a pruned node set Y2 by removing all nodes from Y that are not descendants of X and returns O(X,Y2,G) instead. The user is alerted to this and given the pruned set Y2.
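
The claim above that different valid adjustment sets yield estimates of different accuracy can be checked with a small simulation. The sketch below uses base R only and a hypothetical toy SEM (Z2 -> X -> Y <- Z1) that is not part of the package; the variable names, coefficients, sample size and number of replicates are all assumptions chosen for illustration. In this toy graph the empty set, {Z1} and {Z2} are all valid adjustment sets, and {Z1} is the parent of Y that is not X.

```r
## Toy linear SEM (hypothetical, for illustration only):
##   Z2 -> X -> Y <- Z1,  all coefficients 1, standard normal errors.
## The true total effect of X on Y is 1.
set.seed(1)
n    <- 200   # sample size per data set
reps <- 500   # number of simulated data sets

est <- replicate(reps, {
  Z1 <- rnorm(n)
  Z2 <- rnorm(n)
  X  <- Z2 + rnorm(n)
  Y  <- X + Z1 + rnorm(n)
  ## coefficient of X under three valid adjustment sets
  c(empty = coef(lm(Y ~ X))[["X"]],
    Z1    = coef(lm(Y ~ X + Z1))[["X"]],
    Z2    = coef(lm(Y ~ X + Z2))[["X"]])
})

## all three estimators centre on the true effect ...
est_mean <- rowMeans(est)
## ... but their sampling variances differ
est_var <- apply(est, 1, var)
print(round(est_mean, 3))
print(signif(est_var, 3))
```

In this setup, adjusting for Z1 should give the smallest empirical variance and adjusting for Z2 the largest, with the unadjusted regression in between.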

References

E. Perković, J. Textor, M. Kalisch and M.H. Maathuis (2018). Complete graphical characterization and construction of adjustment sets in Markov equivalence classes of ancestral graphs. Journal of Machine Learning Research, 18(220), 1-62.

L. Henckel, E. Perković and M.H. Maathuis (2019). Graphical criteria for efficient total effect estimation via adjustment in causal linear models. Working Paper.

Examples

## Simulate a true DAG, its CPDAG and an intermediate max. PDAG
library(pcalg)
suppressWarnings(RNGversion("3.5.0"))
set.seed(123)
p <- 10
## true DAG
myDAG <- randomDAG(p, prob = 0.3)
## true CPDAG
myCPDAG <- dag2cpdag(myDAG)
## true PDAG with added background knowledge 5 -> 6
myPDAG <- addBgKnowledge(myCPDAG, 5, 6)
## plot the three graphs side by side (needs Rgraphviz)
if (require(Rgraphviz)) {
  par(mfrow = c(1, 3))
  plot(myDAG)
  plot(myPDAG)
  plot(myCPDAG)
}

## if the CPDAG C is amenable relative to (X,Y),
## the optimal set will be the same for all DAGs 
## and any max. PDAGs obtained by adding background knowledge to C 
(optAdjSet(myDAG,3,10))
(optAdjSet(myPDAG,3,10))
(optAdjSet(myCPDAG,3,10))


## the optimal adjustment set can also be computed for sets X and Y
(optAdjSet(myDAG,c(3,4),c(9,10)))
(optAdjSet(myPDAG,c(3,4),c(9,10)))
(optAdjSet(myCPDAG,c(3,4),c(9,10)))

## The only restriction is that it requires all nodes in Y to be
## descendants of X.
## However, if a node in Y is a non-descendant of X, the lowest-variance
## partial total effect estimate is simply 0.
## Hence, we can proceed with a pruned Y. This function does this automatically!
optAdjSet(myDAG,1,c(3,9))

## Note that for sets X there may be no valid adjustment set even
## if the PDAG is amenable relative to (X,Y).
if (FALSE) optAdjSet(myPDAG,c(4,9),7)
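
The Details section describes using the returned set in a regression of the form lm(Y ~ X + Z). A minimal sketch of that step is given below. It assumes the pcalg package is installed, reuses the DAG simulated above, and draws data with pcalg's rmvDAG; the sample size, the column names V1, ..., V10 and the object names dat, O, fml and fit are illustrative choices, not part of the documented interface.

```r
## Hedged sketch: regress Y on X and the optimal adjustment set.
## Everything except the pcalg functions is an illustrative assumption.
if (requireNamespace("pcalg", quietly = TRUE)) {
  library(pcalg)
  suppressWarnings(RNGversion("3.5.0"))
  set.seed(123)
  myDAG <- randomDAG(10, prob = 0.3)

  ## simulate data from the linear SEM defined by the weighted DAG
  dat <- as.data.frame(rmvDAG(1000, myDAG))
  colnames(dat) <- paste0("V", seq_len(ncol(dat)))

  ## positions of the optimal adjustment set for X = node 3, Y = node 10
  O <- optAdjSet(myDAG, 3, 10)

  ## build V10 ~ V3 + <columns in O>; indexing by position also
  ## handles an empty set, which yields V10 ~ V3
  fml <- reformulate(c("V3", colnames(dat)[O]), response = "V10")
  fit <- lm(fml, data = dat)

  ## the coefficient of V3 estimates the total effect of node 3 on node 10
  print(coef(fit)[["V3"]])
}
```

Indexing the column names by position, rather than pasting a prefix onto O, avoids a malformed formula when the optimal set is empty.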