gds: Greedy DAG Search to Estimate Markov Equivalence Class of DAG

Description

Estimate the observational or interventional essential graph representing the Markov equivalence class of a DAG by greedily optimizing a score function in the space of DAGs. In practice, greedy search should always be done in the space of equivalence classes instead of DAGs, giving the functions gies or ges the preference over gds.

Usage

gds(score, labels = score$getNodes(), targets = score$getTargets(),
    fixedGaps = NULL, phase = c("forward", "backward", "turning"), 
    iterate = length(phase) > 1, turning = TRUE, maxDegree = integer(0), 
    verbose = FALSE, ...)

Value

gds returns a list with the following two components:

essgraph: An object of class EssGraph containing an estimate of the equivalence class of the underlying DAG.
repr: An object of a class derived from ParDAG containing a (random) representative of the estimated equivalence class.

Arguments

score

An instance of a class derived from Score.

labels

Node labels; by default, they are determined from the scoring object.

targets

A list of intervention targets (cf. details). A list of vectors, each vector listing the vertices of one intervention target.

fixedGaps

Logical symmetric matrix of dimension p*p. If entry [i, j] is TRUE, the result is guaranteed to have no edge between nodes \(i\) and \(j\).

phase

Character vector listing the phases that should be used; possible values: forward, backward, and turning (cf. details).

iterate

Logical indicating whether the phases listed in the argument phase should be iterated more than once (iterate = TRUE) or not.

turning

Setting turning = TRUE is equivalent to setting phases = c("forward", "backward") and iterate = FALSE; the use of the argument turning is deprecated.

maxDegree

Parameter used to limit the vertex degree of the estimated graph. Valid arguments:

Vector of length 0 (default): vertex degree is not limited.
Real number \(r\), \(0 < r < 1\): degree of vertex \(v\) is limited to \(r \cdot n_v\), where \(n_v\) denotes the number of data points where \(v\) was not intervened.
Single integer: uniform bound of vertex degree for all vertices of the graph.
Integer vector of length p: vector of individual bounds for the vertex degrees.

verbose

if TRUE, detailed output is provided.

...

additional arguments for debugging purposes and fine tuning.

Author

Alain Hauser (alain.hauser@bfh.ch)

Details

This function estimates the observational or interventional Markov equivalence class of a DAG based on a data sample with interventional data originating from various interventions and possibly observational data. The intervention targets used for data generation must be specified by the argument targets as a list of (integer) vectors listing the intervened vertices; observational data is specified by an empty set, i.e. a vector of the form integer(0). As an example, if data contains observational samples as well as samples originating from an intervention at vertices 1 and 4, the intervention targets must be specified as list(integer(0), as.integer(1), as.integer(c(1, 4))).

An interventional Markov equivalence class of DAGs can be uniquely represented by a partially directed graph called interventional essential graph. Its edges have the following interpretation:

a directed edge \(a \longrightarrow b\) stands for an arrow that has the same orientation in all representatives of the interventional Markov equivalence class;
an undirected edge a -- b stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.

Note that when plotting the object, undirected and bidirected edges are equivalent.

Greedy DAG search (GDS) maximizes a score function (typically the BIC, passed to the function via the argument score) of a DAG in three phases, starting from the empty DAG:

Forward phase: In the forward phase, GDS adds single arrows to the DAG as long as this augments the score.
Backward phase: In the backward phase, the algorithm removes arrows from the DAG as long as this augments the score.
Turning phase: In the turning phase, the algorithm reverts arrows of the DAG as long as this augments the score.

The phases that are actually run are specified with the argument phase. GDS cycles through the specified phases until no augmentation of the score is possible any more if iterate = TRUE. In the end, gds returns the (interventional or observational) essential graph of the last visited DAG.

It is well-known that a greedy search in the space of DAGs instead of essential graphs is more prone to be stuck in local optima of the score function and hence expected to yield worse estimation results than GIES (function gies) or GES (function ges) (Chickering, 2002; Hauser and Bühlmann, 2012). The function gds is therefore not of practical use, but can be used to compare causal inference algorithms to an elementary and straight-forward approach.

References

D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507--554

A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409--2464.

Examples

Run this code

## Load predefined data
data(gmInt)

## Define the score (BIC)
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index) 

## Estimate the essential graph
gds.fit <- gds(score) 

## Plot the estimated essential graph and the true DAG
if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(gds.fit$essgraph, main = "Estimated ess. graph")
  plot(gmInt$g, main = "True DAG")
}

Run the code above in your browser using DataLab