Learn R Programming

pcalg (version 2.7-12)

ages: Estimate an APDAG within the Markov equivalence class of a DAG using AGES

Description

Estimate an APDAG (a particular PDAG) using the aggregative greedy equivalence search (AGES) algorithm, which uses the solution path of the greedy equivalence search (GES) algorithm of Chickering (2002).

Usage

ages(data, lambda_min = 0.5 * log(nrow(data)), labels = NULL,
     fixedGaps = NULL, adaptive = c("none", "vstructures", "triples"),
     maxDegree = integer(0), verbose = FALSE, ...)

Value

ages returns a list with the following four components:

essgraph

An object of class EssGraph containing an estimate of the equivalence class of the underlying DAG.

repr

An object of a class derived from ParDAG containing a (random) representative of the estimated equivalence class.

CPDAGsList

A list of p*p matrices containing all CPDAGs considered by AGES in the aggregation processes

lambda

A vector containing the penalty parameter used to obtain the list of CPDAGs mentioned above. GES returns the list of CPDAGs when used with this vector of penalty parameters if used with phases = c("forward", "backward") and iterate = FALSE.

Arguments

data

A \(n*p\) matrix (or data frame) containing the observational data.

lambda_min

The smallest penalty parameter value used when computing the solution path of GES.

labels

Node labels; if NULL the names of the columns of the data matrix (or the names in the data frame) are used. If these are not specified the sequence 1 to p is used.

fixedGaps

logical symmetric matrix of dimension p*p. If entry [i, j] is TRUE, the result is guaranteed to have no edge between nodes \(i\) and \(j\).

adaptive

indicating whether constraints should be adapted to newly detected v-structures or unshielded triples (cf. details).

maxDegree

Parameter used to limit the vertex degree of the estimated graph. Valid arguments:

  1. Vector of length 0 (default): vertex degree is not limited.

  2. Real number \(r\), \(0 < r < 1\): degree of vertex \(v\) is limited to \(r \cdot n_v\), where \(n_v\) denotes the number of data points where \(v\) was not intervened.

  3. Single integer: uniform bound of vertex degree for all vertices of the graph.

  4. Integer vector of length p: vector of individual bounds for the vertex degrees.

verbose

If TRUE, detailed output is provided.

...

Additional arguments for debugging purposes and fine tuning.

Author

Marco Eigenmann (eigenmann@stat.math.ethz.ch)

Details

This function tries to add orientations to the essential graph (CPDAG) found by ges (ran with lambda=lambda_min). It does it aggregating several CPDAGs present in the solution path of GES. Conceptually, AGES starts with the essential graph found by GES ran with lambda = lambda_min. Then, it checks for further (compatible) orientation information in other essential graphs present in the solution path of GES, i.e., in essential graphs outputted by GES for larger penalty parameters. With compatible we mean that the aggregation process is done such that the final APDAG is still within the Markov equivalence graph represented by the essential graph found by GES in the following sense: an APDAG can always be extended to a DAG without creating new v-structures. This DAG lies in the Markov equivalence class represented by the essential graph found by GES. The algorithm is explained in detail in Eigenmann, Nandy, and Maathuis (2017).

The arguments fixedgaps and adaptive work also with AGES. However, they have not been studied in Eigenmann, Nandy, and Maathuis (2017). Using the argument fixedGaps, one can make sure that certain edges will not be present in the resulting essential graph: if the entry [i, j] of the matrix passed to fixedGaps is TRUE, there will be no edge between nodes \(i\) and \(j\). The argument adaptive can be used to relax the constraints encoded by fixedGaps according to a modification of GES called ARGES (adaptively restricted greedy equivalence search) which has been presented in Nandy, Hauser and Maathuis (2018):

  • When adaptive = "vstructures" and the algorithm introduces a new v-structure \(a \longrightarrow b \longleftarrow c\) in the forward phase, then the edge \(a - c\) is removed from the list of fixed gaps, meaning that the insertion of an edge between \(a\) and \(c\) becomes possible even if it was forbidden by the initial matrix passed to fixedGaps.

  • When adaptive = "triples" and the algorithm introduces a new unshielded triple in the forward phase (i.e., a subgraph of three nodes \(a\), \(b\) and \(c\) where \(a\) and \(b\) as well as \(b\) and \(c\) are adjacent, but \(a\) and \(c\) are not), then the edge \(a - c\) is removed from the list of fixed gaps.

With one of the adaptive modifications, the successive application of a skeleton estimation method and GES restricted to an estimated skeleton still gives a consistent estimator of the DAG, which is not the case without the adaptive modification.

For a detailed explanation of the GES function as well as its related object like essential graphs, we refer to the ges function.

Differences in the arguments with respect to GES: AGES uses data to initialize several scores taken as argument by GES. AGES modifies the forward and backward phases of GES performing single steps in either directions. For this reason, phase, iterate, and turning are not available arguments.

References

D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507--554

M.F. Eigenmann, P. Nandy, and M.H. Maathuis (2017). Structure learning of linear Gaussian structural equation models with weak edges. In Proceedings of UAI 2017

P. Nandy, A. Hauser and M.H. Maathuis (2018). High-dimensional consistency in score-based and hybrid structure learning. Annals of Statistics, to appear.

See Also

ges, EssGraph

Examples

Run this code

## Example 1: ages adds correct orientations: Bar --> V6 and Bar --> V8

set.seed(77)

p <- 8
n <- 5000
## true DAG:
vars <- c("Author", "Bar", "Ctrl", "Goal", paste0("V",5:8))
gGtrue <- randomDAG(p, prob = 0.3, V = vars)
data = rmvDAG(n, gGtrue)


## Estimate the aggregated PDAG with ages
ages.fit <- ages(data = data)


## Estimate the essential graph with ges
## We specify the phases in order to have a fair comparison of the algorithms
## Without the phases specified it would be easy to find examples
## where each algorithm outperforms the other
score <- new("GaussL0penObsScore", data)
ges.fit <- ges(score, phase = c("forward","backward"), iterate = FALSE)

## Plots
if (require(Rgraphviz)) {
par(mfrow=c(1,3))
plot(ges.fit$essgraph, main="Estimated CPDAG with GES")
plot(ages.fit$essgraph, main="Estimated APDAG with AGES")
plot(gGtrue, main="TrueDAG")
}


## Example 2: ages adds correct orientations: Author --> Goal and Author --> V5

set.seed(50)

p <- 9
n <- 5000
## true DAG:
vars <- c("Author", "Bar", "Ctrl", "Goal", paste0("V",5:9))
gGtrue <- randomDAG(p, prob = 0.5, V = vars)
data = rmvDAG(n, gGtrue)


## Estimate the aggregated PDAG with ages
ages.fit <- ages(data = data)


## Estimate the essential graph with ges
## We specify the phases in order to have a fair comparison of the algorithms
## Without the phases specified it would be easy to find examples
## where each algorithm outperforms the other
score <- new("GaussL0penObsScore", data)
ges.fit <- ges(score, phase = c("forward","backward"), iterate = FALSE)

## Plots
if (require(Rgraphviz)) {
par(mfrow=c(1,3))
plot(ges.fit$essgraph, main="Estimated CPDAG with GES")
plot(ages.fit$essgraph, main="Estimated APDAG with AGES")
plot(gGtrue, main="TrueDAG")
}

## Example 3: ges and ages return the same graph

data(gmG)

data <- gmG8$x

## Estimate the aggregated PDAG with ages
ages.fit <- ages(data = data)


## Estimate the essential graph with ges
score <- new("GaussL0penObsScore", data)
ges.fit <- ges(score)


## Plots
if (require(Rgraphviz)) {
par(mfrow=c(1,3))
plot(ges.fit$essgraph, main="Estimated CPDAG with GES")
plot(ages.fit$essgraph, main="Estimated APDAG with AGES")
plot(gmG8$g, main="TrueDAG")
}

Run the code above in your browser using DataLab