arc.strength: Measure arc strength

Description

Measure the strength of the probabilistic relationships expressed by the arcs of a Bayesian network, and use model averaging to build a network containing only the significant arcs.

Usage

# strength of the arcs present in x.
arc.strength(x, data, criterion = NULL, ..., debug = FALSE)
# strength of all possible arcs, as learned from bootstrapped data.
boot.strength(data, R = 200, m = nrow(data),
  algorithm, algorithm.args = list(), cpdag = TRUE, debug = FALSE)
# strength of all possible arcs, from a list of custom networks.
custom.strength(networks, nodes, weights = NULL, cpdag = TRUE,
  debug = FALSE)
# averaged network structure.
averaged.network(strength, nodes, threshold)

Arguments

x

an object of class bn.

networks

a list, containing either objects of class bn or arc sets (matrices or data frames with two columns, optionally labeled "from" and "to").

data

a data frame containing the data the Bayesian network was learned from.

strength

an object of class bn.strength, see below.

threshold

a numeric value, the minimum strength required for an arc to be included in the averaged network. The default value is the threshold attribute of the strength argument.

nodes

a vector of character strings, the labels of the nodes in the network. In averaged.network, it defaults to the set of the unique node labels in the strength argument.

criterion

a character string, the label of a score function, the label of an independence test or bootstrap. See bnlearn-package for details on the first two possibilities.

R

a positive integer, the number of bootstrap replicates.

m

a positive integer, the size of each bootstrap replicate.

weights

a vector of non-negative numbers, to be used as weights when averaging network structures to compute strength coefficients. If NULL, weights are assumed to be uniform.

cpdag

a boolean value. If TRUE, the equivalence class (represented as a PDAG) is used instead of the network structure itself. This should make it easier to identify score-equivalent arcs.

algorithm

a character string, the learning algorithm to be applied to the bootstrap replicates. Possible values are gs, iamb, fast.iamb, inter.iamb, mmpc, hc and tabu.

algorithm.args

a list of extra arguments to be passed to the learning algorithm.

...

additional tuning parameters for the network score (if criterion is the label of a score function, see score for details), the conditional independence test (currently the only one is B, the number of permutations), or the bootstrap (if criterion is bootstrap, see Details).

debug

a boolean value. If TRUE a lot of debugging output is printed; otherwise the function is completely silent.

Value

arc.strength, boot.strength and custom.strength return an object of class bn.strength; boot.strength and custom.strength also include information about the relative probabilities of arc directions.
averaged.network returns an object of class bn.
See bn.strength class and bn-class for details.

Details

If criterion is a conditional independence test, the strength is a p-value (so the lower the value, the stronger the relationship). The only possible additional parameter is B, the number of permutations to be generated for each permutation test.
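The test-based case can be illustrated with a minimal sketch. It assumes the bnlearn package is installed and reuses the learning.test data set from the Examples section; the choice of hc and of the "x2" test label (Pearson's chi-squared) is only illustrative, not a recommendation.

```r
library(bnlearn)

data(learning.test)
# learn a DAG; hc() is used here only because it returns directed arcs
dag = hc(learning.test)
# strength of each arc as the p-value of a conditional independence test
pvals = arc.strength(dag, learning.test, criterion = "x2")
# sort the arcs from strongest (smallest p-value) to weakest
pvals[order(pvals$strength), ]
```

Since the strengths are p-values, they all lie between 0 and 1, and smaller values indicate stronger arcs.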

If criterion is the label of a score function, the strength is measured by the score gain/loss which would be caused by the arc's removal. There may be additional parameters depending on the choice of the score, see score for details.
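A similar sketch for the score-based case, again assuming bnlearn is installed and using learning.test. The "bic" label is an illustrative choice. The reading of the sign (score without the arc minus score with it, so that more negative values indicate stronger arcs) is an assumption that should be checked against the documentation of the installed version.

```r
library(bnlearn)

data(learning.test)
# learn a DAG by maximising BIC
dag = hc(learning.test, score = "bic")
# strength of each arc as the change in BIC caused by removing it
gains = arc.strength(dag, learning.test, criterion = "bic")
# sort the arcs by strength, most negative first
gains[order(gains$strength), ]
```

Unlike p-values, these strengths are on the scale of the chosen score, so values obtained with different scores are not directly comparable.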

If criterion is bootstrap, the strength is computed as in boot.strength. The additional parameters are R, m, algorithm and algorithm.args; if the latter two are not specified, the values stored in x are used.
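The idea behind strengths computed from a collection of networks can be sketched in base R. This is not bnlearn's implementation: the arc sets, the weights and the helper names freq.directed and summarise.arc are all hypothetical, and the sketch assumes fully directed networks in which no network contains an arc in both directions.

```r
# hypothetical arc sets, one per learned network (e.g. one per bootstrap replicate)
nets = list(
  data.frame(from = c("A", "B"), to = c("B", "C")),
  data.frame(from = c("B", "B"), to = c("A", "C")),
  data.frame(from = "A", to = "B")
)
weights = rep(1, length(nets))  # uniform weights

# weighted frequency of the directed arc from -> to across the networks
freq.directed = function(from, to) {
  present = sapply(nets, function(a) any(a$from == from & a$to == to))
  sum(weights[present]) / sum(weights)
}

# strength ignores direction; direction is conditional on the arc being present
summarise.arc = function(from, to) {
  fwd = freq.directed(from, to)
  bwd = freq.directed(to, from)
  strength = fwd + bwd
  direction = if (strength > 0) fwd / strength else 0
  c(strength = strength, direction = direction)
}

summarise.arc("A", "B")  # present in all three networks, A -> B in two of them
summarise.arc("B", "C")  # present in two networks, always as B -> C
```

Replacing the uniform weights with other non-negative values gives a weighted average of the network structures instead of a simple frequency.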

Model averaging is supported for objects of class bn.strength returned by boot.strength, by custom.strength, or by arc.strength with criterion set to bootstrap. The returned network contains the arcs whose strength is greater than the threshold argument, which defaults to the threshold attribute of the bn.strength object passed to averaged.network.
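The effect of the threshold can be explored with a short sketch, assuming bnlearn is installed. The seed, the number of replicates and the value 0.9 are arbitrary illustrative choices.

```r
library(bnlearn)

data(learning.test)
set.seed(42)  # bootstrap resampling is random; the seed is arbitrary
bs = boot.strength(learning.test, R = 50, algorithm = "hc")

# threshold stored in the bn.strength object, used by default
attr(bs, "threshold")

# averaged network with the default threshold and with a stricter one
avg.default = averaged.network(bs)
avg.strict = averaged.network(bs, threshold = 0.9)
modelstring(avg.default)
```

Comparing the arc sets of avg.default and avg.strict shows which arcs depend on the choice of threshold.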

References

for model averaging and bootstrap strength (confidence):

Friedman N, Goldszmidt M, Wyner A (1999). "Data Analysis with Bayesian Networks: A Bootstrap Approach". In "UAI '99: Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence", pp. 196-20. Morgan Kaufmann.

for the computation of the strength (confidence) significance threshold:

Scutari M, Nagarajan R (2011). "On Identifying Significant Edges in Graphical Models". In "Proceedings of the Workshop 'Probabilistic Problem Solving in Biomedicine' of the 13th Artificial Intelligence in Medicine (AIME) Conference", pp. 15-27.

Examples

data(learning.test)
res = gs(learning.test)
res = set.arc(res, "A", "B")
arc.strength(res, learning.test)
#   from to      strength
# 1    A  B  0.000000e+00
# 2    A  D  0.000000e+00
# 3    B  E 1.024198e-320
# 4    C  D  0.000000e+00
# 5    F  E 3.935648e-245
arcs = boot.strength(learning.test, algorithm = "hc")
arcs[(arcs$strength > 0.85) & (arcs$direction >= 0.5), ]
#    from to strength direction
# 1     A  B        1       0.5
# 3     A  D        1       1.0
# 6     B  A        1       0.5
# 9     B  E        1       1.0
# 13    C  D        1       1.0
# 30    F  E        1       1.0
averaged.network(arcs)
#
#   Random/Generated Bayesian network
#
#   model:
#    [A][C][F][B|A][D|A:C][E|B:F]
#   nodes:                                 6
#   arcs:                                  5
#     undirected arcs:                     0
#     directed arcs:                       5
#   average markov blanket size:           2.33
#   average neighbourhood size:            1.67
#   average branching factor:              0.83
#
#   generation algorithm:                  Model Averaging
#   significance threshold:                0.025

start = random.graph(nodes = names(learning.test), num = 50)
netlist = lapply(start, function(net) {
  hc(learning.test, score = "bde", iss = 10, start = net) })
arcs = custom.strength(netlist, nodes = names(learning.test),
         cpdag = FALSE)
arcs[(arcs$strength > 0.85) & (arcs$direction >= 0.5), ]
#    from to strength direction
# 1     A  B        1      1.00
# 3     A  D        1      1.00
# 9     B  E        1      0.98
# 13    C  D        1      0.96
# 30    F  E        1      0.66
modelstring(averaged.network(arcs))
# [1] "[A][C][F][B|A][D|A:C][E|B:F]"
