bnlearn (version 3.1)

cpquery: Perform conditional probability queries

Description

Perform conditional probability queries (CPQs).

Usage

cpquery(fitted, event, evidence, cluster = NULL, method = "ls", ...,
  debug = FALSE)
cpdist(fitted, nodes, evidence, cluster = NULL, method = "ls", ...,
  debug = FALSE)

Arguments

fitted
an object of class bn.fit.
event, evidence
see below.
nodes
a vector of character strings, the labels of the nodes whose conditional distribution we are interested in.
cluster
an optional cluster object from package snow. See snow integration for details and a simple example.
method
a character string, the method used to perform the conditional probability query. Currently only Logic Sampling is implemented.
...
additional tuning parameters.
debug
a boolean value. If TRUE a lot of debugging output is printed; otherwise the function is completely silent.

Value

  • cpquery returns a numeric value, the conditional probability of event conditional on evidence.

  • cpdist returns a data frame containing the observations generated from the conditional distribution of the nodes conditional on evidence.

Logic Sampling

The event and evidence arguments must be two expressions describing the event of interest and the conditioning evidence in a format such that, if we denote with data the data set the network was learned from, data[evidence, ] and data[event, ] return the correct observations. If either parameter is equal to TRUE an unconditional probability query is performed.
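For example, because evidence may simply be TRUE, an unconditional query needs no conditioning expression at all (a minimal sketch using the learning.test data set shipped with bnlearn):

```r
library(bnlearn)

# fit a discrete Bayesian network from the learning.test data.
fitted = bn.fit(hc(learning.test), learning.test)

# P(B == "b"): passing TRUE as the evidence performs an
# unconditional probability query.
cpquery(fitted, event = (B == "b"), evidence = TRUE)
```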

Two tuning parameters are available:

  • n: a positive integer number, the number of random observations to generate from fitted. Defaults to 5000 * nparams(fitted).
  • batch: a positive integer number, the size of each batch of random observations. Defaults to 10^4.
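Both parameters are passed through the ... argument. For instance, raising n yields a more precise probability estimate at the cost of more simulation (a sketch, again assuming the learning.test data from bnlearn):

```r
library(bnlearn)

fitted = bn.fit(hc(learning.test), learning.test)

# generate one million particles instead of the default,
# processing them in batches of 10^5 to limit memory usage.
cpquery(fitted, event = (B == "b"), evidence = (A == "a"),
        n = 10^6, batch = 10^5)
```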

Note that the number of observations returned by cpdist is always smaller than n, because logic sampling is a form of rejection sampling: only the observations matching evidence (out of the n that are generated) are returned, so their number depends on the probability of evidence.
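The effect of the rejection step can be seen directly by comparing n with the number of rows cpdist actually returns (a sketch using the learning.test data set from bnlearn):

```r
library(bnlearn)

fitted = bn.fit(hc(learning.test), learning.test)

# ask for 10^4 particles; only those in which C == "c" survive the
# rejection step, so the data frame has fewer rows than n.
sims = cpdist(fitted, nodes = "A", evidence = (C == "c"), n = 10^4)
nrow(sims)
```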

References

Koller D, Friedman N (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.

Korb K, Nicholson AE (2010). Bayesian Artificial Intelligence. Chapman & Hall/CRC, 2nd edition.

Examples

## discrete Bayesian network.
fitted = bn.fit(hc(learning.test), learning.test)
# the result should be around 0.025.
cpquery(fitted, (B == "b"), (A == "a"))
# for a single observation, predict the value of a single
# variable conditional on the others.
var = names(learning.test)
obs = 2
str = paste("(", names(learning.test)[-3], " == '",
        sapply(learning.test[obs, -3], as.character), "')",
        sep = "", collapse = " & ")
str
str2 = paste("(", names(learning.test)[3], "=='",
         as.character(learning.test[obs, 3]), "')", sep = "")
str2
cpquery(fitted, eval(parse(text = str2)), eval(parse(text = str)))
# conditional distribution of A given C == "c". 
table(cpdist(fitted, "A", (C == "c")))

## Gaussian Bayesian network.
fitted = bn.fit(hc(gaussian.test), gaussian.test)
# the result should be around 0.04.
cpquery(fitted, 
  event = ((A >= 0) & (A <= 1)) & ((B >= 0) & (B <= 3)),
  evidence = (C + D < 10))
