cpquery(fitted, event, evidence, cluster = NULL, method = "ls", ...,
  debug = FALSE)
cpdist(fitted, nodes, evidence, cluster = NULL, method = "ls", ...,
  debug = FALSE)
mutilated(x, evidence)

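fitted: an object of class bn.fit.
x: an object of class bn or bn.fit.
event, evidence: see below.
nodes: a vector of character strings, the labels of the nodes whose conditional distribution is of interest.
cluster: an optional cluster object from package parallel. See parallel integration for details and a simple example.
method: a character string, the method used to perform the conditional probability query. Currently only logic sampling (ls, the default) and likelihood weighting (lw) are implemented.
...: additional tuning parameters.
debug: a boolean value. If TRUE a lot of debugging output is printed; otherwise the function is completely silent.

cpquery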
returns a numeric value, the conditional probability of event conditional on evidence.

cpdist returns a data frame containing the observations generated from the conditional distribution of the nodes conditional on evidence. The data frame has class c("bn.cpdist", "data.frame"), and a method attribute storing the value of the method argument. In the case of likelihood weighting, the weights are also attached as an attribute called weights.

mutilated returns a bn or bn.fit object, depending on the class of x.

For logic sampling (method = "ls"), the event
and evidence arguments must be two expressions describing the event of interest and the conditioning evidence in a format such that, if we denote with data the data set the network was learned from, data[evidence, ] and data[event, ] return the correct observations. If either event or evidence is set to TRUE an unconditional probability query is performed with respect to that argument.

Three tuning parameters are available:

n: a positive integer number, the number of random observations to generate from fitted. The default value is 5000 * log10(nparams.fitted(fitted)) for discrete and conditional Gaussian networks and 500 * nparams.fitted(fitted) for Gaussian networks.
batch: a positive integer number, the size of each batch of random observations. Defaults to 10^4.
query.nodes: a vector of character strings, the labels of the nodes involved in event and evidence. Simple queries do not require generating observations from all the nodes in the network, so cpquery and cpdist try to identify which nodes are used in event and evidence and reduce the network to their upper closure. query.nodes may be used to manually specify these nodes when automatic identification fails; there is no reason to use it otherwise.

Note that the number of observations returned by cpdist is always smaller than n, because logic sampling is a form of rejection sampling.
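
The rejection step can be seen by comparing the number of rows returned by cpdist with n. The following is a minimal sketch, assuming the learning.test data set shipped with bnlearn; the seed and the value n = 10000 are arbitrary choices made for illustration.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

set.seed(42)
n = 10000
# generate n observations and keep only those matching the evidence.
sim = cpdist(fitted, nodes = "A", evidence = (C == "c"), n = n)
# fewer than n rows come back...
nrow(sim)
# ...and the retained fraction is a rough estimate of P(C == "c").
nrow(sim) / n
# for comparison, the empirical frequency of the evidence in the data.
mean(learning.test$C == "c")
```

When the evidence is rare, very few observations survive, which is one reason to increase n or to consider likelihood weighting instead.
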
In other words, only the observations matching evidence (out of the n that are generated) are returned, and their number depends on the probability of evidence.

For likelihood weighting (method = "lw"), the event
argument must be an expression describing the event of interest, as in logic sampling. The evidence argument must be a named list in which each element is named after a node of the network and contains the value that node is set to when sampling. If either event or evidence is set to TRUE an unconditional probability query is performed with respect to that argument.

Tuning parameters are the same as for logic sampling: n, batch and query.nodes.

Note that the observations returned by cpdist are generated from the mutilated network, and need to be weighted appropriately when computing summary statistics (for more details, see the references below).
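
One way to apply the weights is to sum them within each level of the sampled node and normalise by their total. The following is a minimal sketch, assuming the learning.test data set shipped with bnlearn; the evidence B == "b" and the seed are arbitrary choices made for illustration.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

set.seed(42)
# sample A from the mutilated network in which B is fixed to "b".
sim = cpdist(fitted, nodes = "A", evidence = list(B = "b"), method = "lw")
# the weights are stored as an attribute of the returned data frame.
w = attr(sim, "weights")

# raw frequencies ignore the weights and may be biased.
prop.table(table(sim$A))
# weighted estimate of the distribution of A given B == "b".
weighted = tapply(w, sim$A, sum) / sum(w)
weighted
```

The same weights can be used for other summaries, for instance weighted.mean(x, w) for a continuous node.
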
cpquery applies this weighting automatically when computing the final conditional probability. Also note that the batch argument is ignored in cpdist for speed and memory efficiency.

cpquery
estimates the conditional probability of event given evidence using the method specified in the method argument.

cpdist generates random observations conditional on the evidence using the method specified in the method argument.

mutilated constructs the mutilated network used for sampling in likelihood weighting.

Note that both cpquery and cpdist are based on Monte Carlo particle filters, and therefore they may return slightly different values on different runs.

## discrete Bayesian network (it is the same with ordinal nodes).
data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)
# the result should be around 0.025.
cpquery(fitted, (B == "b"), (A == "a"))
# for a single observation, predict the value of a single
# variable conditional on the others.
var = names(learning.test)
obs = 2
# build the evidence expression from all the variables except the third.
str = paste("(", var[-3], " == '",
        sapply(learning.test[obs, -3], as.character), "')",
        sep = "", collapse = " & ")
str
# build the event expression from the third variable.
str2 = paste("(", var[3], " == '",
         as.character(learning.test[obs, 3]), "')", sep = "")
str2
cpquery(fitted, eval(parse(text = str2)), eval(parse(text = str)))
# do the same with likelihood weighting.
cpquery(fitted, event = eval(parse(text = str2)),
  evidence = as.list(learning.test[obs, -3]), method = "lw")
# conditional distribution of A given C == "c".
table(cpdist(fitted, "A", (C == "c")))
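
The mutilated network used by likelihood weighting can also be built and inspected directly. The following is a minimal sketch, assuming the same learning.test network as above; the evidence B == "b" is picked only for illustration.

```r
library(bnlearn)

data(learning.test)
fitted = bn.fit(hc(learning.test), learning.test)

# fix B to "b", as likelihood weighting does before sampling.
mut = mutilated(fitted, evidence = list(B = "b"))

# the result has the same class as the first argument.
class(mut)
# the arcs pointing into B are removed, so B has no parents left.
mut$B$parents
# the local distribution of B after fixing its value.
mut$B
```
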
## Gaussian Bayesian network.
data(gaussian.test)
fitted = bn.fit(hc(gaussian.test), gaussian.test)
# the result should be around 0.04.
cpquery(fitted,
  event = ((A >= 0) & (A <= 1)) & ((B >= 0) & (B <= 3)),
  evidence = (C + D < 10))
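
Because both cpquery and cpdist rely on random sampling, repeated calls give slightly different estimates. The following is a minimal sketch of how a run can be made repeatable, assuming no cluster is used so that all random numbers come from R's own generator; the seeds and the simplified event are arbitrary choices.

```r
library(bnlearn)

data(gaussian.test)
fitted = bn.fit(hc(gaussian.test), gaussian.test)

# resetting the seed before each call reproduces the same estimate.
set.seed(123)
p1 = cpquery(fitted, event = (A >= 0) & (A <= 1), evidence = (C + D < 10))
set.seed(123)
p2 = cpquery(fitted, event = (A >= 0) & (A <= 1), evidence = (C + D < 10))
all.equal(p1, p2)

# a different seed gives a nearby but generally different estimate.
set.seed(456)
p3 = cpquery(fitted, event = (A >= 0) & (A <= 1), evidence = (C + D < 10))
c(p1, p3)
```

Increasing the tuning parameter n reduces the variability between runs at the cost of longer running times.
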