score: Score of the Bayesian network

Description

Compute the score of the Bayesian network.

Usage

# S4 method for bn
score(x, data, type = NULL, ..., by.node = FALSE, debug = FALSE)
# S4 method for bn.naive
score(x, data, type = NULL, ..., by.node = FALSE, debug = FALSE)
# S4 method for bn.tan
score(x, data, type = NULL, ..., by.node = FALSE, debug = FALSE)
# S3 method for bn
logLik(object, data, ...)
# S3 method for bn
AIC(object, data, ..., k = 1)
# S3 method for bn
BIC(object, data, ...)

Arguments

x, object

an object of class bn.

data

a data frame containing the data that will be used to compute the score of the Bayesian network.

type

a character string, the label of a network score. If none is specified, the default score is the Bayesian Information Criterion for both discrete and continuous data sets. See network scores for details.

by.node

a boolean value. If TRUE and the score is decomposable, the function returns the score terms corresponding to each node; otherwise it returns their sum (the overall score of x).

debug

a boolean value. If TRUE a lot of debugging output is printed; otherwise the function is completely silent.

…

extra arguments from the generic method (for the AIC and logLik functions, currently ignored) or additional tuning parameters (for the score function).

k

a numeric value, the penalty coefficient to be used; the default k = 1 gives the expression used to compute the AIC in the context of scoring Bayesian networks.

Value

For score() with by.node = TRUE, a vector of numeric values, the individual node contributions to the score of the Bayesian network. Otherwise, a single numeric value, the score of the Bayesian network.
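As a minimal sketch of the two return shapes described above, the following compares the per-node terms with the overall score for a decomposable score (BIC). The network structure passed to model2network() is an illustrative choice for the learning.test variables, not something prescribed by this page.

```r
library(bnlearn)

data(learning.test)

# Illustrative fixed structure, so the example does not depend on a
# structure learning algorithm.
dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]")

# One score term per node.
terms = score(dag, learning.test, type = "bic", by.node = TRUE)
print(terms)

# A single number, the score of the whole network.
total = score(dag, learning.test, type = "bic")
print(total)

# BIC is decomposable, so the node terms should add up to the total.
all.equal(sum(terms), total)
```

The per-node form is convenient when comparing two networks that differ only in the parents of a few nodes, since only those terms change.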

Details

Additional arguments of the score() function:

iss: the imaginary sample size used by the Bayesian Dirichlet scores (bde, mbde, bds, bdj). It is also known as “equivalent sample size”. The default value is equal to 1.
iss.mu: the imaginary sample size for the normal component of the normal-Wishart prior in the Bayesian Gaussian score (bge). The default value is 1.
iss.w: the imaginary sample size for the Wishart component of the normal-Wishart prior in the Bayesian Gaussian score (bge). The default value is ncol(data) + 2.
nu: the mean vector of the normal component of the normal-Wishart prior in the Bayesian Gaussian score (bge). The default value is equal to colMeans(data).
l: the number of scores to average in the locally averaged Bayesian Dirichlet score (bdla). The default value is 5.
exp: a list of indexes of experimental observations (those that have been artificially manipulated). Each element of the list must be named after one of the nodes, and must contain a numeric vector with indexes of the observations whose value has been manipulated for that node.
k: the penalty coefficient to be used by the AIC and BIC scores. The default value is 1 for AIC and log(nrow(data))/2 for BIC.
prior: the prior distribution to be used with the various Bayesian Dirichlet scores (bde, mbde, bds, bdj, bdla) and the Bayesian Gaussian score (bge). Possible values are uniform (the default), vsp (the Bayesian variable selection prior, which puts a probability of inclusion on parents), marginal (an independent marginal uniform for each arc) and cs (the Castelo & Siebes prior, which puts an independent prior probability on each arc and direction).
beta: the parameter associated with prior.
- If prior is uniform, beta is ignored.
- If prior is vsp, beta is the probability of inclusion of an additional parent. The default is 1/ncol(data).
- If prior is marginal, beta is the probability of inclusion of an arc. Each direction has a probability of inclusion of beta / 2 and the probability that the arc is not included is therefore 1 - beta. The default value is 0.5, so that arc inclusion and arc exclusion have the same probability.
- If prior is cs, beta is a data frame with columns from, to and prob specifying the prior probability for a set of arcs. A uniform probability distribution is assumed for the remaining arcs.
newdata: the test set whose predictive likelihood will be computed by pred-loglik, pred-loglik-g or pred-loglik-cg. It should be a data frame with the same variables as data.
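The tuning parameters listed above are passed through the ... argument of score(). The sketch below tries three of them (iss, k and newdata) on learning.test; the network structure and the train/test split are illustrative assumptions. The comparison between the AIC and BIC scores assumes that both are the log-likelihood minus k times the number of parameters, which is how the defaults for k read here.

```r
library(bnlearn)

data(learning.test)

# Illustrative fixed structure for the learning.test variables.
dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]")

# iss: BDe with the default imaginary sample size and with a larger one.
bde.default = score(dag, learning.test, type = "bde")
bde.iss10 = score(dag, learning.test, type = "bde", iss = 10)
print(c(default = bde.default, iss10 = bde.iss10))

# k: AIC with the BIC penalty coefficient should match the BIC score,
# under the assumption stated in the text above.
n = nrow(learning.test)
aic.as.bic = score(dag, learning.test, type = "aic", k = log(n) / 2)
bic = score(dag, learning.test, type = "bic")
all.equal(aic.as.bic, bic)

# newdata: predictive log-likelihood of a held-out set. The split below
# (first 4000 rows for training, the rest for testing) is arbitrary.
train = learning.test[1:4000, ]
test = learning.test[4001:5000, ]
pred = score(dag, train, type = "pred-loglik", newdata = test)
print(pred)
```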

Examples

# NOT RUN {
data(learning.test)
res = set.arc(gs(learning.test), "A", "B")
score(res, learning.test, type = "bde")

## let's see score equivalence in action!
res2 = set.arc(gs(learning.test), "B", "A")
score(res2, learning.test, type = "bde")

## K2 score on the other hand is not score equivalent.
score(res, learning.test, type = "k2")
score(res2, learning.test, type = "k2")

## BDe with a prior.
beta = data.frame(from = c("A", "D"), to = c("B", "F"),
         prob = c(0.2, 0.5), stringsAsFactors = FALSE)
score(res, learning.test, type = "bde", prior = "cs", beta = beta)

## equivalent to logLik(res, learning.test)
score(res, learning.test, type = "loglik")

## equivalent to AIC(res, learning.test)
score(res, learning.test, type = "aic")
# }
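The equivalences noted in the comments above can also be checked by hand. This sketch assumes bnlearn's convention that AIC and BIC are the log-likelihood minus a penalty term (the classic definitions rescaled by -2), and uses nparams() to count the parameters; the network structure is again an illustrative choice.

```r
library(bnlearn)

data(learning.test)

# Illustrative fixed structure for the learning.test variables.
dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]")

ll = as.numeric(logLik(dag, learning.test))
p = nparams(dag, learning.test)
n = nrow(learning.test)

# AIC with the default penalty coefficient k = 1.
aic.by.hand = ll - 1 * p
all.equal(aic.by.hand, as.numeric(AIC(dag, learning.test)))

# BIC with penalty coefficient log(n) / 2.
bic.by.hand = ll - log(n) / 2 * p
all.equal(bic.by.hand, as.numeric(BIC(dag, learning.test)))
```

Under this convention larger values indicate a better fit, unlike the usual AIC and BIC in base R, where smaller is better.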
