Bvs(formula, fixed.cov=c("Intercept"), data, prior.betas = "Robust", prior.models = "Constant", n.keep = 10, time.test = TRUE, priorprobs=NULL)
Arguments

formula: a formula defining the most complex model considered (the full model).
prior.models: the prior over the model space; one of "Constant" (the default), "ScottBerger" or "User" (if "User", the probabilities are supplied through priorprobs; see details.)

Value

Bvs returns an object of class Bvs with the following elements:

- The lm class object that results when the model defined by formula is fitted by lm.
- The lm class object that results when the model defined by fixed.cov is fitted by lm.
- A data.frame which summarizes the n.keep most probable, a posteriori, models and their associated probability.
- A data.frame with the inclusion probabilities of all the variables.
- A data.frame with the joint inclusion probabilities of all the variables.
- The call to the function.

Details

The model space is the set of models, Mi, nested in the full model (that defined by formula). The simplest of such models, M0, contains only the intercept. Then Bvs
provides exact summaries of the posterior distribution over this model space, that is, summaries of the discrete distribution which assigns to each model Mi its probability given the data:

Pr(Mi | data) = Pr(Mi) * Bi / C,
where Bi is the Bayes factor of Mi to M0, Pr(Mi) is the prior probability of Mi and C is the normalizing constant.
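As a sanity check of this normalization, the discrete posterior can be sketched with made-up numbers (plain Python, outside the package; the Bayes factors below are hypothetical illustrative values, not Bvs output):

```python
# Sketch of Pr(Mi | data) = Pr(Mi) * Bi / C with made-up Bayes factors.
# Under the "Constant" prior every Pr(Mi) is equal, so it cancels into C.
bayes_factors = [1.0, 4.0, 0.5, 10.0]        # Bi = Bayes factor of Mi to M0 (hypothetical)
prior = [1 / len(bayes_factors)] * len(bayes_factors)  # constant prior over the 4 models

unnorm = [p * b for p, b in zip(prior, bayes_factors)]
C = sum(unnorm)                              # normalizing constant
posterior = [u / C for u in unnorm]          # sums to 1; best-fitting model dominates

print(posterior)
```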
The Bayes factor Bi depends on the prior assigned to the regression parameters in Mi, and Bvs
implements a number of popular choices plus the "Robust" prior recently proposed by Bayarri et al (2012). The "Robust" prior is the default choice for both theoretical (see the reference for details) and computational reasons, since it produces Bayes factors with closed-form expressions. The "gZellner" prior implemented corresponds to the prior in Zellner (1986) with g = n, while the "Liangetal" prior is the hyper-g/n with a = 3 (see the original paper, Liang et al 2008, for details). "ZellnerSiow" is the multivariate Cauchy prior proposed by Zellner and Siow (1980, 1984) and further studied by Bayarri and Garcia-Donato (2007). Finally, "FLS" is the prior recommended by Fernandez, Ley and Steel (2001), which is the prior in Zellner (1986) with g = max(n, p*p), p being the number of covariates to choose from (the most complex model has p plus the number of fixed covariates).
With respect to the prior over the model space Pr(Mi) three possibilities are implemented: "Constant", under which every model has the same prior probability, "ScottBerger" under which Pr(Mi) is inversely proportional to the number of models of that dimension, and "User" (see below). The "ScottBerger" prior was studied by Scott and Berger (2010) and controls for multiplicity.
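The multiplicity control of "ScottBerger" can be checked arithmetically. The sketch below (Python rather than R, with a hypothetical p) shows that when each model of dimension j gets prior mass proportional to 1/choose(p, j), every dimension receives the same total prior mass, 1/(p+1):

```python
from math import comb

p = 5  # hypothetical number of covariates to choose from
# "ScottBerger": Pr(Mi) proportional to 1/choose(p, j) for a model with j covariates.
unnorm = [1 / comb(p, j) for j in range(p + 1)]        # per-model weight by dimension
C = sum(comb(p, j) * unnorm[j] for j in range(p + 1))  # normalizing constant (= p + 1)
per_model = [w / C for w in unnorm]

# Total prior mass assigned to each dimension is the same: 1/(p+1).
mass = [comb(p, j) * per_model[j] for j in range(p + 1)]
print(mass)
```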
When the parameter prior.models = "User", the prior probabilities are defined through the p+1 dimensional parameter vector priorprobs. Let k be the number of explanatory variables in the simplest model (the one defined by fixed.cov). Then, up to the normalizing constant, the first component of priorprobs must contain the probability of each model with k covariates (there is only one); the second component of priorprobs must contain the probability of each model with k+1 covariates, and so on. Finally, the (p+1)-th component of priorprobs defines the probability of the most complex model (that defined by formula). That is,

priorprobs[j] = C * Pr(Mi such that Mi has j-1+k explanatory variables),

where C is the normalizing constant, i.e. C = 1/sum(priorprobs*choose(p,0:p)).
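The indexing convention and the normalizing constant can be verified with a small computation. This sketch uses Python rather than R and a made-up priorprobs vector; in R the corresponding sum is sum(priorprobs*choose(p,0:p)):

```python
from math import comb

p = 4
priorprobs = [2.0, 1.0, 1.0, 1.0, 3.0]    # hypothetical p+1 unnormalized weights
# priorprobs[j] is shared by each of the choose(p, j) models of that dimension.
total = sum(w * comb(p, j) for j, w in enumerate(priorprobs))
C = 1 / total                              # C = 1/sum(priorprobs*choose(p,0:p))

# Prior probability of any single model of each dimension:
pr_model = [C * w for w in priorprobs]
# Summing over all 2^p models recovers total probability one:
print(sum(comb(p, j) * pr_model[j] for j in range(p + 1)))
```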
Note that prior.models = "Constant" is equivalent to the combination prior.models = "User" and priorprobs = rep(1, (p+1)), but the internal functions are not the same, so small variations in the results may arise from differences in the implementation. Similarly, prior.models = "ScottBerger" is equivalent to the combination prior.models = "User" and priorprobs = 1/choose(p, 0:p).
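Both equivalences reduce to arithmetic and can be checked directly. This is a Python sketch with a hypothetical p; the R vectors rep(1, p+1) and 1/choose(p, 0:p) are mirrored as lists:

```python
from math import comb

p = 6  # hypothetical number of covariates to choose from
n_models = 2 ** p

def normalize(priorprobs):
    # Per-model probability implied by a "User" priorprobs vector.
    C = 1 / sum(w * comb(p, j) for j, w in enumerate(priorprobs))
    return [C * w for w in priorprobs]

# "Constant" == "User" with priorprobs = rep(1, p+1): every model gets 1/2^p.
constant = normalize([1.0] * (p + 1))
assert all(abs(w - 1 / n_models) < 1e-12 for w in constant)

# "ScottBerger" == "User" with priorprobs = 1/choose(p, 0:p):
scott = normalize([1 / comb(p, j) for j in range(p + 1)])
# each dimension then carries equal total mass 1/(p+1).
print([comb(p, j) * scott[j] for j in range(p + 1)])
```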
Limitations: the error "A Bayes factor is infinite.". Bayes factors can be extremely large numbers if i) the sample size is even moderately large, or ii) a model is much better (in terms of fit) than the model taken as the null model. We are currently working on more robust implementations of the functions to handle these problems. In the meantime, you could try using the "gZellner" prior (which is the simplest one and whose results, in these cases, should not vary much with the prior) and/or using a more accurate definition of the simplest model (via the fixed.cov argument).
Bayarri, M.J. and Garcia-Donato, G. (2007). Extending conventional priors for testing general hypotheses in linear models. Biometrika, 94, 135-152.
Fernandez, C., Ley, E. and Steel, M.F.J. (2001). Benchmark priors for Bayesian model averaging. Journal of Econometrics, 100, 381-427.
Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J.O. (2008). Mixtures of g-priors for Bayesian variable selection. Journal of the American Statistical Association, 103, 410-423.
See Also

plotBvs for several plots of the result.
PBvs for a parallelized version of Bvs.
GibbsBvs for a heuristic approximation based on Gibbs sampling (recommended when p > 20, and the only possibility when p > 31).
## Not run:
#Analysis of Crime Data
#load data
data(UScrime)

#Default arguments are Robust prior for the regression parameters
#and constant prior over the model space.
#Here we keep the 1000 most probable models a posteriori:
crime.Bvs <- Bvs(formula = "y~.", data = UScrime, n.keep = 1000)

#A look at the results:
crime.Bvs

summary(crime.Bvs)

#A plot with the posterior probabilities of the dimension of the
#true model:
plotBvs(crime.Bvs, option = "dimension")

#Two image plots of the conditional inclusion probabilities:
plotBvs(crime.Bvs, option = "conditional")
plotBvs(crime.Bvs, option = "not")
## End(Not run)