BayesFactor: Bayes factors and posterior probabilities for linear regression models

Description

Computes the Bayes factors and posterior probabilities of a list of linear regression models proposed to explain a common response variable over the same dataset.

Usage

BayesFactor(models, data, prior.betas = "Robust",  prior.models = "Constant", priorprobs=NULL)

Arguments

models

A named list with the entertained models defined with their corresponding formulas. One model must be nested in all the others.

data

Data frame containing the data.

prior.betas

Prior distribution for regression parameters within each model. Possible choices include "Robust", "Liangetal", "gZellner", "ZellnerSiow" and "FLS" (see details).

prior.models

Prior probabilities of the models. Possible choices are "Constant" and "User" (see details).

priorprobs

A named list (same length and names as in argument models) with the prior probabilities of the models.

Value

BFio: A vector with the Bayes factor of each model to the simplest model.
PostProbi: A vector with the posterior probabilities of each model.
models: A list with the entertained models.
nullmodel: The position of the simplest model.

Details

The Bayes factors, Bi, are expressed relative to the simplest model (the one nested in all the others). The posterior probabilities of the entertained models are then obtained as

Pr(Mi | data)=Pr(Mi)*Bi/C,

where Pr(Mi) is the prior probability of model Mi and C is the normalizing constant, that is, the sum of Pr(Mj)*Bj over all the entertained models.
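
As a minimal sketch of the formula above (not the package's internal code), the posterior probabilities can be computed in base R from a vector of Bayes factors and a vector of prior probabilities. The helper name post_probs and the numeric Bayes factors are hypothetical and serve only as an illustration.

```r
# Hypothetical helper: posterior model probabilities from Bayes factors.
# bf:    Bayes factors Bi of each model against the simplest model
# prior: prior probabilities Pr(Mi), possibly unnormalized
post_probs <- function(bf, prior = rep(1, length(bf))) {
  stopifnot(length(bf) == length(prior), all(bf >= 0), all(prior >= 0))
  w <- prior * bf   # Pr(Mi) * Bi
  w / sum(w)        # divide by the normalizing constant C
}

# Made-up Bayes factors; the simplest model has Bi = 1 by construction.
bf <- c(basemodel = 1, modelA = 3, modelB = 6)
pp <- post_probs(bf)   # equal prior probabilities
print(pp)
```

Because the result is divided by the sum of the weights, rescaling the prior vector by any positive constant leaves the posterior probabilities unchanged.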

The Bayes factor Bi depends on the prior assigned to the regression parameters in Mi.

BayesFactor implements a number of popular choices plus the "Robust" prior recently proposed by Bayarri et al. (2012). The "Robust" prior is the default choice for both theoretical reasons (see the reference for details) and computational reasons, since it produces Bayes factors with closed-form expressions. The "gZellner" prior corresponds to the prior in Zellner (1986) with g=n, while the "Liangetal" prior is the hyper-g/n prior with a=3 (see the original paper, Liang et al. 2008, for details). "ZellnerSiow" is the multivariate Cauchy prior proposed by Zellner and Siow (1980, 1984) and further studied by Bayarri and Garcia-Donato (2007). Finally, "FLS" is the prior recommended by Fernandez, Ley and Steel (2001), which is the prior in Zellner (1986) with g=max(n, p*p), where p is the difference between the dimension of the most complex model and that of the simplest one.
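
To make the role of g concrete, the following base R sketch evaluates a fixed-g Bayes factor on the log scale. It assumes the standard closed form for Zellner's g-prior given in Liang et al. (2008), written here for a general nested null model: Bi = (1+g)^((n-ki)/2) / (1+g*Q)^((n-k0)/2), where ki and k0 are the dimensions of Mi and of the simplest model and Q is the ratio of their residual sums of squares. The function name log_bf_g is hypothetical, the built-in mtcars data stand in for a real problem, and the sketch is not claimed to reproduce the package's output.

```r
# Hypothetical helper: log Bayes factor of `model` against the nested
# `null` model under a fixed-g Zellner prior (assumed closed form).
log_bf_g <- function(model, null, data, g) {
  fit1 <- lm(model, data = data)
  fit0 <- lm(null, data = data)
  n  <- nobs(fit1)
  k1 <- fit1$rank   # dimension of the model
  k0 <- fit0$rank   # dimension of the simplest model
  Q  <- sum(resid(fit1)^2) / sum(resid(fit0)^2)   # ratio of residual sums of squares
  0.5 * (n - k1) * log1p(g) - 0.5 * (n - k0) * log1p(g * Q)
}

n <- nrow(mtcars)
# "gZellner"-style choice: g = n
lbf <- log_bf_g(mpg ~ wt + hp, mpg ~ 1, mtcars, g = n)
# "FLS"-style choice: g = max(n, p^2); here the most complex model
# has p = 2 more coefficients than the simplest one
p <- 2
lbf_fls <- log_bf_g(mpg ~ wt + hp, mpg ~ 1, mtcars, g = max(n, p^2))
# The simplest model against itself has Bayes factor 1 (log Bayes factor 0)
lbf_null <- log_bf_g(mpg ~ 1, mpg ~ 1, mtcars, g = n)
print(c(gZellner = lbf, FLS = lbf_fls, null = lbf_null))
```

With n = 32 and p = 2 the two choices of g coincide, so both calls return the same value; they differ only when p*p exceeds n.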

With respect to the prior over the model space, Pr(Mi), two possibilities are implemented: "Constant", under which every model has the same prior probability, and "User". With the latter option, the prior probabilities are defined through the named list priorprobs. These probabilities can be given unnormalized.
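
The following base R lines sketch what "unnormalized" means in practice: weights such as 1, 1, 4, 1, 1 describe the same prior as 1/8, 1/8, 1/2, 1/8, 1/8. The rescaling shown is an illustration of the idea, not the package's internal code; the model names are those used in the Examples section.

```r
# Unnormalized prior weights, in the form accepted by priorprobs
w <- c(basemodel = 1, ProbTimemodel = 1, ProbPolmodel = 4,
       ProbSomodel = 1, fullmodel = 1)

# Dividing by the total turns the weights into probabilities
prior <- w / sum(w)
print(prior)
```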

Limitations: the error "A Bayes factor is infinite." Bayes factors can be extremely large numbers if (i) the sample size is even moderately large or (ii) a model fits much better than the model taken as the null model. We are currently working on more robust implementations of the functions to handle these problems. In the meantime, you could try the "gZellner" prior (the simplest one; in these cases results should not vary much with the prior) and/or a more accurate definition of the simplest model.
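
The overflow behind this error is easy to reproduce in base R. The sketch below uses made-up log Bayes factors to show that exponentiating them yields Inf, and that posterior probabilities can still be obtained by subtracting the maximum before exponentiating. This is a general numerical device shown for illustration only; it is not a feature of BayesFactor, and the values are hypothetical.

```r
# Hypothetical log Bayes factors, too large to exponentiate directly
log_bf <- c(basemodel = 0, modelA = 750, modelB = 760)
prior  <- c(1, 1, 1) / 3

bf <- exp(log_bf)   # modelA and modelB overflow to Inf
print(bf)

# Work on the log scale and subtract the maximum before exponentiating
log_w <- log(prior) + log_bf
w     <- exp(log_w - max(log_w))
post  <- w / sum(w)
print(post)
```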

References

Bayarri, M.J., Berger, J.O., Forte, A. and Garcia-Donato, G. (2012) Criteria for Bayesian Model choice with Application to Variable Selection. The Annals of Statistics, 40, 1550-1577.

Bayarri, M.J. and Garcia-Donato, G. (2007) Extending conventional priors for testing general hypotheses in linear models. Biometrika, 94, 135-152.

Barbieri, M. and Berger, J. (2004) Optimal Predictive Model Selection. The Annals of Statistics, 32, 870-897.

Fernandez, C., Ley, E. and Steel, M.F.J. (2001) Benchmark priors for Bayesian model averaging. Journal of Econometrics, 100, 381-427.

Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J.O. (2008) Mixtures of g-priors for Bayesian Variable Selection. Journal of the American Statistical Association, 103, 410-423.

Zellner, A. and Siow, A. (1980) Posterior Odds Ratio for Selected Regression Hypotheses. In Bayesian Statistics 1 (J.M. Bernardo, M.H. DeGroot, D.V. Lindley and A.F.M. Smith, eds.) 585-603. Valencia: University Press.

Zellner, A. and Siow, A. (1984) Basic Issues in Econometrics. Chicago: University of Chicago Press.

Zellner, A. (1986) On Assessing Prior Distributions and Bayesian Regression Analysis with g-prior Distributions. In Bayesian Inference and Decision techniques: Essays in Honor of Bruno de Finetti (A. Zellner, ed.) 389-399. Edward Elgar Publishing Limited.

Examples


## Not run: 
# #Analysis of Crime Data
# #load data
# data(UScrime)
# #Model selection among the following models: (note model1 is nested in all the others)
# model1<- as.formula("y~1+Prob")
# model2<- as.formula("y~1+Prob+Time")
# model3<- as.formula("y~1+Prob+Po1+Po2")
# model4<- as.formula("y~1+Prob+So")
# model5<- as.formula("y~.")
# 
# #Equal prior probabilities for models:
# crime.BF<- BayesFactor(models=list(basemodel=model1, 
# 	ProbTimemodel=model2, ProbPolmodel=model3, 
# 	ProbSomodel=model4, fullmodel=model5), data=UScrime)
# 
# #Another configuration of prior probabilities of models:
# crime.BF2<- BayesFactor(models=list(basemodel=model1, ProbTimemodel=model2, 
# 	ProbPolmodel=model3, ProbSomodel=model4, fullmodel=model5), 
# 	data=UScrime, prior.models = "User", priorprobs=list(basemodel=1/8,
# 	ProbTimemodel=1/8, ProbPolmodel=1/2, ProbSomodel=1/8, fullmodel=1/8))
# #Equivalent call with unnormalized prior probabilities:
# #crime.BF2<- BayesFactor(models=list(basemodel=model1, ProbTimemodel=model2, 
# 	#ProbPolmodel=model3, ProbSomodel=model4, fullmodel=model5), data=UScrime, 
# 	#prior.models = "User", priorprobs=list(basemodel=1, ProbTimemodel=1, 
# 	#ProbPolmodel=4, ProbSomodel=1, fullmodel=1))
# ## End(Not run)
