
BayesVarSel (version 1.6.2)

PBvs: Bayesian Variable Selection for linear regression models using parallel computing.

Description

PBvs is a parallelized version of Bvs.

Usage

PBvs(formula, fixed.cov=c("Intercept"), data, prior.betas = "Robust", prior.models = "Constant", n.keep = 10, n.nodes = 2, priorprobs=NULL, time.test=TRUE)

Arguments

formula
Formula defining the most complex regression model in the analysis. See details.
fixed.cov
A character vector with the names of the covariates that will be considered as fixed (no variable selection over these). This argument provides an implicit definition of the simplest model considered. Default is "Intercept". Use NULL if selection should be performed over all the variables defined by formula.
data
data frame containing the data.
prior.betas
Prior distribution for regression parameters within each model. Possible choices include "Robust", "Liangetal", "gZellner", "ZellnerSiow" and "FLS" (see details).
prior.models
Prior distribution over the model space. Possible choices are "Constant", "ScottBerger" and "User" (see details).
n.keep
How many of the most probable models are to be kept?
n.nodes
Number of nodes to be used in the computation.
priorprobs
A p+1 dimensional vector defining the prior probabilities Pr(M_i). Should be used when prior.models="User"; see details.
time.test
If TRUE a preliminary test to estimate computational time is performed.
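
As a hedged illustration of what a priorprobs vector might look like, the sketch below builds a p+1 dimensional vector in base R. It assumes that entry j holds the prior probability of a single model containing j-1 of the p candidate variables; that convention is an assumption here and should be checked against the details section of Bvs. The names p, dims and my.priorprobs are hypothetical.

```r
# Hypothetical number of candidate covariates (variables to select from)
p <- 15
dims <- 0:p

# ASSUMED convention: my.priorprobs[j] is the prior probability of ONE model
# that contains j - 1 of the candidate variables.
# Here every model dimension gets the same total mass 1 / (p + 1), shared
# equally among the choose(p, d) models of that dimension.
my.priorprobs <- 1 / ((p + 1) * choose(p, dims))

# The vector has p + 1 entries, as the argument requires
length(my.priorprobs)

# Under the assumed convention, the mass over the whole model space is 1
sum(choose(p, dims) * my.priorprobs)

# Hypothetical call, not run here:
# PBvs(formula = "y~.", data = UScrime, prior.models = "User",
#      priorprobs = my.priorprobs)
```

The check on the total mass only makes sense under the assumed convention; if the vector is defined differently in Bvs, the construction would need to change accordingly.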

Value

Returns an object of class Bvs with the following elements:
time
The internal time consumed in solving the problem
lmfull
The lm class object that results when the model defined by formula is fitted by lm
lmnull
The lm class object that results when the model defined by fixed.cov is fitted by lm
variables
The name of all the potential explanatory variables (the set of variables to select from).
n
Number of observations
p
Number of explanatory variables to select from
k
Number of fixed variables
HPMbin
The binary expression of the Highest Posterior Probability model
modelsprob
A data.frame which summarizes the n.keep most probable models a posteriori and their associated probabilities.
inclprob
A data.frame with the inclusion probabilities of all the variables.
jointinclprob
A data.frame with the joint inclusion probabilities of all the variables.
postprobdim
Posterior probabilities of the dimension of the true model
call
The call to the function
method
The computation method used; for PBvs this is "parallel"

Details

This function takes advantage of the package parallel to distribute the models in the model space across the available nodes. It is intended for moderately large model spaces (p>=20).

A detailed description of the arguments can be found in the details section in Bvs.
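
To give a feel for what distributing a model space over nodes involves, the following base R sketch partitions the 2^p candidate models into contiguous blocks, one per node. It is only an illustration of the general idea and is not taken from the internals of PBvs; the names n.models, breaks, chunks and model.pattern are hypothetical.

```r
p <- 20          # hypothetical number of candidate covariates
n.nodes <- 2     # same meaning as the n.nodes argument
n.models <- 2^p  # each covariate is either in or out of a model

# Contiguous blocks of model indices, one block per node
breaks <- floor(seq(0, n.models, length.out = n.nodes + 1))
chunks <- data.frame(node  = seq_len(n.nodes),
                     first = breaks[-length(breaks)] + 1,
                     last  = breaks[-1])
chunks

# Binary inclusion pattern of the model with 1-based index i:
# position j is 1 if covariate j is in the model, 0 otherwise
model.pattern <- function(i, p) {
  as.integer(intToBits(as.integer(i - 1)))[seq_len(p)]
}
model.pattern(6, p = 4)  # 1 0 1 0

# Every model belongs to exactly one block
sum(chunks$last - chunks$first + 1) == n.models
```

With p = 20 there are 1048576 models, so each of two nodes would handle 524288 of them. This is why the cost grows quickly with p and why splitting the enumeration across workers can help.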

References

Bayarri, M.J., Berger, J.O., Forte, A. and Garcia-Donato, G. (2012) Criteria for Bayesian Model choice with Application to Variable Selection. The Annals of Statistics. 40: 1550-1577.

Fernandez, C., Ley, E. and Steel, M.F.J. (2001) Benchmark priors for Bayesian model averaging. Journal of Econometrics, 100, 381-427.

Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J.O. (2008) Mixtures of g-priors for Bayesian Variable Selection. Journal of the American Statistical Association. 103:410-423.

Zellner, A. and Siow, A. (1980) Posterior Odds Ratio for Selected Regression Hypotheses. In Bayesian Statistics 1 (J.M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 585-603. Valencia: University Press.

Zellner, A. and Siow, A. (1984). Basic Issues in Econometrics. Chicago: University of Chicago Press.

Zellner, A. (1986) On Assessing Prior Distributions and Bayesian Regression Analysis with g-prior Distributions. In Bayesian Inference and Decision techniques: Essays in Honor of Bruno de Finetti (A. Zellner, ed.) 389-399. Edward Elgar Publishing Limited.

See Also

plotBvs for different descriptive plots of the results.

GibbsBvs which implements a heuristic approximation to the problem based on Gibbs sampling.

Examples

## Not run: 
# #Analysis of Crime Data
# #load data
# 
# data(UScrime)
# 
# #Default arguments are Robust prior for the regression parameters
# #and constant prior over the model space
# #Here we keep the 1000 most probable models a posteriori:
# #The computation over the model space is distributed over two
# #cores:
# crime.Bvs<- PBvs(formula="y~.", data=UScrime, n.keep=1000, 
# n.nodes=2)
# 
# #A look at the results:
# crime.Bvs
# 
# summary(crime.Bvs)
# 
# #An image plot with the joint inclusion
# #probabilities:
# plotBvs(crime.Bvs, option="joint")
## End(Not run)
