Learn R Programming

spikeSlabGAM (version 1.1-19)

spikeAndSlab: Set up and sample a spike-and-slab prior model.

Description

This function sets up a spike-and-slab model for variable selection and model choice in generalized additive models and samples its posterior. It uses a blockwise Metropolis-within-Gibbs sampler and the redundant multiplicative parameter expansion described in the reference. This routine is not meant to be called directly by the user -- spikeSlabGAM provides a formula-based interface for specifying models and takes care of (most of) the housekeeping. Sampling of the chains is done in parallel using package parallel. A "SOCK" cluster is set up under Windows to do so (and closed after computations are done, I try to clean up after myself), see makeCluster etc. Use options(mc.cores =<foo>) to set the (maximal) number of processes forked by the parallelization. If options()$mc.cores is unspecified, it is set to 2.

Usage

spikeAndSlab(
  y,
  X,
  family = c("gaussian", "binomial", "poisson"),
  hyperparameters = list(),
  model = list(),
  mcmc = list(),
  start = list()
)

Arguments

y

response

X

design matrix

family

(character) the family of the response, defaults to normal/Gaussian response

hyperparameters

a list of hyperparameters controlling the priors (see details)

model

a list with information about the grouping structure of the model (see details)

mcmc

(optional) list setting arguments for the sampler (see details)

start

(optional) list containing the starting values for \(\beta, \gamma, \tau^2, \sigma^2, w\) and, optionally, the random seed

Value

a list with components:

formula

see arguments

data

see arguments

family

see arguments

y

see arguments

X

see arguments

hyperparameters

see arguments

model

see arguments

mcmc

see arguments

start

see arguments

posteriorPred

a list with entries mu and y containing samples of the expected values and realizations of the response from the posterior predictive

postMeans

a list containing the posterior means of the parameters:

beta

the regression coefficients

alpha

ksi

tau

hypervariances of the penalized model terms

gamma

inclusion indicator variables of the model terms

pV1

\(P(\gamma = 1)\)

w

hyperparameter for gamma

sigma2

error variance (for Gaussian data)

logLik

log likelihood

logPost

log of (unnormalized) posterior

samples

a list containing the posterior samples of the parameters, see above for explanation of the entries

DIC

a vector with \(DIC, pD, \bar{D},\hat{D}\). Usually doesn't make much sense for this kind of model because of the posterior's multimodality.

fitted

a matrix with the posterior mean of the linear predictor in the first column and the posterior mean of the expected response in the second.

runTime

of the sampler, in seconds

Details

Details for model specification:

hyperparameters

w

hyperparameters for the \(Beta\)-prior for \(w\); defaults to c(alphaW = 1, betaW = 1), i.e. a uniform distribution.

tau2

hyperparameters for the \(\Gamma^{-1}\)-prior of the hypervariances \(\tau^2\); defaults to c(a1 = 5, a2 = 25)

gamma

sets \(v_0\), the ratio between the spike and slab variances, defaults to c(v0 = 0.00025)

sigma2

hyperparameters for \(\Gamma^{-1}\)-prior for error variance; defaults to c(b1 = 1e-4, b2 = 1e-4). Only relevant for Gaussian response.

varKsi

variance for prior of \(\xi\), defaults to 1

ksiDF

defaults to 0 for a gaussian prior for \(\xi\), else induces a t-prior for \(\xi\)

with ksiDF degrees of freedom.

model

groupIndicators

a factor that maps the columns of X to the different model terms

H

a matrix containing the hierarchy of the penalized model terms

n

number of observations

q

length of \(\beta\)

scale

scale/weights of the response, defaults to rep(1, n), use this to specify number of trials for binomial data

offset

defaults to rep(0, n)

mcmc

nChains

how many parallel chains to run: defaults to 3

chainLength

how many samples should be generated per chain, defaults to 500

burnin

how many initial iterations should be discarded, defaults to 100

thin

save only every thin-th iteration, defaults to 5

verbose

verbose output and report progress? defaults to TRUE

returnSamples

defaults to TRUE

sampleY

generate samples of y and its conditional expectation from posterior predictive? defaults to FALSE

useRandomStart

use random draw or ridge estimate for beta as starting value? defaults to TRUE, i.e. random starting values.

blocksize

approx. blocksizes of the updates for \(\alpha, \xi\). Defaults to 50 for gaussian responses and 5/15 for non-gaussian responses.

scalemode

how to do term-wise rescaling of subvectors of \(\xi\) in each iteration: 0 means no rescaling, 1 means rescaling s.t. each mean\((|\xi_g|) = 1\), 2 means rescaling s.t. each max\((|\xi_g|) = 1\)

modeSwitching

probability to do P-IWLS with the mode of the proposal set to the current value, which is useful if the chain gets stuck. Defaults to \(0.05\). Increase this if acceptance rates are too low.

reduceRet

don't return data and samples for \(\alpha, \xi, \tau^2\)? defaults to FALSE

start

gamma

starting values for \(\gamma\). Defaults to a vector of 1's if mcmc$useRandomStart is FALSE, otherwise drawn from the prior.

tau2

starting values for \(\tau^2\). Defaults to the mode of the prior if mcmc$useRandomStart is FALSE, otherwise drawn from the prior.

sigma2

starting values for \(\sigma^2\). Only relevant for Gaussian response. Defaults to the variance of the response divided by the number of covariates if mcmc$useRandomStart is FALSE, otherwise drawn from the prior.

w

starting value for \(w\). Defaults to the mean of the prior if mcmc$useRandomStart is FALSE, otherwise drawn from the prior.

seed

Sets RNG seed for reproducible results. Parallel chains are seeded with this seed incremented by the number of the chain.

References

Scheipl, F. (2010) Normal-Mixture-of-Inverse-Gamma Priors for Bayesian Regularization and Model Selection in Structured Additive Regression Models. LMU Munich, Department of Statistics: Technical Reports, No.84 (https://epub.ub.uni-muenchen.de/11785/)