ei.MD.bayes: Multinomial Dirichlet model for Ecological Inference in RxC tables

Description

Implements a version of the hierarchical model suggested in Rosen et al. (2001).

Usage

ei.MD.bayes(formula, covariate = NULL, total = NULL, data, 
            lambda1 = 4, lambda2 = 2, covariate.prior.list = NULL,
            tune.list = NULL, start.list = NULL, sample = 1000, thin = 1, 
            burnin = 1000, verbose = 0, ret.beta = 'r', 
            ret.mcmc = TRUE, usrfun = NULL)

Arguments

formula

A formula of the form cbind(col1, col2, ...) ~ cbind(row1, row2, ...). Column and row marginals must have the same totals.
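As a rough illustration of the formula layout, the sketch below builds a toy data set in which row and column counts share the same total in every unit. The data frame `toy`, its column names, and the probabilities are invented for illustration. The call assumes that ei.MD.bayes is available from the eiPack package and is skipped when that package is not installed.

```r
set.seed(1)

# Hypothetical toy data: 20 units, 2 row categories, 3 column categories.
n_units <- 20
totals  <- sample(200:500, n_units, replace = TRUE)

# Draw row and column counts separately so that both sum to the unit total.
rows <- t(sapply(totals, function(N) rmultinom(1, N, c(0.3, 0.7))))
cols <- t(sapply(totals, function(N) rmultinom(1, N, c(0.4, 0.35, 0.25))))

toy <- data.frame(row1 = rows[, 1], row2 = rows[, 2],
                  col1 = cols[, 1], col2 = cols[, 2], col3 = cols[, 3])

# Row and column marginals must have the same totals in every unit.
same_totals <- all(rowSums(toy[, c("row1", "row2")]) ==
                   rowSums(toy[, c("col1", "col2", "col3")]))

# Assumption: ei.MD.bayes is provided by the eiPack package.
# The short chain is only meant to show the call, not to give usable draws.
if (same_totals && requireNamespace("eiPack", quietly = TRUE)) {
  fit <- eiPack::ei.MD.bayes(cbind(col1, col2, col3) ~ cbind(row1, row2),
                             data = toy, sample = 100, thin = 2, burnin = 100)
}
```

Columns go on the left-hand side of the formula and rows on the right-hand side, matching the form given above.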

covariate

An optional formula of the form ~ covariate. The default is covariate = NULL, which fits the model without a covariate.

total

If row and/or column marginals are given as proportions, total identifies the name of the variable in data containing the total number of individuals in each unit.
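To make the role of total concrete, the following sketch shows how counts are implied by proportions together with a per-unit total. The data frame `props` and its values are made up, and the conversion is done by hand in base R; it is not a description of what the function does internally.

```r
# Hypothetical data with marginals stored as proportions plus a total column.
props <- data.frame(
  row1 = c(0.25, 0.40), row2 = c(0.75, 0.60),
  col1 = c(0.50, 0.30), col2 = c(0.50, 0.70),
  total = c(400, 250)
)

# Implied counts: each proportion times the number of individuals in the unit.
counts <- round(props[, c("row1", "row2", "col1", "col2")] * props$total)

# Row counts and column counts should both add up to the unit total.
rows_ok <- all(counts$row1 + counts$row2 == props$total)
cols_ok <- all(counts$col1 + counts$col2 == props$total)
```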

data

A data frame containing the variables specified in formula and total.

lambda1

The shape parameter for the gamma prior (defaults to 4).

lambda2

The rate parameter for the gamma prior (defaults to 2).

covariate.prior.list

A list containing the parameters for normal prior distributions on delta and gamma for the model with a covariate. See Details for more information.

tune.list

A list containing tuning parameters for each block of parameters. See Details for more information. Typically, this will be a list generated by tuneMD. The default is NULL.

start.list

A list containing starting values for each block of parameters. See Details for more information. The default is start.list = NULL, which generates appropriate random starting values.

sample

Number of draws to be saved from chain and returned as output from the function (defaults to 1000). The total length of the chain is sample*thin + burnin.
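The chain-length formula can be checked with a small calculation. The values of thin and burnin below are illustrative choices, not recommendations, and `n_sample` is a stand-in name for the sample argument.

```r
n_sample <- 1000   # draws saved (the default for sample)
thin     <- 10     # keep every 10th iteration (illustrative value)
burnin   <- 5000   # initial iterations discarded (illustrative value)

# Total number of iterations the sampler runs.
total_iterations <- n_sample * thin + burnin
```

With these values the sampler runs 15000 iterations in total and returns 1000 draws.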

thin

An integer specifying the thinning interval for posterior draws (defaults to 1, but most problems will require a much larger thinning interval).

burnin

An integer specifying the number of initial iterations to be discarded (defaults to 1000, but most problems will require a longer burnin).

verbose

An integer specifying whether the progress of the sampler is printed to the screen (defaults to 0). If verbose is greater than 0, the iteration number is printed to the screen every verbose-th iteration.

ret.beta

A character indicating how the posterior draws of beta should be handled: 'r' to return them as an R object, 's' to save them as .txt.gz files, or 'd' to discard them (defaults to 'r').

ret.mcmc

A logical value indicating how the samples from the posterior should be returned. If TRUE (default), samples are returned as coda mcmc objects. If FALSE, samples are returned as arrays.

usrfun

The name of an optional user-defined function to obtain quantities of interest while drawing from the MCMC chain (defaults to NULL).

Value

A list containing
draws
A list containing samples from the posterior distribution of the parameters. If a covariate is included in the model, the list contains:
- Dr: Posterior draws for Dr parameters as an R $\times$ sample matrix. If ret.mcmc = TRUE, Dr is an mcmc object.
- Beta: Posterior draws for beta parameters. Only returned if ret.beta = 'r'. If ret.mcmc = TRUE, a (R * C * units) $\times$ sample matrix saved as an mcmc object. Otherwise, a R $\times$ C $\times$ units $\times$ sample array.
- Gamma: Posterior draws for gamma parameters. If ret.mcmc = TRUE, a (R * (C - 1)) $\times$ sample matrix saved as an mcmc object. Otherwise, a R $\times$ (C - 1) $\times$ sample array.
- Delta: Posterior draws for delta parameters. If ret.mcmc = TRUE, a (R * (C - 1)) $\times$ sample matrix saved as an mcmc object. Otherwise, a R $\times$ (C - 1) $\times$ sample array.
- Cell.count: Posterior draws for the cell counts, summed across units. If ret.mcmc = TRUE, a (R * C) $\times$ sample matrix saved as an mcmc object. Otherwise, a R $\times$ C $\times$ sample array.
If the model is fit without a covariate, the list includes:
- Alpha: Posterior draws for alpha parameters. If ret.mcmc = TRUE, a (R * C) $\times$ sample matrix saved as an mcmc object. Otherwise, a R $\times$ C $\times$ sample array.
- Beta: Posterior draws for beta parameters. If ret.mcmc = TRUE, a (R * C * units) $\times$ sample matrix saved as an mcmc object. Otherwise, a R $\times$ C $\times$ units $\times$ sample array.
- Cell.count: Posterior draws for the cell counts, summed across units. If ret.mcmc = TRUE, a (R * C) $\times$ sample matrix saved as an mcmc object. Otherwise, a R $\times$ C $\times$ sample array.

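acc.ratios
A list of acceptance ratios for the sampled parameters, with elements such as beta.acc and gamma.acc.
usrfun
Output from the optional usrfun.
call
The call to ei.MD.bayes.

Details

The start.list and tune.list arguments are lists with one named element per block of parameters. The element names include start.betas, start.gamma, and start.delta for starting values, and tune.alpha, tune.beta, tune.gamma, and tune.delta for tuning parameters.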

ei.MD.bayes implements a version of the hierarchical Multinomial-Dirichlet model for ecological inference in $R \times C$ tables suggested by Rosen et al. (2001).

Let $r = 1, \ldots, R$ index rows, $c = 1, \ldots, C$ index columns, and $i = 1, \ldots, n$ index units. Let $N_{\cdot ci}$ be the marginal count for column $c$ in unit $i$ and $X_{ri}$ be the marginal proportion for row $r$ in unit $i$. Finally, let $\beta_{rci}$ be the proportion of row $r$ in column $c$ for unit $i$.

The first stage of the model assumes that the vector of column marginal counts in unit $i$ follows a Multinomial distribution of the form:

$$(N_{\cdot 1i}, \ldots, N_{\cdot Ci}) {\sim} {\rm Multinomial}(N_i,\sum_{r=1}^R \beta_{r1i}X_{ri}, \dots, \sum_{r=1}^R \beta_{rCi}X_{ri})$$

The second stage of the model assumes that the vector of $\beta$ for row $r$ in unit $i$ follows a Dirichlet distribution with $C$ parameters. The model may be fit with or without a covariate.

If the model is fit without a covariate, the distribution of the vector $\beta_{ri}$ is:

$$(\beta_{r1i}, \dots, \beta_{rCi}) {\sim} {\rm Dirichlet}(\alpha_{r1}, \dots, \alpha_{rC})$$

In this case, the prior on each $\alpha_{rc}$ is assumed to be:

$$\alpha_{rc} \sim {\rm Gamma}(\lambda_1, \lambda_2)$$
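The two stages of the model without a covariate can be sketched as a forward simulation for a single unit. This is only an illustration of the distributions above in base R, not the sampler used by the function. The table dimensions, the row proportions `X`, and the total `N` are made-up values, and `rdirichlet1` is a hypothetical helper defined here.

```r
set.seed(42)

n_row <- 2; n_col <- 3        # table dimensions (illustrative)
lambda1 <- 4; lambda2 <- 2    # default shape and rate of the gamma prior

# alpha_rc ~ Gamma(shape = lambda1, rate = lambda2), arranged as an R x C matrix.
alpha <- matrix(rgamma(n_row * n_col, shape = lambda1, rate = lambda2),
                nrow = n_row)

# A Dirichlet draw is a vector of independent Gamma(a, 1) draws
# divided by their sum.
rdirichlet1 <- function(a) {
  g <- rgamma(length(a), shape = a, rate = 1)
  g / sum(g)
}

# Second stage: beta[r, ] ~ Dirichlet(alpha[r, ]) for one unit i.
beta <- t(apply(alpha, 1, rdirichlet1))

# Row proportions X_ri and total N_i for that unit (made-up values).
X <- c(0.3, 0.7)
N <- 500

# Column probabilities: sum over r of beta_rci * X_ri.
p <- as.vector(X %*% beta)

# First stage: column counts ~ Multinomial(N_i, p).
col_counts <- as.vector(rmultinom(1, size = N, prob = p))
```

Because each row of `beta` sums to one and `X` sums to one, the column probabilities `p` also sum to one.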

If the model is fit with a covariate, the distribution of the vector $\beta_{ri}$ is:

$$(\beta_{r1i}, \dots, \beta_{rCi}) {\sim} {\rm Dirichlet}(d_r\exp(\gamma_{r1} + \delta_{r1}Z_i), \dots, d_r\exp(\gamma_{r(C-1)} + \delta_{r(C-1)}Z_i), d_r)$$

The parameters $\gamma_{rC}$ and $\delta_{rC}$ are constrained to be zero for identification. (In this function, the last column entered in the formula is so constrained.)
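As a small numerical sketch of the covariate model, the Dirichlet parameters for one row $r$ and one unit $i$ can be computed as follows. All values are made up, and the names `gamma_r`, `delta_r`, and `Z_i` are chosen here for illustration.

```r
n_col   <- 3
d_r     <- 2.0             # d_r for one row r (made-up value)
gamma_r <- c(0.5, -0.2)    # gamma_{r1}, ..., gamma_{r(C-1)}
delta_r <- c(0.1, 0.3)     # delta_{r1}, ..., delta_{r(C-1)}
Z_i     <- 1.5             # covariate value for unit i

# Append the constrained last column: gamma_{rC} = delta_{rC} = 0.
gamma_full <- c(gamma_r, 0)
delta_full <- c(delta_r, 0)

# Dirichlet parameters d_r * exp(gamma_rc + delta_rc * Z_i), c = 1, ..., C.
dirichlet_par <- d_r * exp(gamma_full + delta_full * Z_i)

# The mean of a Dirichlet is its parameter vector divided by the parameter sum.
prior_mean_beta <- dirichlet_par / sum(dirichlet_par)
```

The zero constraint on the last column is what reduces its Dirichlet parameter to $d_r$.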

Finally, the prior for $d_r$ is:

$$d_r \sim {\rm Gamma}(\lambda_1, \lambda_2)$$

while $\gamma_{rc}$ and $\delta_{rc}$, for $c = 1, \ldots, C-1$, are given improper uniform priors if covariate.prior.list = NULL; otherwise they have independent normal priors of the form:

$$\delta_{rc} \sim {\rm N}(\mu_{\delta_{rc}}, \sigma_{\delta_{rc}}^2)$$

$$\gamma_{rc} \sim {\rm N}(\mu_{\gamma_{rc}}, \sigma_{\gamma_{rc}}^2)$$

If the user wishes to estimate the model with proper normal priors on $\gamma_{rc}$ and $\delta_{rc}$, a list with four elements must be provided for covariate.prior.list:

- mu.delta: an $R \times (C-1)$ matrix of prior means for Delta
- sigma.delta: an $R \times (C-1)$ matrix of prior standard deviations for Delta
- mu.gamma: an $R \times (C-1)$ matrix of prior means for Gamma
- sigma.gamma: an $R \times (C-1)$ matrix of prior standard deviations for Gamma
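A list of this shape could be built as in the sketch below, here for a table with 2 rows and 3 columns. The prior means of 0 and standard deviations of 10 are arbitrary illustrative values, not recommended settings.

```r
n_row <- 2; n_col <- 3   # table dimensions (illustrative)

# Four R x (C - 1) matrices: prior means and standard deviations
# for the delta and gamma parameters.
covariate.prior.list <- list(
  mu.delta    = matrix(0,  nrow = n_row, ncol = n_col - 1),
  sigma.delta = matrix(10, nrow = n_row, ncol = n_col - 1),
  mu.gamma    = matrix(0,  nrow = n_row, ncol = n_col - 1),
  sigma.gamma = matrix(10, nrow = n_row, ncol = n_col - 1)
)
```

The resulting object would then be passed through the covariate.prior.list argument when the model is fit with a covariate.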

References

Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2002. Output Analysis and Diagnostics for MCMC (CODA). http://www-fis.iarc.fr/coda/.

Ori Rosen, Wenxin Jiang, Gary King, and Martin A. Tanner. 2001. "Bayesian and Frequentist Inference for Ecological Inference: The $R \times C$ Case." Statistica Neerlandica 55: 134-156.