This function tunes the Markov chain Monte Carlo (MCMC) algorithm used to fit a hierarchical model to ecological data in which the underlying contingency tables can have any number of rows or columns. The user supplies the data and may specify hyperprior values. The function's primary output is a vector of multipliers, called rhos, used to adjust the covariance matrix of the multivariate \(t_4\) distribution used to propose new values of intermediate-level parameters (denoted THETAS).
Tune(fstring, data=NULL, num.runs=12, num.iters=10000,
rho.vec=rep(0.05, ntables),
kappa=10, nu=(mu.dim+6), psi=mu.dim,
mu.vec.0=rep(log((.45/(mu.dim-1))/.55), mu.dim),
mu.vec.cu=runif(mu.dim, -3, 0),
nolocalmode=50, sr.probs=NULL, sr.reps=NULL,
numscans=1, Diri=100, dof=4, debug=1)
String: model formula of contingency tables' column totals versus row totals. Must be in specified format (an R character string and NOT a true R formula). See Details and Examples.
Data frame.
Positive integer: The number of runs of the tuning algorithm, each consisting of num.iters iterations.
Positive integer: The number of iterations in each run of the tuning algorithm.
Vector of dimension \(I\) = number of contingency tables = number of rows in data: initial values of multipliers (usually in (0,1)) to the covariance matrix of the proposal distribution for the draws of the intermediate-level parameters. The purpose of the Tune function is to adjust these values so as to achieve acceptance ratios of between .2 and .5 in the MCMC draws of the THETAs.
Scalar: The diagonal of the covariance matrix for the (normal) hyperprior distribution for the \(\mu\) parameter.
Scalar: The degrees of freedom for the (Inverse-Wishart) hyperprior distribution for the SIGMA parameter.
Scalar: The diagonal of the matrix parameter of the (Inverse-Wishart) hyperprior distribution for the SIGMA parameter.
Vector: mean of the (normal) hyperprior distribution for the \(\mu\) parameter.
Vector of dimension \(R*(C-1)\), where \(R\) and \(C\) are the numbers of rows and columns, respectively, in each contingency table: Optional starting values for the \(\mu\) parameter.
Positive integer: How often an alternative drawing method for the contingency table internal cell counts will be used. Use of default value recommended.
Matrix of dimension \(I\) x \(R\): Each value represents the probability of selecting a particular contingency table's row as the row to be calculated deterministically in (product multinomial) proposals for Metropolis draws of the internal cell counts. For example, if R = 3 and row 3 of sr.probs = c(.1, .5, .4), then in the third contingency table (corresponding to the third row of data), the proposal algorithm for the interior cell counts will calculate the third contingency table's first row deterministically with probability .1, the second row with probability .5, and the third row with probability .4. Use of default (generated internally) recommended.
Matrix of dimension \(I\) x \(R\): Each value represents the number of times the (product multinomial proposal) Metropolis algorithm will be attempted when, in drawing the internal cell counts, the proposal for the corresponding contingency table row is to be calculated deterministically. sr.reps has the same structure as sr.probs, i.e., position [3,1] of sr.reps corresponds to the third contingency table's first row. Use of default (generated internally) recommended.
Positive integer: How often the algorithm to draw the contingency table internal cell counts will be implemented before new values of the other parameters are drawn. Use of default value recommended.
Positive integer: How often a product Dirichlet proposal distribution will be used to draw the contingency table row probability vectors (the THETAS).
Positive integer: The degrees of freedom of the multivariate \(t\) proposal distribution used in drawing the contingency table row probability vectors (the THETAS).
Integer: Akin to verbose in some packages. If set to 1, certain status information (including rough notification regarding the number of iterations completed) will be written to the screen.
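As an illustration of the sr.probs layout described above, the following sketch builds a made-up 3 x 3 matrix (I = 3 tables, R = 3 rows each) and draws which row of the third table would be calculated deterministically. The matrix values, the name sr.probs.demo, and the use of sample() are assumptions for illustration only; Tune generates its own defaults internally and its internal drawing code is not shown in this documentation.

```r
# Hypothetical sr.probs for I = 3 contingency tables with R = 3 rows each.
# Row i holds the selection probabilities for table i; each row sums to 1.
sr.probs.demo <- matrix(c(1/3, 1/3, 1/3,
                          0.2, 0.3, 0.5,
                          0.1, 0.5, 0.4),
                        nrow = 3, byrow = TRUE)

# Each row must be a valid probability vector.
stopifnot(all(abs(rowSums(sr.probs.demo) - 1) < 1e-12))

# Pick the row of table 3 to be calculated deterministically:
# row 1 with probability .1, row 2 with .5, row 3 with .4.
set.seed(1)
det.row <- sample(seq_len(3), size = 1, prob = sr.probs.demo[3, ])
```

A matrix for sr.reps would have the same I x R shape, with positive integer entries instead of probabilities.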
A list with the following elements.
A vector of length I = number of contingency tables: each element of the rhos vector is a multiplier used in the proposal distribution for draws from the conditional posterior of the THETAs, as described above. Feed this vector into the Analyze function.
Matrix of dimension I x num.runs: Each column of acc.t contains the acceptance fractions for the Metropolis-Hastings algorithm, with a multivariate \(t_4\) proposal distribution, used to draw from the conditional posterior of the THETAs. If Tune has worked properly, all elements of the final column of this matrix should be between .2 and .5.
Matrix of dimension I x num.runs: Each column of this matrix contains the acceptance fractions for the Metropolis-Hastings algorithm, with independent Dirichlet proposals, used to draw from the conditional posterior of the THETAs. Tune does not alter this algorithm.
A list of length num.runs: Each element of vld.NNs is a matrix of dimension I by R, with each element of the list corresponding to one of the num.runs sets of num.iters iterations run by Tune. To draw from the conditional posterior of the internal cell counts of a contingency table, the Tune function draws R-1 vectors of length C from multinomial distributions. It then calculates the counts in the remaining row (denote this row as r') deterministically. This procedure can result in negative values in row r', in which case the overall proposal for the interior cell counts is outside the parameter space (and thus invalid). Each matrix of vld.NNs keeps track of the fraction of proposals drawn in this manner that are valid. Each row of such a matrix corresponds to a contingency table. Each column in the matrix corresponds to a row in a contingency table. Each entry specifies the fraction of multinomial proposals that are valid when the specified contingency table row serves as the r' row. For instance, position [5,2] of a vld.NNs matrix holds the fraction of valid proposals for the 5th contingency table when the second contingency table row is the r' row. A value of "NaN" means that Tune chose to use a different (slower) method of drawing the internal cell counts because it suspected that the multinomial method would behave badly.
A list of length num.runs: Same as vld.NNs, except the entries represent the fraction of proposals accepted (instead of the fraction that are in the permissible parameter space).
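The multinomial proposal scheme described above for vld.NNs can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's internal code: the function name propose_counts, its arguments, and the toy totals and probabilities are all hypothetical. It draws R-1 rows from multinomials, fills in row r' from the column totals, and flags the proposal as invalid when row r' contains a negative count.

```r
# Sketch: propose internal cell counts for one R x C table.
# row.tot: length-R row totals; col.tot: length-C column totals;
# theta:   R x C matrix of row probability vectors (each row sums to 1);
# r.det:   index of the row (r') that is calculated deterministically.
propose_counts <- function(row.tot, col.tot, theta, r.det) {
  R <- length(row.tot)
  C <- length(col.tot)
  NN <- matrix(0, nrow = R, ncol = C)
  for (r in setdiff(seq_len(R), r.det)) {
    # Each free row is one multinomial draw of length C.
    NN[r, ] <- rmultinom(1, size = row.tot[r], prob = theta[r, ])[, 1]
  }
  # Row r' is whatever is needed to match the column totals.
  NN[r.det, ] <- col.tot - colSums(NN)
  list(NN = NN, valid = all(NN[r.det, ] >= 0))
}

set.seed(42)
row.tot <- c(60, 40)                 # made-up row totals
col.tot <- c(50, 30, 20)             # made-up column totals (same grand total)
theta   <- rbind(c(0.5, 0.3, 0.2),
                 c(0.5, 0.3, 0.2))

one.prop <- propose_counts(row.tot, col.tot, theta, r.det = 2)

# Fraction of valid proposals when row 2 serves as r';
# analogous in spirit to a single entry of a vld.NNs matrix.
valid.flags <- replicate(1000,
                         propose_counts(row.tot, col.tot, theta, r.det = 2)$valid)
frac.valid <- mean(valid.flags)
```

Because row r' is obtained by subtraction, the proposed table always reproduces the row and column totals; only the sign of the entries in row r' determines validity.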
Tune is a necessary precursor function to Analyze, the workhorse function in fitting the R x C ecological inference model described in Greiner & Quinn (2009). The details of this model are discussed in the documentation accompanying Analyze.
One of the stages of the Gibbs sampler used to fit the Greiner & Quinn ecological inference model involves sampling from the conditional posterior distribution of the vector of probabilities associated with each contingency table (precinct, in voting applications). There are \(R\) separate sets of probabilities (each of which must sum to one) associated with each contingency table. Each such \(\theta_r\) undergoes a multidimensional logistic transformation, using the last (right-most) column as the reference category. This results in \(R\) transformed vectors of dimension \((C-1)\); the transformed vectors, denoted \(\omega_r\), are stacked to form a single \(\omega\) vector corresponding to that contingency table. The \(\omega\) vectors are assumed to be i.i.d. draws from a multivariate normal distribution.
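The transformation described above can be sketched in a few lines. The function names theta_to_omega and omega_to_theta, the toy probabilities, and the row-by-row stacking order are assumptions made for illustration; the package's own implementation is not shown in this documentation.

```r
# Multidimensional logistic transform of one probability vector,
# using the last (right-most) category as the reference.
theta_to_omega <- function(theta.r) {
  C <- length(theta.r)
  log(theta.r[-C] / theta.r[C])
}

# Inverse transform: recover the probability vector from omega.r.
omega_to_theta <- function(omega.r) {
  e <- exp(c(omega.r, 0))   # the reference category has log-ratio 0
  e / sum(e)
}

# Made-up 2 x 3 example: R = 2 rows of probabilities, C = 3 columns.
theta <- rbind(c(0.6, 0.3, 0.1),
               c(0.2, 0.5, 0.3))

# apply() returns one column per row of theta, so as.vector() stacks
# omega_1 on top of omega_2, giving a vector of length R * (C - 1).
omega <- as.vector(apply(theta, 1, theta_to_omega))
```

Here omega has length 2 * (3 - 1) = 4, and applying omega_to_theta to its first two elements recovers the first row of theta.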
The posterior distribution of the THETAs/OMEGAs is in non-standard form. To sample from the posterior, the algorithm uses a Metropolis-Hastings step with a multivariate \(t_4\) proposal distribution. The covariance matrix of this multivariate \(t_4\) must be expanded or shrunk to achieve acceptance ratios of between .2 and .5. Tune implements num.runs sets of num.iters iterations of the Gibbs sampler. At the end of each set of iterations, Tune examines the acceptance ratios in each precinct and adjusts a shrinkage factor (a scalar that multiplies the covariance matrix of the \(t_4\) proposal) upwards or downwards. When finished, Tune returns a vector of length I = the number of contingency tables in data. This vector, called rhos, should be fed into the Analyze function. See Examples here and in the documentation accompanying Analyze.
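The adjustment step described above can be pictured with a small sketch. The exact rule Tune applies is not documented here, so the function adjust_rho, its step size, and the acceptance fractions below are purely hypothetical; the sketch only shows the general idea that a multiplier is lowered when acceptance is too rare (proposals too wide) and raised when acceptance is too frequent (proposals too narrow).

```r
# Hypothetical adjustment rule, NOT the rule used inside Tune.
# rho: current multipliers, one per contingency table;
# acc: acceptance fractions from the latest run, one per table.
adjust_rho <- function(rho, acc, lower = 0.2, upper = 0.5, step = 1.5) {
  too.low  <- acc < lower   # proposals too wide: shrink the covariance
  too.high <- acc > upper   # proposals too narrow: expand the covariance
  rho[too.low]  <- rho[too.low] / step
  rho[too.high] <- rho[too.high] * step
  rho
}

rho <- rep(0.05, 3)          # same starting value as the rho.vec default
acc <- c(0.10, 0.35, 0.80)   # made-up acceptance fractions
rho.new <- adjust_rho(rho, acc)
```

In this toy run the first multiplier shrinks, the second is left alone because its acceptance fraction is already inside the target range, and the third grows.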
D. James Greiner & Kevin M. Quinn. 2009. "R x C Ecological Inference: Bounds, Correlations, Flexibility, and Transparency of Assumptions." J.R. Statist. Soc. A 172:67-81.
# NOT RUN {
library(RxCEcolInf)
data(stlouis)
Tune.stlouis <- Tune("Bosley, Roberts, Ribaudo, Villa, NoVote ~ bvap, ovap",
data = stlouis,
num.iters = 10000,
num.runs = 15)
# }
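After a tuning run, the advice in the Value section about the final column of acc.t can be checked with a small helper. The helper name tuned_ok and the stand-in matrix acc.demo are hypothetical; the matrix merely mimics the I x num.runs shape described for acc.t, since running Tune itself can take a long time.

```r
# Hypothetical helper: TRUE if every table's acceptance fraction in the
# last tuning run lies between lower and upper (inclusive).
tuned_ok <- function(acc, lower = 0.2, upper = 0.5) {
  last <- acc[, ncol(acc)]
  all(last >= lower & last <= upper)
}

# Made-up I x num.runs matrix standing in for Tune output (I = 3, num.runs = 2).
acc.demo <- cbind(c(0.05, 0.30, 0.70),
                  c(0.25, 0.33, 0.45))
ok <- tuned_ok(acc.demo)

# With real output one would call, e.g., tuned_ok(Tune.stlouis$acc.t)
```

If the check fails, increasing num.runs or num.iters is a natural thing to try, although this documentation does not prescribe a specific remedy.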