This function generates a sample from the posterior distribution of the measurement model of political support. Individual-level covariates may be included in the model. The details of the model are given under `Details'. See also Bullock et al. (2011).
endorse(Y, data, data.village = NA, village = NA, treat = NA,
na.strings = 99, identical.lambda = TRUE,
covariates = FALSE, formula.indiv = NA,
hierarchical = FALSE, formula.village = NA, h = NULL,
group = NULL, x.start = 0, s.start = 0,
beta.start = 1, tau.start = NA, lambda.start = 0,
omega2.start = .1, theta.start = 0, phi2.start = .1,
kappa.start = 0, psi2.start = 1, delta.start = 0,
zeta.start = 0, rho2.start = 1, mu.beta = 0, mu.x = 0,
mu.theta = 0, mu.kappa = 0, mu.delta = 0, mu.zeta = 0,
precision.beta = 0.04, precision.x = 1,
precision.theta = 0.04, precision.kappa = 0.04,
precision.delta = 0.04, precision.zeta = 0.04,
s0.omega2= 1, nu0.omega2 = 10, s0.phi2 = 1,
nu0.phi2 = 10, s0.psi2 = 1, nu0.psi2 = 10,
s0.sig2 = 1, nu0.sig2 = 400, s0.rho2 = 1,
nu0.rho2 = 10, MCMC = 20000, burn = 1000, thin = 1,
mh = TRUE, prop = 0.001, x.sd = TRUE,
tau.out = FALSE, s.out = FALSE, omega2.out = TRUE,
phi2.out = TRUE, psi2.out = TRUE, verbose = TRUE,
seed.store = FALSE, update = FALSE,
update.start = NULL)
a list of the variable names for the responses. It should take the following form:
list(Q1 = c("varnameQ1.1", "varnameQ1.2", ...),...)
.
If treat
is NA
, the first
variable for each question should be the responses of the
control observations while each of the other variables should
correspond to each endorser. treat
should be supplied if
only one variable name is provided for a question in this argument.
If auxiliary information is included, it is assumed that Y
is
coded such that higher values indicate more of the sensitive trait.
data frame containing the individual-level variables.
The cases must be complete, i.e., no NA
's are allowed.
data frame containing the village-level variables.
The cases must be complete, i.e., no NA
's are allowed. If
auxiliary information is included, the data frame should include only
the unique group identifier and the unique identifier for the units at
which prediction is desired. The package does not currently support
the inclusion of covariates in models with auxiliary information.
character. The variable name of the village indicator in the individual-level data. If auxiliary information is included, this should correspond to the variable name of the units at which prediction is desired.
An optional matrix of non negative integers indicating
the treatment
status of each observation and each question.
Rows are observations
and columns are questions. 0 represents the control status while
positive integers indicate treatment statuses.
If treat
is set to NA
, the function generates the
treatment matrix using Y
.
The default is NA
.
a scalar or a vector indicating the values of the
response variable that are to be interpreted as ``Don't Know'' or
``Refused to Answer.'' The value should not be NA
unless
treat
is provided, because NA
's are interpreted as the
response to the question with another endorsement. Default is
99
.
logical. If TRUE
, the model with a common
lambda across questions will be fitted. The default is TRUE
.
logical. If TRUE
, the model includes
individual-level covariates. The default is FALSE
.
a symbolic description specifying the individual level
covariates for the support parameter and the ideal points. The formula
should be one-sided, e.g. ~ Z1 + Z2
.
logical. IF TRUE
, the hierarchical model with
village level predictors will be fitted. The default is FALSE
.
a symbolic description specifying the village level covariates for the support parameter and the ideal points. The formula should be one-sided.
Auxiliary data functionality. Optional named numeric vector with length equal to number of groups. Names correspond to group labels and values correspond to auxiliary moments (i.e. to the known share of the sensitive trait at the group level).
Auxiliary data functionality. Optional character string.
The variable name of the group indicator in the individual-level data
(e.g. group = "county"
).
starting values for the ideal points vector \(x\). If
x.start
is set to a scalar, the starting values for the ideal
points of all respondents will be set to the scalar. If
x.start
is a vector of the same length as the number of
observations, then this vector will be used as the starting
values. The default is 0
.
starting values for the support parameter, \(s_ijk\).
If s.start
is set to a scalar, the starting values for the
support parameter of all respondents and all questions will be the
scalar.
If s.start
is set to a matrix, it should have the same number
of rows as the number of observations and the same number of columns
as the number of questions. Also, the value should be zero for the
control condition.
The default is 0
.
starting values for the question related parameters,
\(\alpha_j\) and \(\beta_j\).
If beta.start
is set to a scalar, the starting values for the
support parameter of all respondents and all questions will be the
scalar.
If beta.start
is set to a matrix, the number
of rows should be the number of questions and the number of columns
should be 2.
The first column
will be the starting values for \(\alpha_j\) and the second column
will be the starting values for \(\beta_j\).
Since the parameter values are constrained to be positive, the starting
values should be also positive.
The default is 1
.
starting values for the cut points in the response
model. If NA
, the function generates the starting values
so that each interval between the cut points is 0.5.
If tau.start
is set to a matrix, the number of rows should be
the same as the number of questions and the number of columns should
be the maximum value of the number of categories in the responses.
The first cut point for each question should be set to 0 while the
last one set to the previous cut point plus 1000.
The default is NA
.
starting values for the coefficients in the
support parameter model, \(\lambda_jk\).
If lambda.start
is set to a scalar, the starting values for
all coefficients will be the scalar.
If lambda.start
is set to a matrix, the number of rows should
be the number of the individual level covariates (plus the number of villages, if the model
is hierarchical),
and the number of columns should be the number of
endorsers (times the number of questions, if the model is with varying lambdas).
The default is 0
.
starting values for the variance of the support
parameters, \(\omega_{jk}^{2}\).
If set to a scalar, the starting values for
\(omega_{jk}^{2}\) will be the diagonal matrix with
the diagonal elements set to the scalar.
If omega2.start
is set to a matrix, the number of rows should
be the number of questions,
while the number of columns should be the same as the number of
endorsers.
The default is .1
.
starting values for the means of the
\(\lambda_{jk}\) for each endorser.
If theta.start
is set to a scalar, the starting values for
all parameters will be the scalar.
If theta.start
is set to a matrix, the number of rows should
be the number of endorsers and the number of columns should be the
dimension of covariates.
The default is 0
.
starting values for the covariance matrices of the
coefficients
of the support parameters, \(\Phi_{k}\).
\(\Phi_{k}\) is assumed to be a diagonal matrix.
If phi2.start
is set to a scalar, the starting values for
all covariance matrices will be the same diagonal matrix with the
diagonal elements set to the scalar.
If phi2.start
is set to a vector, the length should be the
number of endorsers times the dimension of covariates.
The default is .1
.
starting values for the coefficients on village level covariates in the
support parameter model, \(\kappa_k\).
If kappa.start
is set to a scalar, the starting values for
all coefficients will be the scalar.
If kappa.start
is set to a matrix, the number of rows should
be the number of the village level covariates,
and the number of columns should be the number of
endorsers (times the number of questions, if the varying-lambda
model is fitted).
The default is 0
.
starting values for the variance of the village random intercepts in the support
parameter model, \(\psi_{k}^{2}\).
If psi2.start
is set to a scalar, the starting values for
\(\psi_{k}^{2}\) will be the diagonal matrix with
the diagonal elements set to the scalar.
If psi2.start
is set to a vector, its length should
be the number of endorsers (times the number of questions, if the
varying-lambda model is fitted).
The default is .1
.
starting values for the coefficients on individual level covariates in the ideal
point model. Will be used only if covariates = TRUE
.
If delta.start
is set to a scalar, the starting values for
all coefficients will be the scalar.
If delta.start
is set to a vector, the length should be the
dimension of covariates.
The default is 0
.
starting values for the coefficients on village level covariates in the ideal
point model. Will be used only if covariates = TRUE
.
If zeta.start
is set to a scalar, the starting values for
all coefficients will be the scalar.
If zeta.start
is set to a vector, the length should be the
dimension of covariates.
The default is 0
.
numeric. starting values for the variance of the village random intercepts in the ideal point
model, \(\rho^{2}\). The default is 1
.
the mean of the independent Normal prior on the
question related parameters. Can be either a scalar or a matrix of
dimension the number of questions times 2.
The default is 0
.
the mean of the independent Normal prior on the
question related parameters. Can be either a scalar or a vector of
the same length as the number of observations.
The default is 0
.
the mean of the independent Normal prior on the
mean of the coefficients in the support parameter model.
Can be either a scalar or a vector of
the same length as the dimension of covariates.
The default is 0
.
the mean of the independent Normal prior on the
coefficients of village level covariates. Can be either a scalar or a matrix of
dimension the number of covariates times the number of endorsers.
If auxiliary information is included, the value of mu.kappa
will be computed for each group such that the prior probability of
the support parameter taking a positive value is equal to the known
value of h
.
The default is 0
.
the mean of the independent Normal prior on the
the coefficients in the ideal point model.
Can be either a scalar or a vector of
the same length as the dimension of covariates.
The default is 0
.
the mean of the independent Normal prior on the
the coefficients of village level covariates in the ideal point model.
Can be either a scalar or a vector of
the same length as the dimension of covariates.
The default is 0
.
the precisions (inverse variances) of the
independent Normal prior on the
question related parameters. Can be either a scalar or
a 2 \(\times\) 2 diagonal matrix.
The default is 0.04
.
scalar. The known precision of the
independent Normal distribution on the
ideal points.
The default is 1
.
the precisions of the
independent Normal prior on the means of the coefficients
in the support parameter model. Can be either a scalar or
a vector of the same length as the dimension of covariates.
The default is 0.04
.
the precisions of the
independent Normal prior on the coefficients of village level covariates
in the support parameter model. Can be either a scalar or
a vector of the same length as the dimension of covariates.
If auxiliary information is included, the value of precision.kappa
will be fixed to 100000
.
The default is 0.04
.
the precisions of the
independent Normal prior on the the coefficients
in the ideal point model. Can be either a scalar or
a square matrix of the same dimension as the dimension of
covariates.
The default is 0.04
.
the precisions of the
independent Normal prior on the the coefficients of village level covariates
in the ideal point model. Can be either a scalar or
a square matrix of the same dimension as the dimension of
covariates.
The default is 0.04
.
scalar. The scale of the independent scaled
inverse- chi-squared
prior for the variance parameter in the support parameter model.
If auxiliary information is included, the value of s0.omega2
will be fixed to the default.
The default is 1
.
scalar. The degrees of freedom of the independent
scaled inverse-chi-squared
prior for the variance parameter in the support parameter model.
If auxiliary information is included, the value of nu0.omega2
will be fixed to the default.
The default is 10
.
scalar. The scale of the independent
scaled inverse-chi-squared
prior for the variances of the coefficients in
the support parameter model.
The default is 1
.
scalar. The degrees of freedom of the independent
scaled
inverse-chi-squared
prior for the variances of the coefficients in
the support parameter model.
The default is 10
.
scalar. The scale of the independent
scaled inverse-chi-squared
prior for the variances of the village random intercepts in
the support parameter model.
The default is 1
.
scalar. The degrees of freedom of the independent
scaled
inverse-chi-squared
prior for the variances of the village random intercepts in
the support parameter model.
The default is 10
.
scalar. The scale of the independent
scaled inverse-chi-squared
prior for the variance parameter in
the ideal point model.
The default is 1
.
scalar. The degrees of freedom of the independent
scaled
inverse-chi-squared
prior for the variance parameter in the ideal point model.
The default is 400
.
scalar. The scale of the independent
scaled inverse-chi-squared
prior for the variances of the village random intercepts in
the ideal point model.
The default is 1
.
scalar. The degrees of freedom of the independent
scaled
inverse-chi-squared
prior for the variances of the village random intercepts in
the ideal point model.
The default is 10
.
the number of iterations for the sampler. The default is
20000
.
the number of burn-in iterations for the sampler. The
default is 1000
.
the thinning interval used in the simulation. The default
is 1
.
logical. If TRUE
, the Metropolis-Hastings algorithm
is used to sample the cut points in the response model. The default is
TRUE
.
a positive number or a vector consisting of positive
numbers. The length of the vector should be the same as the number of
questions. This argument sets proposal variance for the
Metropolis-Hastings algorithm in sampling the cut points of the
response model. The default is 0.001
.
logical. If TRUE
, the standard deviation of the
ideal points in each draw will be stored. If FALSE
, a sample of
the ideal points will be stored. NOTE: Because storing a sample
takes an enormous amount of memory, this option should be
selected only if the chain is thinned heavily or the data have a small
number of observations.
logical. A switch that determines whether or not to
store the cut points in the response model. The default is
FALSE
.
logical. If TRUE
, the support parameter for each
respondent and each question will be stored.
The default is FALSE
. NOTE: Because storing a sample
takes an enormous amount of memory, this option should be
selected only if the chain is thinned heavily or the data have a small
number of observations.
logical. If TRUE
, the variannce parameter of the support
parameter model will be stored.
The default is TRUE
.
logical. If TRUE
, the variannce parameter of the model
for the coefficients in the support parameter model will be stored.
The default is TRUE
.
logical. If TRUE
, the variance of the village random intercepts
in the support parameter model will be stored.
The default is TRUE
.
logical. A switch that determines whether or not to
print the progress of the chain and Metropolis acceptance ratios for
the cut points of the response model. The default is
TRUE
.
logical. If TRUE
, the seed will be stored in order
to update the chain later. The default is FALSE
.
logical. If TURE
, the function is run to update a chain.
The default is FALSE
.
list. If the function is run to update a chain, the output
object of the previous run should be supplied. The default is NULL
.
An object of class "endorse"
, which is a list containing the following
elements:
an "mcmc"
object. A sample from the posterior distribution
of \(\alpha\) and \(\beta\).
If x.sd = TRUE
, a vector of the standard deviation of
the ideal points in each draw. If x.sd = FALSE
, an mcmc
object that contains a sample from the posterior distribution of the
ideal points.
If s.out = TRUE
, an mcmc object that contains a sample
from the posterior distribution of \(s_{ijk}\). Variable
names are:
s(observation id)(question id)
.
If covariates = TRUE
, an mcmc object that contains
a sample from the posterior distribution of \(\delta\).
If tau.out = TRUE
, an mcmc object that contains a
sample from the posterior distribution of \(\tau\).
an mcmc object. A sample from the posterior distribution
of \(\lambda\). Variable names are:
lambda(question id)(group id).(covariate id)
.
an mcmc object. A sample from the posterior distribution of \(\theta\).
an mcmc object.
an mcmc object.
Note that the posterior sample of all parameters are NOT standardized. In making posterior inference, each parameter should be divided by the standard deviation of x (in the default setting, it is given as "x") or by \sigma^2sigma (in the default setting, it is given as "sigma2").
Also note that \alphaalpha and the intercept in \deltadelta (or, if the model is hierarchical, the intercept in \zetazeta) are not identified. Instead, - \alpha + \beta * \delta_0 - alpha + beta * delta_0 or, if the model is hierarchical, - \alpha + \beta * \zeta_0 - alpha + beta * zeta_0 is identified after either of the above standardization, where \delta_0delta_0 and \zeta_0zeta_0 denote the intercepts. When using the auxiliary data functionality, the following objects are included:
logical value indicating whether estimation incorporates auxiliary moments
integer count of the number of auxiliary moments
The model takes the following form:
Consider an endorsement experiment where we wish to measure the level of support for \(K\) political actors. In the survey, respondents are asked whether or not they support each of \(J\) policies chosen by researchers. Let \(Y_{ij}\) represent respondent \(i\)'s answer to the survey question regarding policy \(j\). Suppose that the response variable \(Y_{ij}\) is the ordered factor variable taking one of \(L_{j}\) levels, i.e., \(Y_{ij} \in \{0, 1, \dots, L_{j} - 1\}\) where \(L_{j} > 1\). We assume that a greater value of \(Y_{ij}\) indicates a greater level of support for policy \(j\). We denote an \(M\) dimensional vector of the observed characteristics of respondent \(i\) by \(Z_i\).
In the experiment, we randomly assign one of \(K\) political actors as an endorser to respondent \(i\)'s question regarding policy \(j\) and denote this treatment variable by \(T_{ij} \in \{0,1,\dots,K\}\). We use \(T_{ij}=0\) to represent the control observations where no political endorsement is attached to the question. Alternatively, one may use the endorsement by a neutral actor as the control group.
The model for the response variable, \(Y_{ij}\), is given by, $$Y_{ij} = l \; {\rm if} \; \tau_{l} < Y^{*}_{ij} \le \tau_{l + 1}, $$ $$Y^{*}_{ij} \; | \; T_{ij} = k \sim \mathcal{N}(- \alpha_{j} + \beta_{j} (x_{i} + s_{ijk}), \; I) $$ where \(l \in \{0, 1, \dots, L_{j} \}, \tau_{0} = -\infty < \tau_{1} = 0 < \tau_{2} < \dots < \tau_{L_{j}} = \infty\). \(\beta_j\)'s are assumed to be positive.
The model for the support parameter, \(s_{ijk}\), is given by if \(T_{ij} \neq 0\), $$ s_{ijk} \sim \mathcal{N}(Z_i^{T} \lambda_{jk}, \; \omega_{jk}^2) $$ with covariates, and $$ s_{ijk} \sim \mathcal{N}(\lambda_{jk}, \; \omega_{jk}^2), $$ without covariates, for \(j = 1, \dots, J, \; k = 1, \dots, K\), and if \(T_{ij} = 0, \; s_{ijk} = 0\).
The \(\lambda\)'s in the support parameter model are modeled in the following hierarchical manner, $$ \lambda_{jk} \sim \mathcal{N}(\theta_k, \; \Phi_k) $$ for \(k = 1, \dots, K\).
If you set identical.lambda = FALSE
and hierarchical = TRUE
,
the model for \(s_{ijk}\) is if \(T_{ij} \neq 0\),
$$
s_{ijk} \sim \mathcal{N}(\lambda^{0}_{jk, village[i]} + Z_i^{T} \lambda_{jk}, \; \omega_{jk}^2)
$$
and
$$
\lambda^{0}_{jk, village[i]} \sim \mathcal{N}(V_{village[i]}^{T} \kappa_{jk}, \; \psi_{jk}^2)
$$
for \(k = 1, \dots, K\) and \(j = 1, \dots, J\). In addition,
\(\lambda\) and \(\kappa\) are modeled in the following
hierarchical manner,
$$
\lambda^{*}_{jk} \sim \mathcal{N}(\theta_k, \; \Phi_k)
$$
for \(k = 1, \dots, K\), where \(\lambda^{*}_{jk} =
(\lambda^{T}_{jk}, \kappa^{T}_{jk})^{T}\).
If you set identical.lambda = TRUE
and hierarchical = TRUE
,
the model for \(s_{ijk}\) is if \(T_{ij} \neq 0\),
$$
s_{ijk} \sim \mathcal{N}(\lambda^{0}_{k, village[i]} + Z_i^{T} \lambda_{k}, \; \omega_{k}^2)
$$
and
$$
\lambda^{0}_{k, village[i]} \sim \mathcal{N}(V_{village[i]}^{T} \kappa_{k}, \; \psi_{k}^2)
$$
for \(k = 1, \dots, K\).
If the covariates are included in the model, the model for the ideal points is given by $$ x_{i} \sim \mathcal{N}(Z_{i}^{T} \delta, \; \sigma_{x}^{2}) $$ for \(i = 1, \dots, N\) where \(\sigma_x^2\) is a known prior variance.
If you set hierarchical = TRUE
,
the model is
$$
x_{i} \sim \mathcal{N}(\delta^{0}_{village[i]} + Z_i^{T} \delta, \; \sigma^2)
$$
and
$$
\delta^{0}_{village[i]} \sim \mathcal{N}(V_{village[i]}^{T} \zeta, \; \rho^2)
$$
for \(k = 1, \dots, K\).
Finally, the following independent prior distributions are placed on unknown parameters, $$ \alpha_j \sim \mathcal{N}(\mu_\alpha, \; \sigma_\alpha^2) $$ for \(j = 1, \dots, J\), $$ \beta_j \sim \mathcal{TN}_{\beta_j > 0}(\mu_\beta, \; \sigma_\beta^2) $$ for \(j = 1, \dots, J\), $$ \delta \sim \mathcal{N}(\mu_\delta, \; \Sigma_\delta), $$ $$ \theta_k \sim \mathcal{N}(\mu_\theta, \; \Sigma_\theta) $$ for \(k = 1, \dots, K\), $$ \omega_{jk}^2 \sim {\rm Inv-}\chi^{2}(\nu_{\omega}^0, \; s_{\omega}^0) $$ for \(j = 1, \dots, J\) and \(k = 1, \dots, K\), and $$ {\rm diag}(\Phi_k) \sim {\rm Inv-}\chi^{2}(\nu_{\Phi}^0, \; s_{\Phi}^0) $$ for \(k = 1, \dots, K\), where \(\Phi_k\) is assumed to be a diagonal matrix.
Bullock, Will, Kosuke Imai, and Jacob N. Shapiro. (2011) “Statistical Analysis of Endorsement Experiments: Measuring Support for Militant Groups in Pakistan,” Political Analysis, Vol. 19, No. 4 (Autumn), pp.363-384.
# NOT RUN {
data(pakistan)
Y <- list(Q1 = c("Polio.a", "Polio.b", "Polio.c", "Polio.d", "Polio.e"),
Q2 = c("FCR.a", "FCR.b", "FCR.c", "FCR.d", "FCR.e"),
Q3 = c("Durand.a", "Durand.b", "Durand.c", "Durand.d",
"Durand.e"),
Q4 = c("Curriculum.a", "Curriculum.b", "Curriculum.c",
"Curriculum.d", "Curriculum.e"))
## Varying-lambda non-hierarchical model without covariates
endorse.out <- endorse(Y = Y, data = pakistan, identical.lambda = FALSE,
covariates = FALSE, hierarchical = FALSE)
## Varying-lambda non-hierarchical model with covariates
indiv.covariates <- formula( ~ female + rural)
endorse.out <- endorse(Y = Y, data = pakistan, identical.lambda = FALSE,
covariates = TRUE,
formula.indiv = indiv.covariates,
hierarchical = FALSE)
## Common-lambda non-hierarchical model with covariates
indiv.covariates <- formula( ~ female + rural)
endorse.out <- endorse(Y = Y, data = pakistan, identical.lambda = TRUE,
covariates = TRUE,
formula.indiv = indiv.covariates,
hierarchical = FALSE)
## Varying-lambda hierarchical model without covariates
div.data <- data.frame(division = sort(unique(pakistan$division)))
div.formula <- formula(~ 1)
endorse.out <- endorse(Y = Y, data = pakistan, data.village = div.data,
village = "division", identical.lambda = FALSE,
covariates = FALSE, hierarchical = TRUE,
formula.village = div.formula)
## Varying-lambda hierarchical model with covariates
endorse.out <- endorse(Y = Y, data = pakistan, data.village = div.data,
village = "division", identical.lambda = FALSE,
covariates = TRUE,
formula.indiv = indiv.covariates,
hierarchical = TRUE,
formula.village = div.formula)
## Common-lambda hierarchical model without covariates
endorse.out <- endorse(Y = Y, data = pakistan, data.village = div.data,
village = "division", identical.lambda = TRUE,
covariates = FALSE, hierarchical = TRUE,
formula.village = div.formula)
## Common-lambda hierarchical model with covariates
endorse.out <- endorse(Y = Y, data = pakistan, data.village = div.data,
village = "division", identical.lambda = TRUE,
covariates = TRUE,
formula.indiv = indiv.covariates,
hierarchical = TRUE,
formula.village = div.formula)
# }
Run the code above in your browser using DataLab