EpiBayesHistorical: Historical Aggregation for Disease Model

Description

This function aggregates cluster-level prevalence results from the disease model fitted by the function EpiBayes_ns across a given number of time periods. It starts with a user-defined prior distribution on the cluster-level prevalence (tau) and updates that prior using the data observed in the first time period. The resulting posterior distribution then serves as the prior distribution for the second time period, and the process repeats for all remaining time periods. We hope to implement a way to incorporate a measure of the risk of disease introduction between time periods; for now, we assume that the disease retains its original properties across all time periods.
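The period-to-period updating described above can be illustrated with a deliberately simplified sketch. It assumes that each cluster is simply recorded as positive or negative and that tau has a beta prior, so each posterior is again a beta distribution. The function itself fits the full EpiBayes_ns model by MCMC, so this is only a stand-in for the idea, and the names `update_tau`, `clusters_pos`, and `clusters_tot` as well as the counts are hypothetical.

```r
# Simplified illustration only: a conjugate beta-binomial update standing in
# for the MCMC fit performed by EpiBayes_ns. All names and counts are made up.
update_tau <- function(prior, n_pos, n_clusters) {
  # Beta(a, b) prior combined with n_pos positive clusters out of n_clusters
  c(prior[1] + n_pos, prior[2] + n_clusters - n_pos)
}

orig_tauparm <- c(1, 1)          # prior assumed before the first time period
clusters_pos <- c(15, 25, 35)    # positive clusters in each period
clusters_tot <- c(60, 60, 60)    # clusters sampled in each period

tauparm <- orig_tauparm
history <- matrix(NA_real_, nrow = length(clusters_pos), ncol = 2)
for (t in seq_along(clusters_pos)) {
  # The posterior from period t becomes the prior for period t + 1
  tauparm <- update_tau(tauparm, clusters_pos[t], clusters_tot[t])
  history[t, ] <- tauparm
}
history  # one row of beta parameters per time period
```

Each row of `history` holds the beta parameters after one period, so the last row reflects the data from all periods combined.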

Usage

EpiBayesHistorical(input.df, orig.tauparm, MCMCreps, burnin = 1000, ...)

Arguments

input.df

Data frame of input values that must be supplied by the user and that will be passed to the function EpiBayes_ns. The data frame has one row per cluster and five columns, in this order: Time Period, Subzone, Cluster Size, Season, and Positive Diagnostic Test Results (y). See Details for more information. Real matrix (sum(k) x 5).

orig.tauparm

The parameters of the beta prior distribution on the cluster-level prevalence (tau) that we assume to hold before the first time period. Real vector (2 x 1).

MCMCreps

Number of iterations in the MCMC chain per replicated data set. Integer scalar.

burnin

Number of MCMC iterations to discard from the beginning of the chain. Integer scalar.

...

Additional arguments that will be passed to EpiBayes_ns. Any argument not supplied here takes its default value.

Value

The returned values are given in a list. They are as follows.

Output	Attributes	Description
`RawPost`	List: Length - (number of periods), Elements - Real arrays (`reps` x `H` x `MCMCreps`)	Posterior distributions for the cluster-level prevalences for each subzone from all time periods
`BetaBusterEst`	List: Length - (number of periods), Elements - Real vectors (2 x 1)	Estimated posterior distributions for the cluster-level prevalences for each subzone from all time periods using moment-matching to the closest beta distribution by the function `epi.betabuster`
`ForOthers`		Various other data not intended to be used by the user, but used to pass information on to the `plot`, `summary`, and `print` methods
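To give a sense of what matching posterior draws to a beta distribution involves, here is a generic method-of-moments sketch in base R. It is not a call to `epi.betabuster` and may differ from what that function does internally; `fit_beta_moments` is a hypothetical helper, and the simulated `draws` stand in for one subzone's posterior draws.

```r
# Generic method-of-moments fit of a beta distribution to draws in (0, 1).
# Hypothetical helper for illustration; not part of EpiBayes or epiR.
fit_beta_moments <- function(draws) {
  m <- mean(draws)
  v <- var(draws)
  # Requires 0 < v < m * (1 - m); otherwise no beta distribution
  # has these two moments
  k <- m * (1 - m) / v - 1
  c(shape1 = m * k, shape2 = (1 - m) * k)
}

set.seed(1)
draws <- rbeta(5000, shape1 = 3, shape2 = 12)  # stand-in for posterior draws
est <- fit_beta_moments(draws)
est  # two beta parameters, comparable in shape to a (2 x 1) vector
```

By construction the fitted beta distribution has the same mean as the draws, since `shape1 / (shape1 + shape2)` reduces to `mean(draws)`.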

Details

The input.df should have the following columns, in this order:

Time Period: vector of codes for unique time periods. For example, could be a vector of periods: c(2015, 2015, 2016, ...).
Subzone: vector of codes for unique subzones. Should be the same for all rows if using one- or two-level sampling. For example, could be a vector of names of a particular subzone: c("CO", "CO", "IN", ...).
Cluster Size: vector of integers denoting the number of subjects within that particular cluster. For example: c(100, 500, 250, ...).
Season: vector of codes for the season in which the observed data were collected. Must adhere to the requirement that (1) denotes Summer, (2) Fall, (3) Winter, and (4) Spring. For example: c(1, 1, 4, ...).
Positive Diagnostic Test Results (y): vector of integers denoting the observed number of positive diagnostic test results within that particular cluster. For example: c(0, 4, 1, ...). Note: if so desired, the user may let the model generate sample data automatically when there is no concrete sample data with which to work.
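The column layout above can be checked before calling the function. The sketch below builds a small data frame from the example values in the list and runs a few basic checks. `check_input_df` is a hypothetical helper, not part of the package, and the rule that positives cannot exceed cluster size is an assumption made for this illustration.

```r
# Hypothetical helper (not part of the package): basic checks on a data
# frame laid out in the column order described above.
check_input_df <- function(input.df) {
  stopifnot(
    is.data.frame(input.df),
    ncol(input.df) == 5,
    all(input.df[[3]] > 0),               # cluster sizes are positive
    all(input.df[[4]] %in% 1:4),          # seasons are coded 1 to 4
    all(input.df[[5]] >= 0),              # counts are non-negative
    all(input.df[[5]] <= input.df[[3]])   # assumption: positives <= size
  )
  invisible(TRUE)
}

example_df <- data.frame(
  period  = c(2015, 2015, 2016),
  subzone = c("CO", "CO", "IN"),
  size    = c(100, 500, 250),
  season  = c(1, 1, 4),
  y       = c(0, 4, 1)
)
check_input_df(example_df)  # stops with an error if a check fails
```

The checks use column positions rather than names because the description above fixes the column order but not the column names.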

Examples

## Construct input data frame with columns Year, Subzone, Cluster size, Season, and Number positives
year = rep(c("Period 1", "Period 2", "Period 3"), c(60, 60, 60))
subz = rep(rep(c("Subzone 1", "Subzone 2"), c(25, 35)), 3)
size = rep(100, 3 * 60)
season = rep(rep(c(1,2), each = 30), 3)
y = matrix(c(
    rep(10, 15), rep(0, 10),  # Period 1: Subzone 1
    rep(0, 35),  # Period 1: Subzone 2
    rep(10, 15), rep(0, 10),  # Period 2: Subzone 1
    rep(10, 10), rep(0, 25),  # Period 2: Subzone 2
    rep(25, 25), # Period 3: Subzone 1
    rep(25, 10), rep(0, 25)  # Period 3: Subzone 2
    ),
    ncol = 1
)

testrun_historical_inputdf = data.frame(year, subz, size, season, y)

testrun_historical = EpiBayesHistorical(
		input.df = testrun_historical_inputdf,
		orig.tauparm = c(1, 1),
		burnin = 1,
		MCMCreps = 5,
		poi = "tau",
		mumodes = matrix(c(
			0.50, 0.70,
			0.50, 0.70,
			0.02, 0.50,
			0.02, 0.50
			), 4, 2, byrow = TRUE
		),
		pi.thresh = 0.05,
		tau.thresh = 0.02,
		gam.thresh = 0.10,
		tau.T = 0,
		poi.lb = 0,
		poi.ub = 1,
		p1 = 0.95,
		psi = 4,
		omegaparm = c(1, 1),
		gamparm = c(1, 1),
		etaparm = c(10, 1),
		thetaparm = c(10, 1)
		)

testrun_historical
plot(testrun_historical)
testrun_historicalsummary = summary(testrun_historical, sumstat = "quantile",
    prob = 0.99, time.labels = c("Period 1", "Period 2", "Period 3"))
testrun_historicalsummary
plot(testrun_historicalsummary)