epi.empbayes: Empirical Bayes estimates of observed event counts

Description

Computes empirical Bayes estimates of observed event counts using the method of moments.

Usage

epi.empbayes(obs, pop)

Value

A data frame with four elements: gamma, the mean event risk across all units; phi, the variance of event risk across all units; alpha, the estimated shape parameter of the gamma distribution; and nu, the estimated scale parameter of the gamma distribution.

Arguments

obs: a vector representing the observed event counts in each unit of interest.
pop: a vector representing the population count in each unit of interest.

Details

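The gamma distribution is parameterised in terms of the parameters \(\alpha\) and \(\nu\), labelled shape and scale respectively in this documentation. The mean of the distribution equals \(\nu / \alpha\) and the variance equals \(\nu / \alpha^{2}\). Note that under these formulas \(\nu\) behaves as the shape and \(\alpha\) as the rate (inverse scale) of the usual shape-rate gamma parameterisation, so take care when passing the estimates to other software. The empirical Bayes estimate of event risk in each unit of interest equals \((obs + \nu) / (pop + \alpha)\), which shrinks each unit's crude risk \(obs / pop\) towards the overall mean \(\nu / \alpha\).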
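The relationships above can be sketched in a few lines of base R. This is an illustration only, not the package source: the toy counts and populations are hypothetical, and the population-weighted moment estimator used for the variance is an assumption (one commonly used form), so results from epi.empbayes may differ.

```r
# Hypothetical event counts and populations for five units (not epiR data).
obs <- c(5, 40, 12, 90, 3)
pop <- c(10000, 20000, 15000, 30000, 5000)

# Pooled mean risk across all units and the crude risk in each unit.
gamma <- sum(obs) / sum(pop)
rate <- obs / pop

# Assumed moment estimator of the between-unit variance of risk:
# population-weighted squared deviations minus the expected Poisson noise,
# truncated at zero.
phi <- sum(pop * (rate - gamma)^2) / sum(pop) - gamma / mean(pop)
phi <- max(phi, 0)

# Solve mean = nu / alpha and variance = nu / alpha^2 for alpha and nu.
# (If phi were zero these would be infinite; the toy data avoid that case.)
alpha <- gamma / phi
nu <- gamma^2 / phi

# Empirical Bayes estimate of risk in each unit.
eb <- (obs + nu) / (pop + alpha)

round(cbind(crude = rate, eb = eb) * 100000, 1)
```

Each estimate in eb is a weighted average of the unit's crude risk (weight pop) and the pooled mean (weight alpha), so units with small populations are pulled more strongly towards the pooled mean.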

This technique performs poorly when the data contain a large number of zero event counts. In that situation a Bayesian approach to estimating \(\alpha\) and \(\nu\) is advised.
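A quick way to gauge whether this caveat applies is to check what share of units have zero counts before fitting. The sketch below uses a hypothetical obs vector; how large a share counts as "large" is a judgement call that this documentation does not specify.

```r
# Hypothetical event counts for eight units (not epiR data).
obs <- c(0, 0, 3, 0, 1, 0, 7, 0)

# Proportion of units with no observed events.
prop.zero <- mean(obs == 0)
prop.zero
```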

References

Bailey TC, Gatrell AC (1995). Interactive Spatial Data Analysis. Longman Scientific & Technical. London, pp. 303-308.

Langford IH (1994). Using empirical Bayes estimates in the geographical analysis of disease risk. Area 26: 142-149.

Meza J (2003). Empirical Bayes estimation smoothing of relative risks in disease mapping. Journal of Statistical Planning and Inference 112: 43-62.

Examples

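## EXAMPLE 1:
library(epiR)
data(epi.SClip)
obs <- epi.SClip$cases; pop <- epi.SClip$population

est <- epi.empbayes(obs, pop)

## Crude incidence rate per 100,000 and its rank across districts:
crude.p <- ((obs) / (pop)) * 100000
crude.r <- rank(crude.p)

## Extract nu and alpha by name rather than by position, so each is a 
## plain number whether est is a named vector or a data frame:
ebay.p <- ((obs + est[["nu"]]) / (pop + est[["alpha"]])) * 100000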

dat.df01 <- data.frame(rank = c(crude.r, crude.r), 
   Method = c(rep("Crude", times = length(crude.r)), 
      rep("Empirical Bayes", times = length(crude.r))),
   est = c(crude.p, ebay.p)) 

## Scatter plot showing the crude and empirical Bayes adjusted lip cancer 
## incidence rates as a function of district rank for the crude lip 
## cancer incidence rates: 
                          
if (FALSE) {
library(ggplot2)

ggplot(data = dat.df01, aes(x = rank, y = est, colour = Method)) +
  theme_bw() +
  geom_point() +
  scale_x_continuous(name = "District rank", 
     breaks = seq(from = 0, to = 60, by = 10), 
     labels = seq(from = 0, to = 60, by = 10), 
     limits = c(0,60)) +
  scale_y_continuous(limits = c(0,30), 
     name = "Lip cancer incidence rates\n(cases per 100,000 person years)") 
}
