epi.clustersize: Sample size for cluster-sample surveys

Description

Estimates the number of clusters to sample in a cluster-sample survey so that prevalence is estimated to within a given relative error at a given level of confidence.

Usage

epi.clustersize(p, b, rho, epsilon.r, conf.level = 0.95)

Arguments

p

the estimated prevalence of the outcome in the population.

b

the number of units sampled per cluster.

rho

the intra-cluster correlation, a measure of the variation between clusters compared with the variation within clusters.

epsilon.r

scalar, the acceptable relative error, that is, the acceptable absolute error divided by p.

conf.level

scalar, defining the level of confidence in the computed result.
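When the precision requirement is stated as an absolute error, it has to be converted before it is passed as epsilon.r. The lines below are a minimal sketch of that conversion, using the figures from Example 2. The names epsilon.a and z are hypothetical and are not arguments of epi.clustersize. Mapping conf.level to a two-sided standard normal quantile is the conventional choice and is assumed here; this page does not state it.

```r
## Illustrative sketch only; epsilon.a and z are hypothetical names.
p <- 0.20                      # expected prevalence
epsilon.a <- 0.05              # acceptable absolute error (5 percentage points)
epsilon.r <- epsilon.a / p     # relative error to pass as epsilon.r

conf.level <- 0.95
## Assumed: two-sided standard normal quantile for the confidence level.
z <- qnorm(1 - (1 - conf.level) / 2)

epsilon.r                      # 0.25
```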

Value

A list containing the following:

clusters

the estimated number of clusters to be sampled.

units

the total number of units to be sampled.

design

the design effect.
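As a rough cross-check on the returned values, the design effect and an approximate cluster count can be worked out by hand. The sketch below assumes the relation D = 1 + (b - 1) * rho, which is the inverse of the rho calculation in Example 1. It also assumes that the simple-random-sample requirement is inflated by D, in the spirit of Bennett et al. (1991). The helper approx.clusters is hypothetical and is not part of epiR. The multiplier and rounding used by epi.clustersize may differ, so the counts need not match its output exactly.

```r
## Hypothetical helper, not part of epiR: approximate number of clusters.
approx.clusters <- function(p, b, rho, epsilon.r, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)        # assumed two-sided normal quantile
  epsilon.a <- epsilon.r * p                  # absolute error
  D <- 1 + (b - 1) * rho                      # design effect
  n.srs <- z^2 * p * (1 - p) / epsilon.a^2    # units needed under simple random sampling
  clusters <- ceiling(n.srs * D / b)          # inflate by D, then divide by cluster size
  list(clusters = clusters, units = clusters * b, design = D)
}

res <- approx.clusters(p = 0.20, b = 20, rho = 0.02, epsilon.r = 0.25)
res$design   # 1.38
```

The list mirrors the clusters, units and design components described above, so the two results can be compared element by element.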

References

Bennett S, Woods T, Liyanage WM, Smith DL (1991). A simplified general method for cluster-sample surveys of health in developing countries. World Health Statistics Quarterly 44: 98-106.

Otte J, Gumm I (1997). Intra-cluster correlation coefficients of 20 infections calculated from the results of cluster-sample surveys. Preventive Veterinary Medicine 31: 147-150.

Examples

## EXAMPLE 1:
## The expected prevalence of disease in a population of cattle is 0.10.
## We wish to conduct a survey, sampling 50 animals per farm. No data  
## are available to provide an estimate of rho, though we suspect
## the intra-cluster correlation for this disease to be moderate.           
## We wish to be 95% certain of being within 10% of the true population
## prevalence of disease. How many herds should be sampled?

## Assume a design effect (D) of 4 and back-calculate rho from D and b:
p <- 0.10; b <- 50; D <- 4
rho <- (D - 1) / (b - 1)
epi.clustersize(p = p, b = b, rho = rho, epsilon.r = 0.10,
   conf.level = 0.95)

## We need to sample 278 herds (13900 samples in total).

## EXAMPLE 2 (from Bennett et al. 1991):
## A cross-sectional study is to be carried out to determine the prevalence
## of a given disease in a population using a two-stage cluster design. We 
## estimate prevalence to be 0.20 and we expect rho to be in the order of 0.02.
## We want to take sufficient samples to be 95% certain that our estimate of
## prevalence is within 5 percentage points of the true population value (that
## is, a relative error of 0.05 / 0.20 = 0.25). Assuming 20 responses from each
## cluster, how many clusters do we need to sample?

epi.clustersize(p = 0.20, b = 20, rho = 0.02, epsilon.r = 0.25, 
   conf.level = 0.95)

## We need to sample 18 clusters (360 samples in total).
