ciNormN: Sample Size for Specified Half-Width of Confidence Interval for Normal Distribution Mean or Difference Between Two Means

Description

Compute the sample size necessary to achieve a specified half-width of a confidence interval for the mean of a normal distribution or the difference between two means, given the estimated standard deviation and confidence level.

Usage

ciNormN(half.width, sigma.hat = 1, conf.level = 0.95, 
    sample.type = ifelse(is.null(n2), "one.sample", "two.sample"), 
    n2 = NULL, round.up = TRUE, n.max = 5000, tol = 1e-07, maxiter = 1000)

Value

When sample.type="one.sample", or sample.type="two.sample" and n2 is not supplied (so equal sample sizes for each group are assumed), the function ciNormN returns a numeric vector of sample sizes. When sample.type="two.sample" and n2 is supplied, the function ciNormN returns a list with two components called n1 and n2, specifying the sample sizes for each group.

Arguments

half.width: numeric vector of (positive) half-widths. Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.
sigma.hat: numeric vector specifying the value(s) of the estimated standard deviation(s).
conf.level: numeric vector of numbers between 0 and 1 indicating the confidence level associated with the confidence interval(s). The default value is conf.level=0.95.
sample.type: character string indicating whether this is a one-sample (sample.type="one.sample") or two-sample (sample.type="two.sample") confidence interval. When sample.type="one.sample", the computed sample size is based on a confidence interval for a single mean. When sample.type="two.sample", the computed sample size is based on a confidence interval for the difference between two means. The default value is sample.type="one.sample" unless the argument n2 is supplied.
n2: numeric vector of sample sizes for group 2. The default value is NULL, in which case it is assumed that the sample sizes for groups 1 and 2 are equal. This argument is ignored when sample.type="one.sample". Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.
round.up: logical scalar indicating whether to round up the values of the computed sample size(s) to the next largest integer. The default value is round.up=TRUE.
n.max: positive integer greater than 1 specifying the maximum sample size for the single group when sample.type="one.sample" or for group 1 when sample.type="two.sample". The default value is n.max=5000.
tol: numeric scalar indicating the tolerance to use in the uniroot search algorithm. The default value is tol=1e-7.
maxiter: positive integer indicating the maximum number of iterations to use in the uniroot search algorithm. The default value is maxiter=1000.
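
The effect of round.up can be illustrated in base R without calling ciNormN. The sketch below assumes that rounding up is equivalent to applying ceiling() to the unrounded result, which is consistent with the rounded and unrounded outputs printed in the Examples section; the variable names are illustrative only.

```r
# Unrounded sample sizes, copied from the round.up = FALSE output
# shown in the Examples section of this page.
n_unrounded <- c(63.897899, 17.832337, 9.325967, 6.352717)

# Assumption: round.up = TRUE behaves like ceiling(), i.e. each value is
# moved up to the next whole number so the half-width target is still met.
n_rounded <- ceiling(n_unrounded)

n_rounded
#[1] 64 18 10  7
```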

Author

Steven P. Millard (EnvStats@ProbStatInfo.com)

Details

If the arguments half.width, n2, sigma.hat, and conf.level are not all the same length, they are replicated to be the same length as the length of the longest argument.
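
The recycling rule above can be pictured with a short base R sketch. This is not the internal code of ciNormN; it only mimics the described behavior with rep_len(), and the list name args and the chosen values are hypothetical.

```r
# Hypothetical inputs of different lengths
args <- list(
  half.width = seq(0.25, 1, by = 0.25),  # length 4
  sigma.hat  = 1,                        # length 1
  conf.level = c(0.90, 0.95)             # length 2
)

# Replicate every argument to the length of the longest one
target_length <- max(lengths(args))
recycled <- lapply(args, rep_len, length.out = target_length)

# One row per sample-size calculation that would be carried out
as.data.frame(recycled)
```

Each row of the resulting data frame corresponds to one combination of half-width, standard deviation, and confidence level for which a sample size would be computed.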

The function ciNormN uses the formulas given in the help file for ciNormHalfWidth for the half-width of the confidence interval to iteratively solve for the sample size. For the two-sample case, the default is to assume equal sample sizes for each group unless the argument n2 is supplied.
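
As a rough illustration of the iterative solution, the sketch below solves for the sample size in base R with uniroot. It assumes the usual Student's t half-width formulas: t(n - 1) * sigma.hat / sqrt(n) for one sample, and t(n1 + n2 - 2) * sigma.hat * sqrt(1/n1 + 1/n2) for two samples with a pooled standard deviation. The function names ci_half_width and solve_sample_size, the lower search bound of 2, and the treatment of n as a continuous value are assumptions made for this sketch, not the actual implementation of ciNormN.

```r
# Half-width of a two-sided confidence interval, treating n as continuous.
# For the two-sample case, n is the size of group 1; if n2 is NULL the
# two groups are assumed to have equal sizes.
ci_half_width <- function(n, sigma.hat = 1, conf.level = 0.95,
                          two.sample = FALSE, n2 = NULL) {
  p <- 1 - (1 - conf.level) / 2
  if (!two.sample) {
    return(qt(p, df = n - 1) * sigma.hat / sqrt(n))
  }
  if (is.null(n2)) n2 <- n
  qt(p, df = n + n2 - 2) * sigma.hat * sqrt(1 / n + 1 / n2)
}

# Find the n at which the half-width equals the requested value.
# uniroot() stops with an error if the target is not bracketed on [2, n.max].
solve_sample_size <- function(half.width, sigma.hat = 1, conf.level = 0.95,
                              two.sample = FALSE, n2 = NULL,
                              n.max = 5000, tol = 1e-7, maxiter = 1000) {
  f <- function(n) {
    ci_half_width(n, sigma.hat, conf.level, two.sample, n2) - half.width
  }
  uniroot(f, lower = 2, upper = n.max, tol = tol, maxiter = maxiter)$root
}

# One-sample case; compare with the unrounded value of about 63.9
# shown in the Examples section for half.width = 0.25.
n_one <- solve_sample_size(0.25)
n_one
ceiling(n_one)

# Two-sample case with equal group sizes (per-group sample size)
n_two <- solve_sample_size(0.5, two.sample = TRUE)
ceiling(n_two)
```

Because the half-width decreases as n grows, a single root exists whenever the requested half-width lies between the values attained at n = 2 and n = n.max.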

References

Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers. Second Edition. Lewis Publishers, Boca Raton, FL.

Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, NY.

Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources. Elsevier, New York, NY, Chapter 7.

Millard, S.P., and N. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL.

Ott, W.R. (1995). Environmental Statistics and Data Analysis. Lewis Publishers, Boca Raton, FL.

USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C. p.21-3.

Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, Chapters 7 and 8.

Examples

  # Look at how the required sample size for a one-sample 
  # confidence interval decreases with increasing half-width:

  seq(0.25, 1, by = 0.25) 
  #[1] 0.25 0.50 0.75 1.00 

  ciNormN(half.width = seq(0.25, 1, by = 0.25)) 
  #[1] 64 18 10 7 

  ciNormN(seq(0.25, 1, by = 0.25), round.up = FALSE) 
  #[1] 63.897899 17.832337  9.325967  6.352717

  #----------------------------------------------------------------

  # Look at how the required sample size for a one-sample 
  # confidence interval increases with increasing estimated 
  # standard deviation for a fixed half-width:

  seq(0.5, 2, by = 0.5) 
  #[1] 0.5 1.0 1.5 2.0 

  ciNormN(half.width = 0.5, sigma.hat = seq(0.5, 2, by = 0.5)) 
  #[1] 7 18 38 64

  #----------------------------------------------------------------

  # Look at how the required sample size for a one-sample 
  # confidence interval increases with increasing confidence 
  # level for a fixed half-width:

  seq(0.5, 0.9, by = 0.1) 
  #[1] 0.5 0.6 0.7 0.8 0.9 

  ciNormN(half.width = 0.25, conf.level = seq(0.5, 0.9, by = 0.1)) 
  #[1] 9 13 19 28 46

  #----------------------------------------------------------------

  # Modifying the example on pages 21-4 to 21-5 of USEPA (2009), 
  # determine the required sample size in order to achieve a 
  # half-width that is 10% of the observed mean (based on the first 
  # four months of observations) for the Aldicarb level at the first 
  # compliance well.  Assume a 95% confidence level and use the 
  # estimated standard deviation from the first four months of data. 
  # (The data are stored in EPA.09.Ex.21.1.aldicarb.df.) 
  #
  # The required sample size is 20, so almost two years of data are 
  # required assuming observations are taken once per month.

  EPA.09.Ex.21.1.aldicarb.df
  #   Month   Well Aldicarb.ppb
  #1      1 Well.1         19.9
  #2      2 Well.1         29.6
  #3      3 Well.1         18.7
  #4      4 Well.1         24.2
  #...

  mu.hat <- with(EPA.09.Ex.21.1.aldicarb.df, 
    mean(Aldicarb.ppb[Well=="Well.1"]))

  mu.hat 
  #[1] 23.1 

  sigma.hat <- with(EPA.09.Ex.21.1.aldicarb.df, 
    sd(Aldicarb.ppb[Well=="Well.1"]))

  sigma.hat 
  #[1] 4.93491 

  ciNormN(half.width = 0.1 * mu.hat, sigma.hat = sigma.hat) 
  #[1] 20

  #----------
  # Clean up
  rm(mu.hat, sigma.hat)
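
The Aldicarb result can be cross-checked in base R without the EnvStats package. The sketch below re-enters the four Well.1 observations printed above and assumes the one-sample half-width formula t(n - 1) * sigma.hat / sqrt(n); the names aldicarb_well1, target_half_width, and half_width_at are introduced here for illustration only.

```r
# Well.1 observations for months 1 to 4, copied from the listing above
aldicarb_well1 <- c(19.9, 29.6, 18.7, 24.2)

well1_mean <- mean(aldicarb_well1)  # 23.1
well1_sd   <- sd(aldicarb_well1)    # about 4.93491

# Target half-width: 10% of the observed mean
target_half_width <- 0.1 * well1_mean

# Assumed one-sample half-width of a 95% confidence interval at size n
half_width_at <- function(n) {
  qt(0.975, df = n - 1) * well1_sd / sqrt(n)
}

# n = 19 is not quite enough, while n = 20 meets the target
half_width_at(19) > target_half_width
#[1] TRUE
half_width_at(20) <= target_half_width
#[1] TRUE
```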
