ciNormHalfWidth: Half-Width of Confidence Interval for Normal Distribution Mean or Difference Between Two Means

Description

Compute the half-width of a confidence interval for the mean of a normal distribution or the difference between two means, given the sample size(s), estimated standard deviation, and confidence level.

Usage

ciNormHalfWidth(n.or.n1, n2 = n.or.n1, 
    sigma.hat = 1, conf.level = 0.95, 
    sample.type = ifelse(missing(n2), "one.sample", "two.sample"))

Arguments

n.or.n1

numeric vector of sample sizes. When sample.type="one.sample", this argument denotes $n$, the number of observations in the single sample. When sample.type="two.sample", this argument denotes $n_1$, the number of observations from group 1. Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.

numeric vector of sample sizes for group 2. The default value is the value of n.or.n1. This argument is ignored when sample.type="one.sample". Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.

sigma.hat

numeric vector specifying the value(s) of the estimated standard deviation(s).

conf.level

numeric vector of numbers between 0 and 1 indicating the confidence level associated with the confidence interval(s). The default value is conf.level=0.95.

sample.type

character string indicating whether this is a one-sample (sample.type="one.sample") or two-sample (sample.type="two.sample") confidence interval. When sample.type="one.sample", the computed half-width is based on a confidence interval for a single mean. When sample.type="two.sample", the computed half-width is based on a confidence interval for the difference between two means. The default value is sample.type="one.sample" unless the argument n2 is supplied.

Value

a numeric vector of half-widths.

Details

If the arguments n.or.n1, n2, sigma.hat, and conf.level are not all the same length, they are replicated to be the same length as the length of the longest argument.

One-Sample Case (sample.type="one.sample") Let $\underline{x} = x_1, x_2, \ldots, x_n$ denote a vector of $n$ observations from a normal distribution with mean $\mu$ and standard deviation $\sigma$. A two-sided $(1-\alpha)100\%$ confidence interval for $\mu$ is given by: $$[\hat{\mu} - t(n-1, 1-\alpha/2) \frac{\hat{\sigma}}{\sqrt{n}}, \, \hat{\mu} + t(n-1, 1-\alpha/2) \frac{\hat{\sigma}}{\sqrt{n}}] \;\;\;\;\;\; (1)$$ where $$\hat{\mu} = \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \;\;\;\;\;\; (2)$$ $$\hat{\sigma}^2 = s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2 \;\;\;\;\;\; (3)$$ and $t(\nu, p)$ is the $p$'th quantile of Student's t-distribution with $\nu$ degrees of freedom (Zar, 2010; Gilbert, 1987; Ott, 1995; Helsel and Hirsch, 1992). Thus, the half-width of this confidence interval is given by: $$HW = t(n-1, 1-\alpha/2) \frac{\hat{\sigma}}{\sqrt{n}} \;\;\;\;\;\; (4)$$

Two-Sample Case (sample.type="two.sample") Let $\underline{x}_1 = x_{11}, x_{12}, \ldots, x_{1n_1}$ denote a vector of $n_1$ observations from a normal distribution with mean $\mu_1$ and standard deviation $\sigma$, and let $\underline{x}_2 = x_{21}, x_{22}, \ldots, x_{2n_2}$ denote a vector of $n_2$ observations from a normal distribution with mean $\mu_2$ and standard deviation $\sigma$. A two-sided $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by: $$[(\hat{\mu}_1 - \hat{\mu}_2) - t(n_1 + n_2 - 2, 1-\alpha/2) \hat{\sigma} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \, (\hat{\mu}_1 - \hat{\mu}_2) + t(n_1 + n_2 - 2, 1-\alpha/2) \hat{\sigma} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}] \;\;\;\;\;\; (5)$$ where $$\hat{\mu}_1 = \bar{x}_1 = \frac{1}{n_1} \sum_{i=1}^{n_1} x_{1i} \;\;\;\;\;\; (6)$$ $$\hat{\mu}_2 = \bar{x}_2 = \frac{1}{n_2} \sum_{i=1}^{n_2} x_{2i} \;\;\;\;\;\; (7)$$ $$\hat{\sigma}^2 = s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \;\;\;\;\;\; (8)$$ $$s_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_{1i} - \bar{x}_1)^2 \;\;\;\;\;\; (9)$$ $$s_2^2 = \frac{1}{n_2 - 1} \sum_{i=1}^{n_2} (x_{2i} - \bar{x}_2)^2 \;\;\;\;\;\; (10)$$ (Zar, 2010, p.142; Helsel and Hirsch, 1992, p.135, Berthouex and Brown, 2002, pp.157--158). Thus, the half-width of this confidence interval is given by: $$HW = t(n_1 + n_2 - 2, 1-\alpha/2) \hat{\sigma} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\;\;\;\;\; (11)$$ Note that for the two-sample case, the function ciNormHalfWidth assumes the two populations have the same standard deviation.

References

Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers. Second Edition. Lewis Publishers, Boca Raton, FL.

Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, NY.

Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources Research. Elsevier, New York, NY, Chapter 7.

Millard, S.P., and N. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL.

Ott, W.R. (1995). Environmental Statistics and Data Analysis. Lewis Publishers, Boca Raton, FL.

USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C. p.21-3.

Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, Chapters 7 and 8.

Examples

Run this code

# NOT RUN {
  # Look at how the half-width of a one-sample confidence interval 
  # decreases with increasing sample size:

  seq(5, 30, by = 5) 
  #[1] 5 10 15 20 25 30 

  hw <- ciNormHalfWidth(n.or.n1 = seq(5, 30, by = 5)) 

  round(hw, 2) 
  #[1] 1.24 0.72 0.55 0.47 0.41 0.37

  #----------------------------------------------------------------

  # Look at how the half-width of a one-sample confidence interval 
  # increases with increasing estimated standard deviation:

  seq(0.5, 2, by = 0.5) 
  #[1] 0.5 1.0 1.5 2.0 

  hw <- ciNormHalfWidth(n.or.n1 = 20, sigma.hat = seq(0.5, 2, by = 0.5)) 

  round(hw, 2) 
  #[1] 0.23 0.47 0.70 0.94

  #----------------------------------------------------------------

  # Look at how the half-width of a one-sample confidence interval 
  # increases with increasing confidence level:

  seq(0.5, 0.9, by = 0.1) 
  #[1] 0.5 0.6 0.7 0.8 0.9 

  hw <- ciNormHalfWidth(n.or.n1 = 20, conf.level = seq(0.5, 0.9, by = 0.1)) 

  round(hw, 2) 
  #[1] 0.15 0.19 0.24 0.30 0.39

  #==========

  # Modifying the example on pages 21-4 to 21-5 of USEPA (2009), 
  # determine how adding another four months of observations to 
  # increase the sample size from 4 to 8 will affect the half-width 
  # of a two-sided 95% confidence interval for the Aldicarb level at 
  # the first compliance well.
  #  
  # Use the estimated standard deviation from the first four months 
  # of data.  (The data are stored in EPA.09.Ex.21.1.aldicarb.df.) 
  # Note that the half-width changes from 34% of the observed mean to 
  # 18% of the observed mean by increasing the sample size from 
  # 4 to 8.

  EPA.09.Ex.21.1.aldicarb.df
  #   Month   Well Aldicarb.ppb
  #1      1 Well.1         19.9
  #2      2 Well.1         29.6
  #3      3 Well.1         18.7
  #4      4 Well.1         24.2
  #...

  mu.hat <- with(EPA.09.Ex.21.1.aldicarb.df, 
    mean(Aldicarb.ppb[Well=="Well.1"]))

  mu.hat 
  #[1] 23.1 

  sigma.hat <- with(EPA.09.Ex.21.1.aldicarb.df, 
    sd(Aldicarb.ppb[Well=="Well.1"]))

  sigma.hat 
  #[1] 4.93491 

  hw.4 <- ciNormHalfWidth(n.or.n1 = 4, sigma.hat = sigma.hat) 

  hw.4 
  #[1] 7.852543 

  hw.8 <- ciNormHalfWidth(n.or.n1 = 8, sigma.hat = sigma.hat) 

  hw.8 
  #[1] 4.125688 

  100 * hw.4/mu.hat 
  #[1] 33.99369 

  100 * hw.8/mu.hat 
  #[1] 17.86012

  #==========

  # Clean up
  #---------
  rm(hw, mu.hat, sigma.hat, hw.4, hw.8)
# }

Run the code above in your browser using DataLab