Learn R Programming

EnvStats (version 2.1.0)

propTestPower: Compute the Power of a One- or Two-Sample Proportion Test

Description

Compute the power of a one- or two-sample proportion test, given the sample size(s), true proportion(s), and significance level.

Usage

propTestPower(n.or.n1, p.or.p1 = 0.5, n2 = n.or.n1, 
    p0.or.p2 = 0.5, alpha = 0.05, sample.type = "one.sample", 
    alternative = "two.sided", approx = TRUE, 
    correct = sample.type == "two.sample", warn = TRUE, 
    return.exact.list = TRUE)

Arguments

n.or.n1
numeric vector of sample sizes. When sample.type="one.sample", this argument denotes $n$, the number of observations in the single sample. When sample.type="two.sample", this argument denotes $n_1$, the number of ob
p.or.p1
numeric vector of proportions. When sample.type="one.sample", this argument denotes the true value of $p$, the probability of dQuote{success}. When sample.type="two.sample", this argument denotes the value of $p_1$, th
n2
numeric vector of sample sizes for group 2. The default value is n2=n.or.n1. This argument is ignored when sample.type="one.sample". Missing (NA), undefined (NaN), and infinite (Inf
p0.or.p2
numeric vector of proportions. When sample.type="one.sample", this argument denotes the hypothesized value of $p$, the probability of dQuote{success}. When sample.type="two.sample", this argument denotes the value of $p
alpha
numeric vector of numbers between 0 and 1 indicating the Type I error level associated with the hypothesis test. The default value is alpha=0.05.
sample.type
character string indicating whether to compute power based on a one-sample or two-sample hypothesis test. When sample.type="one.sample", the computed power is based on a hypothesis test for a single proportion. When sampl
alternative
character string indicating the kind of alternative hypothesis. The possible values are "two.sided" (the default), "less", and "greater".
approx
logical scalar indicating whether to compute the power based on the normal approximation to the binomial distribution. The default value is approx=TRUE. Currently, the exact method (approx=FALSE) is only available for t
correct
logical scalar indicating whether to use the continuity correction when approx=TRUE. The default value is approx=TRUE when sample.type="two.sample" and approx=FALSE when sample.type="one.sample
warn
logical scalar indicating whether to issue a warning. The default value is warn=TRUE. When approx=TRUE (power based on the normal approximation) and warn=TRUE, a warning is issued for cases when the normal app
return.exact.list
logical scalar relevant to the case when approx=FALSE (i.e., when the power is based on the exact test). This argument indicates whether to return a list containing extra information about the exact test in addition to the power

Value

  • By default, propTestPower returns a numeric vector of powers. For the one-sample proportion test (sample.type="one.sample"), when approx=FALSE and return.exact.list=TRUE, propTestPower returns a list with the following components:
  • powernumeric vector of powers.
  • alphanumeric vector containing the true significance levels. Because of the discrete nature of the binomial distribution, the true significance levels usually do not equal the significance level supplied by the user in the argument alpha.
  • q.critical.lowernumeric vector of lower critical values for rejecting the null hypothesis. If the observed number of "successes" is less than or equal to these values, the null hypothesis is rejected. (Not present if alternative="greater".)
  • q.critical.uppernumeric vector of upper critical values for rejecting the null hypothesis. If the observed number of "successes" is greater than these values, the null hypothesis is rejected. (Not present if alternative="less".)

Details

If the arguments n.or.n1, p.or.p1, n2, p0.or.p2, and alpha are not all the same length, they are replicated to be the same length as the length of the longest argument. The power is based on the difference p.or.p1 - p0.or.p2. One-Sample Case (sample.type="one.sample"). [object Object],[object Object] Two-Sample Case (sample.type="two.sample"). When sample.type="two.sample", power is computed based on the test that uses the normal approximation to the binomial distribution; see the help file for prop.test. The formula for this test and its associated power is presented in standard statistics texts, including Zar (2010, pp. 549-550, 552-553) and Millard and Neerchal (2001, pp. 443-445, 508-510).

References

Berthouex, P.M., and L.C. Brown. (1994). Statistics for Environmental Engineers. Lewis Publishers, Boca Raton, FL, Chapter 15. Casagrande, J.T., M.C. Pike, and P.G. Smith. (1978). An Improved Approximation Formula for Calculating Sample Sizes for Comparing Two Binomial Distributions. Biometrics 34, 483-486. Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions. Second Edition. John Wiley and Sons, New York, Chapters 1-2. Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, NY. Haseman, J.K. (1978). Exact Sample Sizes for Use with the Fisher-Irwin Test for 2x2 Tables. Biometrics 34, 106-109. Millard, S.P., and N. Neerchal. (2001). Environmental Statistics with S-Plus. CRC Press, Boca Raton, FL. Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.

See Also

propTestN, propTestMdd, plotPropTestDesign, prop.test, binom.test.

Examples

Run this code
# Look at how the power of the one-sample proportion test 
  # increases with increasing sample size:

  seq(20, 50, by=10) 
  #[1] 20 30 40 50 

  power <- propTestPower(n.or.n1 = seq(20, 50, by=10), 
    p.or.p1 = 0.7, p0 = 0.5) 

  round(power, 2) 
  #[1] 0.43 0.60 0.73 0.83

  #----------

  # Repeat the last example, but compute the power based on 
  # the exact test instead of the approximation. 
  # Note that the significance level varies with sample size and 
  # never attains the requested level of 0.05.

  prop.test.list <- propTestPower(n.or.n1 = seq(20, 50, by=10), 
    p.or.p1 = 0.7, p0 = 0.5, approx=FALSE) 

  lapply(prop.test.list, round, 2) 
  #$power: 
  #[1] 0.42 0.59 0.70 0.78 
  #
  #$alpha: 
  #[1] 0.04 0.04 0.04 0.03 
  #
  #$q.critical.lower: 
  #[1] 5 9 13 17 
  #
  #$q.critical.upper: 
  #[1] 14 20 26 32

  #==========

  # Look at how the power of the two-sample proportion test 
  # increases with increasing difference between the two 
  # population proportions:

  seq(0.5, 0.1, by=-0.1) 
  #[1] 0.5 0.4 0.3 0.2 0.1 

  power <- propTestPower(30, sample.type = "two", 
    p.or.p1 = seq(0.5, 0.1, by=-0.1)) 
  #Warning message:
  #In propTestPower(30, sample.type = "two", p.or.p1 = seq(0.5, 0.1,  :
  #The sample sizes 'n1' and 'n2' are too small, relative to the given 
  # values of 'p1' and 'p2', for the normal approximation to work well 
  # for the following element indices:
  #         5 

  round(power, 2) 
  #[1] 0.05 0.08 0.26 0.59 0.90

  #----------

  # Look at how the power of the two-sample proportion test 
  # increases with increasing values of Type I error:

  power <- propTestPower(30, sample.type = "two", 
    p.or.p1 = 0.7, 
    alpha = c(0.001, 0.01, 0.05, 0.1)) 

  round(power, 2) 
  #[1] 0.02 0.10 0.26 0.37

  #==========

  # Clean up
  #---------
  rm(power, prop.test.list)

  #==========

  # Modifying the example on pages 8-5 to 8-7 of USEPA (1989b), 
  # determine how adding another 20 observations to the background 
  # well to increase the sample size from 24 to 44 will affect the 
  # power of detecting a difference in the proportion of detects of 
  # cadmium between the background and compliance wells.  Set the 
  # compliance well to "group 1" and set the background well to 
  # "group 2".  Assume the true probability of a "detect" at the 
  # background well is 1/3, set the probability of a "detect" at the 
  # compliance well to 0.4, use a 5% significance level, and use the 
  # upper one-sided alternative (probability of a "detect" at the 
  # compliance well is greater than the probability of a "detect" at 
  # the background well). 
  # (The original data are stored in EPA.89b.cadmium.df.) 
  #
  # Note that the power does increase (from 9% to 12%), but is relatively 
  # very small.

  EPA.89b.cadmium.df
  #   Cadmium.orig Cadmium Censored  Well.type
  #1           0.1   0.100    FALSE Background
  #2          0.12   0.120    FALSE Background
  #3           BDL   0.000     TRUE Background
  # ..........................................
  #86          BDL   0.000     TRUE Compliance
  #87          BDL   0.000     TRUE Compliance
  #88          BDL   0.000     TRUE Compliance


  p.hat.back <- with(EPA.89b.cadmium.df, 
    mean(!Censored[Well.type=="Background"])) 

  p.hat.back 
  #[1] 0.3333333 

  p.hat.comp <- with(EPA.89b.cadmium.df, 
    mean(!Censored[Well.type=="Compliance"])) 

  p.hat.comp 
  #[1] 0.375 

  n.back <- with(EPA.89b.cadmium.df, 
    sum(Well.type == "Background")) 

  n.back 
  #[1] 24 

  n.comp <- with(EPA.89b.cadmium.df, 
    sum(Well.type == "Compliance")) 

  n.comp 
  #[1] 64 

  propTestPower(n.or.n1 = n.comp, 
    p.or.p1 = 0.4, 
    n2 = c(n.back, 44), p0.or.p2 = p.hat.back, 
    alt="greater", sample.type="two") 
  #[1] 0.08953013 0.12421135

  #----------

  # Clean up
  #---------
  rm(p.hat.back, p.hat.comp, n.back, n.comp)

Run the code above in your browser using DataLab