
binomSamSize (version 0.1-3)

ciss.binom: General purpose sample size calculation based on confidence interval widths

Description

Calculate the necessary sample size for estimating a binomial proportion when the confidence interval is computed by an arbitrary binom.confint-like function.

Usage

ciss.binom(p0, d, alpha=0.05, ci.fun=binom.confint, np02x = function(n, p0) round(n*p0), verbose=FALSE, nStart=1,nMax=1e6,...)

Arguments

p0
hypothesized value of the binomial proportion $p$. This is an upper bound if p0 is below 1/2, and a lower bound if p0 is above 1/2.
d
half width of the confidence interval. Note: The CI is not necessarily symmetric about the estimate, so we just look at its width as determined by $d = \frac{1}{2}(CI_{upper} - CI_{lower})$.
alpha
a two-sided $(1-\alpha)\cdot 100\%$ confidence interval is computed
ci.fun
Any binom.confint-like function that computes a confidence interval. The default is the binom.confint function itself. In this case the appropriate method has to be specified using the method argument of the binom.confint function.
np02x
A function specifying how to calculate the value of $x$ which results in an estimator of the proportion being as close as possible to the anticipated value $p_0$. Typically the value is obtained by rounding the result of $n \cdot p_0$.
verbose
If TRUE, additional output of the computations is shown. The default is FALSE.
nStart
Sample size at which to start the search. The default nStart=1 can sometimes lead to wrong answers, e.g. for the Wald-type interval.
nMax
Maximum value of the sample size $n$ to try in the iterative search. See Details.
...
Additional arguments passed to the ci.fun function.

Value

the necessary sample size n

Details

Given a pre-set $\alpha$-level and an anticipated value of $p$, say $p_0$, the objective is to find the minimum sample size $n$ such that the resulting confidence interval has length $2\cdot d$.

Using ciss.binom, this is done in a general-purpose way by an iterative search for the sample size. Starting from $n=nStart$, the appropriate $x$ value is found, computed by default as round(n*p0). For this integer $x$ and the current $n$, the corresponding confidence interval is computed using the function ci.fun. This function has to deliver the same type of result as the binom.confint function, i.e. a data frame with columns lower and upper containing the limits of the confidence interval.
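As an illustration of the result shape required from ci.fun, here is a minimal base R sketch of a binom.confint-like function. The name wald.ci is hypothetical and not part of binomSamSize or binom, and the sketch assumes the function is called with the arguments x, n and conf.level, as binom.confint is.

```r
# Hypothetical helper (not part of binomSamSize or binom): a Wald-type
# interval returned as a data frame with columns lower and upper.
wald.ci <- function(x, n, conf.level = 0.95, ...) {
  phat <- x / n                              # point estimate of p
  z <- qnorm(1 - (1 - conf.level) / 2)       # two-sided normal quantile
  half <- z * sqrt(phat * (1 - phat) / n)    # Wald half width
  data.frame(x = x, n = n, mean = phat,
             lower = phat - half, upper = phat + half)
}

ci <- wald.ci(x = 50, n = 100, conf.level = 0.95)

# Half width as defined for the argument d
half.width <- 0.5 * (ci$upper - ci$lower)
print(half.width)
```

Any function returning lower and upper in this way could, under the assumption above, be supplied through the ci.fun argument.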

The sample size is increased iteratively until the obtained confidence interval has a length smaller than $2\cdot d$. This might take a while if $n$ is large, because a brute force search is used within the function. The search can be sped up by providing an appropriate nStart. Note that for many of the confidence intervals explicit expressions exist to calculate the necessary sample size.
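The search described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package source: search.n and wald.ci are hypothetical names, the interval function is a hand-written Wald interval, and the stopping rule is taken to be a half width strictly smaller than d. For comparison, the textbook closed-form Wald sample size $n = \lceil z_{1-\alpha/2}^2 \, p_0 (1-p_0) / d^2 \rceil$ is computed alongside.

```r
# Hypothetical Wald-type interval with columns lower and upper
wald.ci <- function(x, n, conf.level = 0.95, ...) {
  phat <- x / n
  half <- qnorm(1 - (1 - conf.level) / 2) * sqrt(phat * (1 - phat) / n)
  data.frame(lower = phat - half, upper = phat + half)
}

# Sketch of a brute force search (an assumption about the procedure,
# not the actual ciss.binom implementation)
search.n <- function(p0, d, alpha = 0.05, ci.fun, nStart = 1, nMax = 1e6) {
  n <- nStart
  while (n <= nMax) {
    x <- round(n * p0)                           # default np02x rule
    ci <- ci.fun(x = x, n = n, conf.level = 1 - alpha)
    if (0.5 * (ci$upper - ci$lower) < d) return(n)
    n <- n + 1
  }
  NA                                             # nothing found up to nMax
}

# Brute force search, started at n=2
n.search <- search.n(p0 = 0.5, d = 0.1, alpha = 0.05,
                     ci.fun = wald.ci, nStart = 2)

# Textbook closed-form Wald sample size for the same setting
n.closed <- ceiling(qnorm(1 - 0.05 / 2)^2 * 0.5 * (1 - 0.5) / 0.1^2)

# Starting at n=1 stops immediately: x=round(0.5)=0 gives a zero-width interval
n.bad <- search.n(p0 = 0.5, d = 0.1, alpha = 0.05,
                  ci.fun = wald.ci, nStart = 1)

print(c(n.search, n.closed, n.bad))
```

In this sketch, starting at nStart=1 ends the search at once because the Wald interval for x=0 collapses to a single point, which is the kind of wrong answer the nStart argument is meant to avoid.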

See Also

binom.confint and its related functions

Examples

library(binomSamSize)

#Compute the sample size for the classical Wald-type interval using brute force search.
#Note that nStart=2 needs to be specified, because the Wald interval
#for n=1, x=round(1*0.5)=0 is too short.
ciss.binom(p0=1/2, d=0.1, alpha=0.05, method="asymptotic",nStart=2)
#This could of course be done more easily
ciss.wald(p0=1/2, d=0.1, alpha=0.05)

#Same for the Wilson intervals
ciss.binom(p0=1/2, d=0.1, alpha=0.05, method="wilson")
ciss.wilson(p0=1/2, d=0.1, alpha=0.05)

#Now the mid-p intervals 
ciss.binom(p0=1/2, d=0.1, alpha=0.05, ci.fun=binom.midp)
#The search in Fosgate (2005) is a bit different, because interest
#is not directly in the length, but the length is used to derive
#the upper and lower limits and then a search is performed until
#the required alpha level is reached. The difference is negligible.
ciss.midp(p0=1/2, d=0.1, alpha=0.05)

#Another situation where no closed formula exists
ciss.binom(p0=1/2, d=0.1, alpha=0.05, method="lrt")

#Pooled samples. Now np02x is a function taking three arguments.
#The k argument is provided as an additional argument
np02x <- function(n,p0,k) round( (1-(1-p0)^k)*n )
ciss.binom( p0=0.1, d=0.05, alpha=0.05, ci.fun=poolbinom.lrt,
            np02x=np02x, k=10,verbose=TRUE)
