Learn R Programming

misty (version 0.7.0)

ci.prop.diff: Confidence Interval for the Difference in Proportions

Description

This function computes a confidence interval for the difference in proportions in a two-sample and paired-sample design for one or more variables, optionally by a grouping and/or split variable.

Usage

ci.prop.diff(x, ...)

# S3 method for default ci.prop.diff(x, y, method = c("wald", "newcombe"), paired = FALSE, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, group = NULL, split = NULL, sort.var = FALSE, digits = 2, as.na = NULL, write = NULL, append = TRUE, check = TRUE, output = TRUE, ...)

# S3 method for formula ci.prop.diff(formula, data, method = c("wald", "newcombe"), alternative = c("two.sided", "less", "greater"), conf.level = 0.95, group = NULL, split = NULL, sort.var = FALSE, na.omit = FALSE, digits = 2, as.na = NULL, write = NULL, append = TRUE, check = TRUE, output = TRUE, ...)

Value

Returns an object of class misty.object, which is a list with following entries:

call

function call

type

type of analysis

data

list with the input specified in x, group, and split

args

specification of function arguments

result

result table

Arguments

x

a numeric vector with 0 and 1 values.

...

further arguments to be passed to or from methods.

y

a numeric vector with 0 and 1 values.

method

a character string specifying the method for computing the confidence interval, must be one of "wald", or "newcombe" (default).

paired

logical: if TRUE, confidence interval for the difference of proportions in paired samples is computed.

alternative

a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less".

conf.level

a numeric value between 0 and 1 indicating the confidence level of the interval.

group

a numeric vector, character vector or factor as grouping variable. Note that a grouping variable can only be used when computing confidence intervals with unknown population standard deviation and population variance.

split

a numeric vector, character vector or factor as split variable. Note that a split variable can only be used when computing confidence intervals with unknown population standard deviation and population variance.

sort.var

logical: if TRUE, output table is sorted by variables when specifying group.

digits

an integer value indicating the number of decimal places to be used.

as.na

a numeric vector indicating user-defined missing values, i.e. these values are converted to NA before conducting the analysis. Note that as.na() function is only applied to x, but not to group or split.

write

a character string naming a text file with file extension ".txt" (e.g., "Output.txt") for writing the output into a text file.

append

logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write, if FALSE existing text file will be overwritten.

check

logical: if TRUE (default), argument specification is checked.

output

logical: if TRUE (default), output is shown on the console.

formula

a formula of the form y ~ group for one outcome variable or cbind(y1, y2, y3) ~ group for more than one outcome variable where y is a numeric variable with 0 and 1 values and group a numeric variable, character variable or factor with two values or factor levels giving the corresponding group.

data

a matrix or data frame containing the variables in the formula formula.

na.omit

logical: if TRUE, incomplete cases are removed before conducting the analysis (i.e., listwise deletion) when specifying more than one outcome variable.

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

Details

The Wald confidence interval which is based on the normal approximation to the binomial distribution are computed by specifying method = "wald", while the Newcombe Hybrid Score interval (Newcombe, 1998a; Newcombe, 1998b) is requested by specifying method = "newcombe". By default, Newcombe Hybrid Score interval is computed which have been shown to be reliable in small samples (less than n = 30 in each sample) as well as moderate to larger samples(n > 30 in each sample) and with proportions close to 0 or 1, while the Wald confidence intervals does not perform well unless the sample size is large (Fagerland, Lydersen & Laake, 2011).

References

Fagerland, M. W., Lydersen S., & Laake, P. (2011) Recommended confidence intervals for two independent binomial proportions. Statistical Methods in Medical Research, 24, 224-254.

Newcombe, R. G. (1998a). Interval estimation for the difference between independent proportions: Comparison of eleven methods. Statistics in Medicine, 17, 873-890.

Newcombe, R. G. (1998b). Improved confidence intervals for the difference between binomial proportions based on paired data. Statistics in Medicine, 17, 2635-2650.

Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in psychology - Using R and SPSS. John Wiley & Sons.

See Also

ci.prop, ci.mean, ci.mean.diff, ci.median, ci.var, ci.sd, descript

Examples

Run this code
#----------------------------------------------------------------------------
# Two-sample design

# Example 1a: Two-Sided 95% CI for 'vs' by 'am'
# Newcombes Hybrid Score interval
ci.prop.diff(vs ~ am, data = mtcars)

# Example 1b: Two-Sided 95% CI for 'vs' by 'am'
# Wald CI
ci.prop.diff(vs ~ am, data = mtcars, method = "wald")

# Example 1c: Two-Sided 95% CI for the difference in proportions
# Newcombes Hybrid Score interval
ci.prop.diff(c(0, 1, 1, 0, 0, 1, 0, 1), c(1, 1, 1, 0, 0))

#----------------------------------------------------------------------------
# Paired-sample design

dat.p <- data.frame(pre = c(0, 1, 1, 0, 1), post = c(1, 1, 0, 1, 1))

# Example 2a: Two-Sided 95% CI for the difference in proportions 'pre' and 'post'
# Newcombes Hybrid Score interval
ci.prop.diff(dat.p$pre, dat.p$post, paired = TRUE)

# Example 2b: Two-Sided 95% CI for the difference in proportions 'pre' and 'post'
# Wald CI
ci.prop.diff(dat.p$pre, dat.p$post, method = "wald", paired = TRUE)

Run the code above in your browser using DataLab