misty (version 0.6.7)

ci.prop.diff: Confidence Interval for the Difference in Proportions

Description

This function computes a confidence interval for the difference in proportions in a two-sample or paired-sample design for one or more variables, optionally by a grouping and/or split variable.

Usage

ci.prop.diff(x, ...)

# S3 method for default
ci.prop.diff(x, y, method = c("wald", "newcombe"), paired = FALSE,
             alternative = c("two.sided", "less", "greater"),
             conf.level = 0.95, group = NULL, split = NULL, sort.var = FALSE,
             digits = 2, as.na = NULL, write = NULL, append = TRUE,
             check = TRUE, output = TRUE, ...)

# S3 method for formula
ci.prop.diff(formula, data, method = c("wald", "newcombe"),
             alternative = c("two.sided", "less", "greater"),
             conf.level = 0.95, group = NULL, split = NULL, sort.var = FALSE,
             na.omit = FALSE, digits = 2, as.na = NULL, write = NULL,
             append = TRUE, check = TRUE, output = TRUE, ...)

Value

Returns an object of class misty.object, which is a list with the following entries:

call

function call

type

type of analysis

data

list with the input specified in x, group, and split

args

specification of function arguments

result

result table
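
The entries listed above can be inspected like any other list element. The following is a minimal sketch, assuming the misty package is installed; the data frame toy and its columns grp and out are made up for illustration and are not part of the package.

```r
# Sketch only: requires the misty package; 'toy', 'grp', and 'out' are
# hypothetical names used for illustration.
if (requireNamespace("misty", quietly = TRUE)) {
  toy <- data.frame(grp = rep(1:2, each = 6),
                    out = c(0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1))

  # output = FALSE suppresses console output; the object is still returned
  res <- misty::ci.prop.diff(out ~ grp, data = toy, output = FALSE)

  print(class(res))   # class of the returned object
  print(names(res))   # entries of the returned list
  print(res$result)   # result table
}
```

Storing the object with output = FALSE is convenient when the result table is processed further instead of being read from the console.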

Arguments

x

a numeric vector with 0 and 1 values.

...

further arguments to be passed to or from methods.

y

a numeric vector with 0 and 1 values.

method

a character string specifying the method for computing the confidence interval, must be one of "wald" or "newcombe" (default).

paired

logical: if TRUE, the confidence interval for the difference in proportions in paired samples is computed.

alternative

a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less".

conf.level

a numeric value between 0 and 1 indicating the confidence level of the interval.

group

a numeric vector, character vector or factor as grouping variable. Note that a grouping variable can only be used when computing confidence intervals with unknown population standard deviation and population variance.

split

a numeric vector, character vector or factor as split variable. Note that a split variable can only be used when computing confidence intervals with unknown population standard deviation and population variance.

sort.var

logical: if TRUE, output table is sorted by variables when specifying group.

digits

an integer value indicating the number of decimal places to be used.

as.na

a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis. Note that the as.na() function is only applied to x, but not to group or split.

write

a character string naming a text file with file extension ".txt" (e.g., "Output.txt") for writing the output into a text file.

append

logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.

check

logical: if TRUE (default), argument specification is checked.

output

logical: if TRUE (default), output is shown on the console.

formula

a formula of the form y ~ group for one outcome variable or cbind(y1, y2, y3) ~ group for more than one outcome variable where y is a numeric variable with 0 and 1 values and group a numeric variable, character variable or factor with two values or factor levels giving the corresponding group.

data

a matrix or data frame containing the variables in the formula formula.

na.omit

logical: if TRUE, incomplete cases are removed before conducting the analysis (i.e., listwise deletion) when specifying more than one outcome variable.

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

Details

The Wald confidence interval, which is based on the normal approximation to the binomial distribution, is computed by specifying method = "wald", while the Newcombe Hybrid Score interval (Newcombe, 1998a; Newcombe, 1998b) is requested by specifying method = "newcombe". By default, the Newcombe Hybrid Score interval is computed, which has been shown to be reliable in small samples (fewer than n = 30 in each sample) as well as in moderate to large samples (n > 30 in each sample) and with proportions close to 0 or 1, while the Wald confidence interval does not perform well unless the sample size is large (Fagerland, Lydersen & Laake, 2011).
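
To make the difference between the two methods concrete, the following base R sketch computes both intervals by hand for two independent samples. It is an illustration under stated assumptions, not the package's implementation: the helper names wilson and prop_diff_ci are hypothetical, the difference is taken as first sample minus second sample (the package's sign convention may differ), no continuity correction is applied, and the paired-sample variant (Newcombe, 1998b) is not covered.

```r
# Wilson score interval for a single proportion
# (hypothetical helper, not part of misty)
wilson <- function(p, n, z) {
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(low = centre - half, upp = centre + half)
}

# Two-sided Wald and Newcombe hybrid score intervals for p1 - p2
# (hypothetical helper; sign convention is an assumption)
prop_diff_ci <- function(x, y, conf.level = 0.95) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  n1 <- length(x)
  n2 <- length(y)
  p1 <- mean(x)
  p2 <- mean(y)
  d  <- p1 - p2
  z  <- qnorm(1 - (1 - conf.level) / 2)

  # Wald: normal approximation with the estimated standard error
  se   <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
  wald <- c(low = d - z * se, upp = d + z * se)

  # Newcombe: combine the Wilson limits of the two single proportions
  w1 <- wilson(p1, n1, z)
  w2 <- wilson(p2, n2, z)
  newcombe <- c(
    low = d - sqrt((p1 - w1[["low"]])^2 + (w2[["upp"]] - p2)^2),
    upp = d + sqrt((w1[["upp"]] - p1)^2 + (p2 - w2[["low"]])^2)
  )

  list(diff = d, wald = wald, newcombe = newcombe)
}

x <- c(0, 1, 1, 0, 0, 1, 0, 1)
y <- c(1, 1, 1, 0, 0)
ci <- prop_diff_ci(x, y)
print(ci)
```

With samples this small the Wald interval is symmetric around the observed difference, whereas the hybrid score interval is generally asymmetric and always stays within the range from -1 to 1.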

References

Fagerland, M. W., Lydersen, S., & Laake, P. (2011). Recommended confidence intervals for two independent binomial proportions. Statistical Methods in Medical Research, 24, 224-254.

Newcombe, R. G. (1998a). Interval estimation for the difference between independent proportions: Comparison of eleven methods. Statistics in Medicine, 17, 873-890.

Newcombe, R. G. (1998b). Improved confidence intervals for the difference between binomial proportions based on paired data. Statistics in Medicine, 17, 2635-2650.

Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in psychology - Using R and SPSS. John Wiley & Sons.

See Also

ci.prop, ci.mean, ci.mean.diff, ci.median, ci.var, ci.sd, descript

Examples

dat1 <- data.frame(group1 = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
                              1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2),
                   group2 = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2,
                              1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2),
                   group3 = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
                              1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
                   x1 = c(0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, NA, 0, 0,
                          1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0),
                   x2 = c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1,
                          1, 0, 1, 0, 1, 1, 1, NA, 1, 0, 0, 1, 1, 1),
                   x3 = c(1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0,
                          1, 0, 1, 1, 0, 1, 1, 1, 0, 1, NA, 1, 0, 1))

#-------------------------------------------------------------------------------
# Two-sample design

# Example 1: Two-Sided 95% CI for x1 by group1
# Newcombe Hybrid Score interval
ci.prop.diff(x1 ~ group1, data = dat1)

# Example 2: Two-Sided 95% CI for x1 by group1
# Wald CI
ci.prop.diff(x1 ~ group1, data = dat1, method = "wald")

# Example 3: One-Sided 95% CI for x1 by group1
# Newcombe Hybrid Score interval
ci.prop.diff(x1 ~ group1, data = dat1, alternative = "less")

# Example 4: Two-Sided 99% CI for x1 by group1
# Newcombe Hybrid Score interval
ci.prop.diff(x1 ~ group1, data = dat1, conf.level = 0.99)

# Example 5: Two-Sided 95% CI for x1 by group1
# Newcombe Hybrid Score interval, print results with 3 digits
ci.prop.diff(x1 ~ group1, data = dat1, digits = 3)

# Example 6: Two-Sided 95% CI for x1 by group1
# Newcombe Hybrid Score interval, convert value 0 to NA
ci.prop.diff(x1 ~ group1, data = dat1, as.na = 0)

# Example 7: Two-Sided 95% CI for x1, x2, and x3 by group1
# Newcombe Hybrid Score interval
ci.prop.diff(cbind(x1, x2, x3) ~ group1, data = dat1)

# Example 8: Two-Sided 95% CI for x1, x2, and x3 by group1
# Newcombe Hybrid Score interval, listwise deletion for missing data
ci.prop.diff(cbind(x1, x2, x3) ~ group1, data = dat1, na.omit = TRUE)

# Example 9: Two-Sided 95% CI for x1, x2, and x3 by group1
# Newcombe Hybrid Score interval, analysis by group2 separately
ci.prop.diff(cbind(x1, x2, x3) ~ group1, data = dat1, group = dat1$group2)

# Example 10: Two-Sided 95% CI for x1, x2, and x3 by group1
# Newcombe Hybrid Score interval, analysis by group2 separately, sort by variables
ci.prop.diff(cbind(x1, x2, x3) ~ group1, data = dat1, group = dat1$group2,
             sort.var = TRUE)

# Example 11: Two-Sided 95% CI for x1, x2, and x3 by group1
# Newcombe Hybrid Score interval, split analysis by group2
ci.prop.diff(cbind(x1, x2, x3) ~ group1, data = dat1, split = dat1$group2)

# Example 12: Two-Sided 95% CI for x1, x2, and x3 by group1
# Newcombe Hybrid Score interval, analysis by group2 separately, split analysis by group3
ci.prop.diff(cbind(x1, x2, x3) ~ group1, data = dat1,
             group = dat1$group2, split = dat1$group3)

#-----------------

group1 <- c(0, 1, 1, 0, 0, 1, 0, 1)
group2 <- c(1, 1, 1, 0, 0)

# Example 13: Two-Sided 95% CI for the difference in proportions between group1 and group2
# Newcombe Hybrid Score interval
ci.prop.diff(group1, group2)

#-------------------------------------------------------------------------------
# Paired-sample design

dat2 <- data.frame(pre = c(0, 1, 1, 0, 1),
                   post = c(1, 1, 0, 1, 1))

# Example 14: Two-Sided 95% CI for the difference in proportions between pre and post
# Newcombe Hybrid Score interval
ci.prop.diff(dat2$pre, dat2$post, paired = TRUE)

# Example 15: Two-Sided 95% CI for the difference in proportions between pre and post
# Wald CI
ci.prop.diff(dat2$pre, dat2$post, method = "wald", paired = TRUE)

# Example 16: One-Sided 95% CI for the difference in proportions between pre and post
# Newcombe Hybrid Score interval
ci.prop.diff(dat2$pre, dat2$post, alternative = "less", paired = TRUE)

# Example 17: Two-Sided 99% CI for the difference in proportions between pre and post
# Newcombe Hybrid Score interval
ci.prop.diff(dat2$pre, dat2$post, conf.level = 0.99, paired = TRUE)

# Example 18: Two-Sided 95% CI for the difference in proportions between pre and post
# Newcombe Hybrid Score interval, print results with 3 digits
ci.prop.diff(dat2$pre, dat2$post, paired = TRUE, digits = 3)
