aov.b: Between-Subject Analysis of Variance

Description

This function performs a one-way between-subject analysis of variance (ANOVA), including Tukey HSD post hoc tests for multiple comparisons, and provides descriptive statistics, effect size measures, and a plot showing bars representing the mean of each group and error bars for difference-adjusted confidence intervals.

Usage

aov.b(formula, data, posthoc = FALSE, conf.level = 0.95, hypo = TRUE,
      descript = TRUE, effsize = FALSE, weighted = FALSE, correct = FALSE,
      digits = 2, p.digits = 3, as.na = NULL, plot = FALSE, bar = TRUE,
      point = FALSE, ci = TRUE, jitter = FALSE, adjust = TRUE,
      point.size = 3, errorbar.width = 0.1, jitter.size = 1.25,
      jitter.width = 0.05, jitter.height = 0, jitter.alpha = 0.1,
      xlab = NULL, ylab = "y", ylim = NULL, ybreaks = ggplot2::waiver(),
      title = NULL, subtitle = "Confidence Interval", filename = NULL,
      width = NA, height = NA, units = c("in", "cm", "mm", "px"), dpi = 600,
      write = NULL, append = TRUE, check = TRUE, output = TRUE)

Value

Returns an object of class misty.object, which is a list with the following entries:

call: function call
type: type of analysis
data: data frame with variables used in the current analysis
formula: formula of the current analysis
args: specification of function arguments
plot: ggplot2 object for plotting the results
result: list with result tables, i.e., descript for descriptive statistics, test for the ANOVA table, posthoc for post hoc tests, and aov for the return object of the aov function

Arguments

formula: a formula of the form y ~ group, where y is a numeric variable giving the data values and group is a numeric variable, character variable, or factor with more than two values or factor levels giving the corresponding groups.
data: a matrix or data frame containing the variables specified in the argument formula.
posthoc: logical: if TRUE, Tukey HSD post hoc tests for multiple comparisons are conducted.
conf.level: a numeric value between 0 and 1 indicating the confidence level of the interval.
hypo: logical: if TRUE (default), null and alternative hypotheses are shown on the console.
descript: logical: if TRUE (default), descriptive statistics are shown on the console.
effsize: logical: if TRUE, effect size measures $\eta^2$ and $\omega^2$ for the ANOVA and Cohen's d for the post hoc tests are shown on the console.
weighted: logical: if TRUE, the weighted pooled standard deviation is used to compute Cohen's d.
correct: logical: if TRUE, correction factor to remove positive bias in small samples is used.
digits: an integer value indicating the number of decimal places to be used for displaying descriptive statistics and confidence intervals.
p.digits: an integer value indicating the number of decimal places to be used for displaying the p-value.
as.na: a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis.
plot: logical: if TRUE, a plot showing error bars for confidence intervals is drawn.
bar: logical: if TRUE (default), bars representing the mean of each group are drawn.
point: logical: if TRUE, points representing the mean of each group are drawn.
ci: logical: if TRUE (default), error bars representing confidence intervals are drawn.
jitter: logical: if TRUE, jittered data points are drawn.
adjust: logical: if TRUE (default), difference-adjustment for the confidence intervals is applied.
point.size: a numeric value indicating the size aesthetic for the point representing the mean value.
errorbar.width: a numeric value indicating the horizontal bar width of the error bar.
jitter.size: a numeric value indicating the size aesthetic for the jittered data points.
jitter.width: a numeric value indicating the amount of horizontal jitter.
jitter.height: a numeric value indicating the amount of vertical jitter.
jitter.alpha: a numeric value between 0 and 1 specifying the alpha argument in the geom_jitter function, which controls the opacity of the jittered data points.
xlab: a character string specifying the labels for the x-axis.
ylab: a character string specifying the labels for the y-axis.
ylim: a numeric vector of length two specifying the limits of the y-axis.
ybreaks: a numeric vector specifying the points at which tick marks are drawn on the y-axis.
title: a character string specifying the text for the title of the plot.
subtitle: a character string specifying the text for the subtitle of the plot.
filename: a character string indicating the filename argument, including the file extension, in the ggsave function. One of ".eps", ".ps", ".tex", ".pdf" (default), ".jpeg", ".tiff", ".png", ".bmp", ".svg" or ".wmf" needs to be specified as the file extension. Note that plots can only be saved when plot = TRUE.
width: a numeric value indicating the width argument (default is the size of the current graphics device) in the ggsave function.
height: a numeric value indicating the height argument (default is the size of the current graphics device) in the ggsave function.
units: a character string indicating the units argument (default is "in") in the ggsave function.
dpi: a numeric value indicating the dpi argument (default is 600) in the ggsave function.
write: a character string naming a text file with file extension ".txt" (e.g., "Output.txt") for writing the output into a text file.
append: logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.
check: logical: if TRUE (default), argument specification is checked.
output: logical: if TRUE (default), output is shown on the console.
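
The effect size measures named under effsize can be illustrated with a short base R sketch. It assumes the standard one-way definitions $\eta^2 = SS_{between} / SS_{total}$ and $\omega^2 = (SS_{between} - df_{between} \cdot MS_{within}) / (SS_{total} + MS_{within})$; the object names below are hypothetical, and aov.b() may compute or round these values differently.

```r
# Sketch: eta squared and omega squared from a base R one-way ANOVA table.
# Assumption: standard textbook definitions; not taken from the aov.b() source.
fit <- aov(mpg ~ factor(gear), data = mtcars)  # gear must be a factor in base R
tab <- summary(fit)[[1]]                       # ANOVA table as a data frame

ss.b <- tab[1, "Sum Sq"]   # between-group sum of squares
ss.w <- tab[2, "Sum Sq"]   # within-group (residual) sum of squares
df.b <- tab[1, "Df"]       # between-group degrees of freedom
ms.w <- tab[2, "Mean Sq"]  # within-group mean square
ss.t <- ss.b + ss.w        # total sum of squares

eta.sq   <- ss.b / ss.t
omega.sq <- (ss.b - df.b * ms.w) / (ss.t + ms.w)

c(eta.sq = eta.sq, omega.sq = omega.sq)
```

Because $\omega^2$ subtracts the expected error contribution from the numerator, it is always smaller than $\eta^2$ for the same data.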

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

Details

Post Hoc Test

When an effect size measure is requested (i.e., effsize = TRUE), the Tukey HSD post hoc test reports Cohen's d based on the non-weighted standard deviation (i.e., weighted = FALSE), following the recommendation by Delacre et al. (2021).
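
A rough base R sketch of these two quantities is given here for orientation. It assumes that the non-weighted standard deviation is the square root of the plain average of the two group variances; the helper cohens.d.unweighted is hypothetical and is not part of the misty package, so results from aov.b() may differ in detail.

```r
# Sketch: Tukey HSD comparisons and Cohen's d with a non-weighted SD in base R.
# Assumption: non-weighted SD = sqrt((var1 + var2) / 2); helper name is hypothetical.
dat <- data.frame(y = mtcars$mpg, group = factor(mtcars$gear))

# Tukey HSD post hoc comparisons from a one-way ANOVA fit
fit   <- aov(y ~ group, data = dat)
tukey <- TukeyHSD(fit, conf.level = 0.95)$group

# Cohen's d: mean difference divided by the non-weighted standard deviation
cohens.d.unweighted <- function(x1, x2) {
  (mean(x2) - mean(x1)) / sqrt((var(x1) + var(x2)) / 2)
}

# All pairwise comparisons, labeled like the TukeyHSD rows (e.g., "4-3")
pairs <- combn(levels(dat$group), 2)
d <- apply(pairs, 2, function(p) {
  cohens.d.unweighted(dat$y[dat$group == p[1]], dat$y[dat$group == p[2]])
})
names(d) <- paste(pairs[2, ], pairs[1, ], sep = "-")

tukey
d
```

Unlike the pooled standard deviation, this denominator does not weight the group variances by sample size, so a large group does not dominate the standardizer.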

Confidence Intervals

Cumming and Finch (2005) pointed out that when 95% confidence intervals (CI) for two separately plotted means overlap, it is still possible that the CI for the difference would not include zero. Baguley (2012) proposed to adjust the width of the CIs by the factor of $\sqrt{2}$ to reflect the correct width of the CI for a mean difference:

$$\hat{\mu}_{j} \pm t_{n - 1, 1 - \alpha/2} \frac{\sqrt{2}}{2} \hat{\sigma}_{{\hat{\mu}}_j}$$

These difference-adjusted CIs around the individual means can be interpreted as if they were CIs for the difference between the means. Note that the width of these intervals is sensitive to differences in the variance and sample size of each sample, i.e., unequal population variances and unequal n alter the interpretation of difference-adjusted CIs.
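
The formula above can be traced with a few lines of base R. This is a minimal sketch that assumes each group's standard error is estimated from that group alone as sd / sqrt(n); the object names are hypothetical, and the ci.mean() and aov.b() functions may differ in detail.

```r
# Sketch: difference-adjusted 95% confidence intervals for each group mean.
# Assumption: standard error of each mean is sd / sqrt(n) within that group.
conf.level <- 0.95
alpha <- 1 - conf.level

y <- mtcars$mpg
g <- factor(mtcars$gear)

n  <- as.vector(tapply(y, g, length))       # group sizes
m  <- as.vector(tapply(y, g, mean))         # group means
se <- as.vector(tapply(y, g, sd)) / sqrt(n) # standard errors of the means

# Half-width: t quantile with n - 1 df, scaled by sqrt(2) / 2
half <- qt(1 - alpha / 2, df = n - 1) * sqrt(2) / 2 * se

ci.adj <- data.frame(group = levels(g), n = n, m = m,
                     low = m - half, upp = m + half)
ci.adj
```

The adjusted intervals are narrower than the conventional intervals for each mean by the factor $\sqrt{2}/2 \approx 0.71$.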

References

Baguley, T. S. (2012). Serious stats: A guide to advanced statistics for the behavioral sciences. Palgrave Macmillan.

Cumming, G., & Finch, S. (2005). Inference by eye: Confidence intervals, and how to read pictures of data. American Psychologist, 60, 170–180.

Delacre, M., Lakens, D., Ley, C., Liu, L., & Leys, C. (2021). Why Hedges' g*s based on the non-pooled standard deviation should be reported with Welch's t-test. https://doi.org/10.31234/osf.io/tu6mp

Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in psychology - Using R and SPSS. John Wiley & Sons.

Examples

# Example 1: Between-subject ANOVA
aov.b(mpg ~ gear, data = mtcars)

# Example 2: Between-subject ANOVA
# print effect size measures
aov.b(mpg ~ gear, data = mtcars, effsize = TRUE)

# Example 3: Between-subject ANOVA
# do not print hypotheses and descriptive statistics
aov.b(mpg ~ gear, data = mtcars, descript = FALSE, hypo = FALSE)

# Example 4: Between-subject ANOVA
# plot results
aov.b(mpg ~ gear, data = mtcars, plot = TRUE)

if (FALSE) {
# Example 5: Write Results into a text file
aov.b(mpg ~ gear, data = mtcars, write = "ANOVA.txt")

# Example 6: Save plot
aov.b(mpg ~ gear, data = mtcars, plot = TRUE, filename = "Between-Subject_ANOVA.png",
      width = 7, height = 6)

# Example 7: Draw plot in line with the default setting of aov.b()
library(ggplot2)

object <- aov.b(mpg ~ gear, data = mtcars, output = FALSE)

ggplot(object$data, aes(group, y)) +
  geom_bar(stat = "summary", fun = "mean") +
  geom_errorbar(data = ci.mean(object$data, y, group = "group", adjust = TRUE,
                output = FALSE)$result,
                aes(group, m, ymin = low, ymax = upp), width = 0.1) +
  scale_x_discrete(name = NULL) +
  labs(subtitle = "Two-Sided Difference-Adjusted Confidence Interval") +
  theme_bw() + theme(plot.subtitle = element_text(hjust = 0.5))
}