JWileymisc (version 1.4.1)

egltable: Function makes nice tables

Description

Given a dataset and a list of variables, or just the data passed directly in vars, provides a table of estimated descriptive statistics, optionally by group levels. For best results, convert categorical variables into factors.

Usage

egltable(
  vars,
  g,
  data,
  idvar,
  strict = TRUE,
  parametric = TRUE,
  paired = FALSE,
  simChisq = FALSE,
  sims = 1000000L
)

Value

A data frame of the table.

Arguments

vars

Either an index (numeric or character) of variables to access from the data argument, or the data to be described itself.

g

A variable used to group/separate the data prior to calculating descriptive statistics.

data

Optional. The dataset containing the variables to be described.

idvar

A character string indicating the variable name of the ID variable. Must be specified when paired = TRUE; otherwise not currently used, but intended to eventually let egltable support repeated measures data.

strict

Logical, whether to strictly follow the type of each variable, or to assume categorical if the number of unique values is less than or equal to 3.
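
The rule described for strict can be sketched in base R. This is only an illustration of the heuristic as worded above, not the package's actual code: guess_categorical is a hypothetical helper, and treating factors, characters, and logicals as categorical, as well as ignoring missing values when counting unique values, are assumptions.

```r
## Hypothetical helper (not part of JWileymisc): guess whether a variable
## would be treated as categorical under the rule described for 'strict'.
guess_categorical <- function(x, strict = TRUE) {
  ## assumption: these types are always treated as categorical
  if (is.factor(x) || is.character(x) || is.logical(x)) {
    return(TRUE)
  }
  ## strict = TRUE: follow the variable's type, so numeric stays continuous
  if (strict) {
    return(FALSE)
  }
  ## strict = FALSE: categorical when there are at most 3 unique values
  ## (assumption: missing values are not counted)
  length(unique(x[!is.na(x)])) <= 3
}

guess_categorical(mtcars$vs, strict = TRUE)   # numeric 0/1, kept as continuous
guess_categorical(mtcars$vs, strict = FALSE)  # 2 unique values
guess_categorical(mtcars$cyl, strict = FALSE) # 3 unique values (4, 6, 8)
guess_categorical(mtcars$mpg, strict = FALSE) # many unique values
```

Converting such variables to factors beforehand, as the Description recommends, avoids relying on any heuristic.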

parametric

Logical, whether to use parametric tests to test for differences when there are multiple groups. Only applies to continuous variables. If TRUE, the default, uses one-way ANOVA and an F test. If FALSE, uses the Kruskal-Wallis test.
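
To see what the two choices correspond to, the same tests can be run directly with base R on the iris data. This is a sketch of the equivalent base R calls; whether egltable uses these exact functions internally is an assumption.

```r
## One-way ANOVA with an F test (the parametric case)
fit <- aov(Sepal.Length ~ Species, data = iris)
anova_tab <- summary(fit)[[1]]
f_stat <- anova_tab[["F value"]][1]
p_anova <- anova_tab[["Pr(>F)"]][1]

## Kruskal-Wallis rank sum test (the non-parametric case)
kw <- kruskal.test(Sepal.Length ~ Species, data = iris)

## test statistics and p-values side by side
c(F = f_stat, p = p_anova)
c(chi_squared = unname(kw$statistic), p = kw$p.value)
```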

paired

Logical, whether the data are paired. Defaults to FALSE. If TRUE, the grouping variable, g, must have two levels and idvar must be specified. In that case, a paired t-test is used for parametric continuous data, a paired Wilcoxon test for non-parametric continuous data, and a McNemar chi-square test for categorical data.
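
The three paired tests named above have base R counterparts, shown here on the built-in sleep data, where each subject was measured under both groups. This is a sketch under assumptions: it does not claim to reproduce egltable's internals, and the binary "any gain" variable is constructed only to give McNemar's test something categorical to work on.

```r
## sleep is ordered by group and then ID, so x[i] and y[i] are the same subject
x <- sleep$extra[sleep$group == 1]
y <- sleep$extra[sleep$group == 2]

## paired t-test (parametric, continuous)
tt <- t.test(x, y, paired = TRUE)

## paired Wilcoxon signed rank test (non-parametric, continuous);
## exact = FALSE because the differences contain ties and a zero
wt <- wilcox.test(x, y, paired = TRUE, exact = FALSE)

## McNemar test (categorical): did each subject gain any sleep under each group?
gain_tab <- table(group1 = x > 0, group2 = y > 0)
mc <- mcnemar.test(gain_tab)

tt$p.value
wt$p.value
mc$p.value
```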

simChisq

Logical, whether to estimate p-values for the chi-square test for categorical data by simulation when there are multiple groups. Defaults to FALSE. Useful when some cells have small counts, as simulation provides a more accurate test in extreme cases, similar to Fisher's exact test but generalizing to tables of larger dimensions.

sims

Integer for the number of simulations to be used to estimate p-values for the chi-square tests for categorical variables when there are multiple groups. Defaults to one million (1e6L).
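
The idea behind simChisq and sims can be illustrated with base R's chisq.test, whose simulate.p.value and B arguments play a similar role. This is a sketch: that egltable passes these arguments through in this way is an assumption, and B is kept far below one million here so the example runs quickly.

```r
## 3 x 3 table of cylinders by gears, with several small cells
tab <- table(mtcars$cyl, mtcars$gear)

## asymptotic chi-square test; warns because expected counts are small
asym <- suppressWarnings(chisq.test(tab))

## Monte Carlo p-value; B is the number of simulated tables
set.seed(1234)
sim <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)

## Fisher's exact test for comparison
exact <- fisher.test(tab)

## same test statistic, different ways of obtaining the p-value
c(asymptotic = asym$p.value, simulated = sim$p.value, fisher = exact$p.value)
```

More simulations give a more precise p-value estimate at the cost of run time, which is the trade-off the sims argument controls.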

Examples

library(JWileymisc)

## describe every variable in a data frame
egltable(iris)
egltable(colnames(iris)[1:4], "Species", data = iris)
egltable(iris, parametric = FALSE)
egltable(colnames(iris)[1:4], "Species", iris,
  parametric = FALSE)
egltable(colnames(iris)[1:4], "Species", iris,
  parametric = c(TRUE, TRUE, FALSE, FALSE))
egltable(colnames(iris)[1:4], "Species", iris,
  parametric = c(TRUE, TRUE, FALSE, FALSE), simChisq=TRUE)

diris <- data.table::as.data.table(iris)
egltable("Sepal.Length", g = "Species", data = diris)

tmp <- mtcars
tmp$cyl <- factor(tmp$cyl)
tmp$am <- factor(tmp$am, levels = 0:1)

egltable(c("mpg", "hp"), "vs", tmp)
egltable(c("mpg", "hp"), "am", tmp)
egltable(c("am", "cyl"), "vs", tmp)

## for reference: paired Wilcoxon test computed directly with base R
tests <- with(sleep,
  wilcox.test(extra[group == 1],
              extra[group == 2], paired = TRUE))
str(tests)

## example with paired data
egltable(c("extra"), g = "group", data = sleep, idvar = "ID", paired = TRUE)

## what happens when ignoring pairing (p-value off)
# egltable(c("extra"), g = "group", data = sleep, idvar = "ID")

## paired categorical data example
## using data on chick weights to create categorical data
tmp <- subset(ChickWeight, Time %in% c(0, 20))
tmp$WeightTertile <- cut(tmp$weight,
  breaks = quantile(tmp$weight, c(0, 1/3, 2/3, 1), na.rm = TRUE),
  include.lowest = TRUE)

egltable(c("weight", "WeightTertile"), g = "Time",
  data = tmp,
  idvar = "Chick", paired = TRUE)

rm(tmp)