stat.table: Tables of summary statistics

Description

stat.table creates tabular summaries of the data, using a limited set of functions. A list of index variables is used to cross-classify summary statistics. It does NOT work inside with()!

Usage

stat.table(index, contents = count(), data, margins = FALSE)
# S3 method for stat.table
print(x, width=7, digits,...)

Value

An object of class stat.table, which is a multi-dimensional array. A print method is available to create formatted one-way and two-way tables.

Arguments

index: A factor, or list of factors, used for cross-classification. If the list is named, then the names will be used when printing the table. This feature can be used to give informative labels to the variables.
contents: A function call, or list of function calls. Only a limited set of functions may be called (See Details below). If the list is named, then the names will be used when printing the table.
data: an optional data frame containing the variables to be tabulated. If this is omitted, the variables will be searched for in the calling environment.
margins: a logical scalar or vector indicating which marginal tables are to be calculated. If a vector, it should be the same length as the index argument: values corresponding to TRUE will be retained in marginal tables.
x: an object of class stat.table.
width: a scalar giving the minimum column width when printing.
digits: a scalar, or named vector, giving the number of digits to print after the decimal point. If a named vector is used, the names should correspond to one of the permitted functions (See Details below) and all results obtained with that function will be printed with the same precision.
...: further arguments passed to other print methods.

Author

Martyn Plummer

Details

This function is similar to tapply, with some enhancements: multiple summaries of multiple variables may be mixed in the same table; marginal tables may be calculated; columns and rows may be given informative labels; pretty printing may be controlled by the associated print method.

This function is not a replacement for tapply as it also has some limitations. The only functions that may be used in the contents argument are: count, mean, weighted.mean, sum, quantile, median, IQR, max, min, ratio, percent, and sd.

The count() function, which is the default, simply creates a contingency table of counts. The other functions are applied to each cell created by combinations of the index variables.

Examples

Run this code

data(warpbreaks)
# A one-way table
stat.table(tension,list(count(),mean(breaks)),data=warpbreaks)
# The same table with informative labels
stat.table(index=list("Tension level"=tension),list(N=count(),
           "mean number of breaks"=mean(breaks)),data=warpbreaks)

# A two-way table
stat.table(index=list(tension,wool),mean(breaks),data=warpbreaks)  
# The same table with margins over tension, but not wool
stat.table(index=list(tension,wool),mean(breaks),data=warpbreaks,
           margins=c(TRUE, FALSE))

# A table of column percentages
stat.table(list(tension,wool), percent(tension), data=warpbreaks)
# Cell percentages, with margins
stat.table(list(tension,wool),percent(tension,wool), margin=TRUE,
           data=warpbreaks)

# A table with multiple statistics
# Note how each statistic has its own default precision
a <- stat.table(index=list(wool,tension),
                contents=list(count(),mean(breaks),percent (wool)),
                data=warpbreaks)
print(a)
# Print the percentages rounded to the nearest integer
print(a, digits=c(percent=0))

Run the code above in your browser using DataLab