sjt.frq: Summary of frequencies as HTML table

Description

Shows one or more frequency tables as an HTML file, or saves them to a file.

Usage

sjt.frq(data, weight.by = NULL, title.wtd.suffix = " (weighted)", var.labels = NULL, value.labels = NULL, sort.frq = c("none", "asc", "desc"), altr.row.col = FALSE, string.val = "value", string.cnt = "N", string.prc = "raw %", string.vprc = "valid %", string.cprc = "cumulative %", string.na = "missings", emph.md = FALSE, emph.quart = FALSE, show.summary = TRUE, show.skew = FALSE, show.kurtosis = FALSE, skip.zero = "auto", ignore.strings = TRUE, auto.group = NULL, auto.grp.strings = TRUE, max.string.dist = 3, digits = 2, CSS = NULL, encoding = NULL, file = NULL, use.viewer = TRUE, no.output = FALSE, remove.spaces = TRUE)

Arguments

data

variables whose frequencies should be printed as a table. Either use a single variable (vector) or a data frame where each column represents a variable (see 'Examples').

weight.by

weight factor that will be applied to weight all cases. Must be a vector of the same length as the input vector. Default is NULL, so no weights are used.

title.wtd.suffix

suffix (as string) for the title, if weight.by is specified, e.g. title.wtd.suffix=" (weighted)". Default is " (weighted)" (see 'Usage'). Use NULL if the title should not have a suffix when cases are weighted.

var.labels

character vector with variable names, which will be used to label variables in the output.

value.labels

character vector (or list of character vectors) with value labels of the supplied variables, which will be used to label variable values in the output.

sort.frq

Determines whether categories should be sorted according to their frequencies or not. Default is "none", so categories are not sorted by frequency. Use "asc" or "desc" to sort categories in ascending or descending order.

altr.row.col

logical, if TRUE, alternating rows are highlighted with a light gray background color.

string.val

label for the very first table column containing the values (see value.labels). Default is "value".

string.cnt

label for the first table data column containing the counts. Default is "N".

string.prc

label for the second table data column containing the raw percentages. Default is "raw %".

string.vprc

label for the third table data column containing the valid percentages, i.e. the percentages of counts excluding possible missing values. Default is "valid %".

string.cprc

label for the last table data column containing the cumulative percentages. Default is "cumulative %".

string.na

label for the last table data row containing missing values. Default is "missings".

emph.md

If TRUE, the table row indicating the median value will be emphasized.

emph.quart

If TRUE, the table row indicating the lower and upper quartiles will be emphasized.

show.summary

If TRUE (default), a summary row with total and valid N as well as mean and standard deviation is shown.

show.skew

If TRUE, the variable's skewness is added to the summary. The skewness is retrieved from the describe-function of the psych-package and indicated by a lower case Greek gamma.

show.kurtosis

If TRUE, the variable's kurtosis is added to the summary. The kurtosis is retrieved from the describe-function of the psych-package and indicated by a lower case Greek omega.

skip.zero

If TRUE, rows with zero counts are not printed (e.g. if a variable has values or levels 1 to 8, and values 4 to 6 have no counts, these values are omitted from the table). Use FALSE to print zero counts as well, or use "auto" (default) to detect automatically whether printing zero counts makes sense (e.g. for a variable "age" with values from 10 to 100, where at least 25 percent of all possible values have no counts, zero counts are skipped).

ignore.strings

If TRUE (default), character vectors / string variables will be removed from data before frequency tables are computed.

auto.group

numeric value, indicating the minimum number of unique values in the count variable at which automatic grouping into smaller units is done (see group_var). Default value for auto.group is NULL, i.e. auto-grouping is off. See group_var for examples on grouping.

auto.grp.strings

if TRUE (default), string values in character vectors (string variables) are automatically grouped based on their similarity. The similarity is estimated with the stringdist-package. You can specify the maximum allowed distance via the max.string.dist argument. This argument only applies if ignore.strings is FALSE.

max.string.dist

the maximum allowed distance between two string values in a character vector at which they are still considered close enough to be merged. Default is 3. See auto.grp.strings.

digits

numeric, number of digits after the decimal point when rounding estimates and values.

CSS

list-object with user-defined style-sheet-definitions, according to the official CSS syntax. See 'Details'.

encoding

string, indicating the charset encoding used for variable and value labels. Default is NULL, so encoding will be auto-detected depending on your platform (e.g., "UTF-8" for Unix and "Windows-1252" for Windows OS). Change encoding if specific chars are not properly displayed (e.g. German umlauts).

file

destination file, if the output should be saved as a file. If NULL (default), the output will be saved as a temporary file and opened either in the IDE's viewer pane or the default web browser.

use.viewer

If TRUE, the HTML table is shown in the IDE's viewer pane. If FALSE, or if no viewer is available, the HTML table is opened in a web browser.

no.output

logical, if TRUE, the html-output is neither opened in a browser nor shown in the viewer pane and not even saved to file. This option is useful when the html output should be used in knitr documents. The html output can be accessed via the return value.

remove.spaces

logical, if TRUE, leading spaces are removed from all lines in the final string that contains the html-data. Use this if you want to remove parentheses for html-tags. The html-source may look less pretty, but it may help when exporting html-tables to office tools.
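As a rough illustration of the table columns labeled by string.cnt, string.prc, string.vprc and string.cprc, the following base R sketch computes counts and the three percentage types for a toy vector. It does not call sjt.frq; the data and the column names are made up for illustration, and accumulating the cumulative percentages over the valid percentages is an assumption, not a documented detail.

```r
# Toy vector with two missing values (illustrative data only)
x <- c(1, 1, 2, 2, 2, 3, NA, NA)

counts  <- table(x, useNA = "no")   # counts per non-missing value
n_total <- length(x)                # all cases, including missing values
n_valid <- sum(!is.na(x))           # cases without missing values

frq <- data.frame(
  value     = names(counts),
  N         = as.vector(counts),
  raw_prc   = round(100 * as.vector(counts) / n_total, 2),  # denominator includes NA
  valid_prc = round(100 * as.vector(counts) / n_valid, 2)   # denominator excludes NA
)

# Assumption: cumulative percentages accumulate the valid percentages
frq$cum_prc <- round(cumsum(100 * as.vector(counts) / n_valid), 2)

print(frq)
```

Raw and valid percentages differ only when the variable contains missing values, which is why the missing values get their own row (see string.na).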

Value

Invisibly returns

the web page style sheet (page.style),
each frequency table as web page content (page.content.list),
the complete html-output (output.complete) and
the html-table with inline-css for use with knitr (knitr)

for further use.
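A minimal sketch of how the returned components might be used, for instance inside a knitr document. It assumes that the installed sjPlot release still exports sjt.frq and that sjmisc provides the efc sample data used in 'Examples'; the guard variable can_run is a hypothetical helper, so the code does nothing if these assumptions do not hold.

```r
# Assumption: sjPlot still exports sjt.frq() and sjmisc ships the efc data
can_run <- requireNamespace("sjPlot", quietly = TRUE) &&
  "sjt.frq" %in% getNamespaceExports("sjPlot") &&
  requireNamespace("sjmisc", quietly = TRUE)

if (can_run) {
  data(efc, package = "sjmisc")

  # Build the table without opening the viewer pane or a browser
  tab <- sjPlot::sjt.frq(efc$e42dep, no.output = TRUE)

  # Components documented under 'Value'
  style <- tab$page.style          # style sheet of the web page
  pages <- tab$page.content.list   # one entry per frequency table
  html  <- tab$output.complete     # complete html-output

  # In a knitr chunk with results = "asis", emit the inline-css table
  cat(tab$knitr)
}
```

Because the value is returned invisibly, it has to be assigned (as with tab above) before its components can be accessed.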

Details

How does the CSS-argument work? With the CSS-argument, the visual appearance of the tables can be modified. To get an overview of all style-sheet-classnames that are used in this function, see the return value page.style. Arguments for this list have the following syntax:

the class-names with "css."-prefix as argument name and
each style-definition must end with a semicolon

You can add style information to the default styles by using a + (plus-sign) as initial character for the argument attributes. Examples:

css.table = 'border:2px solid red;' for a solid 2-pixel table border in red.
css.summary = 'font-weight:bold;' for a bold fontweight in the summary row.
css.lasttablerow = 'border-bottom: 1px dotted blue;' for a blue dotted border of the last table row.
css.colnames = '+color:green;' to add green color formatting to column names.
css.arc = 'color:blue;' for a blue text color in every second row.
css.caption = '+color:red;' to add red font-color to the default table caption style.

See further examples at sjPlot manual: sjt-basics.
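The syntax rules above can be sketched as a small list plus a sanity check. The helper check_css is hypothetical (it is not part of sjPlot) and only tests the two rules stated here: argument names carry the "css." prefix and every style definition ends with a semicolon. The final call assumes an sjPlot release that still exports sjt.frq.

```r
# User-defined styles, named "css.<classname>"
my_css <- list(
  css.table   = "border:2px solid red;",  # no "+": given as the table style
  css.caption = "+color:red;",            # leading "+": added to the default style
  css.arc     = "color:blue;"             # text color of every second row
)

# Hypothetical helper: checks the two syntax rules described in 'Details'
check_css <- function(css) {
  ok_names  <- startsWith(names(css), "css.")
  ok_values <- grepl(";\\s*$", unlist(css))
  all(ok_names) && all(ok_values)
}

stopifnot(check_css(my_css))

# Guarded call; skipped if sjt.frq() or the efc sample data are unavailable
if (requireNamespace("sjPlot", quietly = TRUE) &&
    "sjt.frq" %in% getNamespaceExports("sjPlot") &&
    requireNamespace("sjmisc", quietly = TRUE)) {
  data(efc, package = "sjmisc")
  sjPlot::sjt.frq(efc$e42dep, CSS = my_css, altr.row.col = TRUE)
}
```

Checking the list before passing it to the CSS argument catches a missing semicolon early, which is otherwise easy to overlook in the rendered table.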

Examples


## Not run: 
# # load sample data
# library(sjmisc)
# data(efc)
# 
# # show frequencies of "e42dep" in RStudio Viewer Pane
# # or default web browser
# sjt.frq(efc$e42dep)
# 
# # show frequency table of "e42dep" with custom labels
# sjt.frq(efc$e42dep, var.labels = "Dependency",
#         value.labels = c("independent", "slightly dependent",
#                          "moderately dependent", "severely dependent"))
# 
# # show frequencies of e42dep, e16sex and c172code in one HTML file
# # and show table in RStudio Viewer Pane or default web browser
# # Note that value.labels of multiple variables have to be
# # list-objects
# sjt.frq(data.frame(efc$e42dep, efc$e16sex, efc$c172code),
#         var.labels = c("Dependency", "Gender", "Education"),
#         value.labels = list(c("independent", "slightly dependent",
#                               "moderately dependent", "severely dependent"),
#                             c("male", "female"), c("low", "mid", "high")))
# 
# # auto-detection of labels
# sjt.frq(data.frame(efc$e42dep, efc$e16sex, efc$c172code))
# 
# # plot larger scale including zero-counts
# # indicating median and quartiles
# sjt.frq(efc$neg_c_7, emph.md = TRUE, emph.quart = TRUE)
# 
# # sort frequencies
# sjt.frq(efc$e42dep, sort.frq = "desc")
# 
# # User defined style sheet
# sjt.frq(efc$e42dep,
#         CSS = list(css.table = "border: 2px solid;",
#                    css.tdata = "border: 1px solid;",
#                    css.firsttablecol = "color:#003399; font-weight:bold;"))
## End(Not run)