This function was inspired by the excellent skimr package for R.
See the Details and Examples sections below, and the vignettes on the modelsummary website:
https://modelsummary.com/
https://modelsummary.com/articles/datasummary.html
datasummary_skim(
data,
output = getOption("modelsummary_output", default = "default"),
type = getOption("modelsummary_type", default = "all"),
fmt = 1,
title = getOption("modelsummary_title", default = NULL),
notes = getOption("modelsummary_notes", default = NULL),
align = getOption("modelsummary_align", default = NULL),
escape = getOption("modelsummary_escape", default = TRUE),
by = getOption("modelsummary_by", default = NULL),
fun_numeric = getOption("modelsummary_fun_numeric", default = list(Unique = NUnique,
`Missing Pct.` = PercentMissing, Mean = Mean, SD = SD, Min = Min, Median = Median,
Max = Max, Histogram = function(x) "")),
...
)
data: A data.frame (or tibble)
output: filename or object type (character string)
Supported filename extensions: .docx, .html, .tex, .md, .txt, .csv, .xlsx, .png, .jpg
Supported object types: "default", "html", "markdown", "latex", "latex_tabular", "typst", "data.frame", "tinytable", "gt", "kableExtra", "huxtable", "flextable", "DT", "jupyter". The "modelsummary_list" value produces a lightweight object which can be saved and fed back to the modelsummary function.
The "default" output format can be set to "tinytable", "kableExtra", "gt", "flextable", "huxtable", "DT", or "markdown"
If the user does not choose a default value, the packages listed above are tried in sequence.
Session-specific configuration: options("modelsummary_factory_default" = "gt")
Persistent configuration: config_modelsummary(output = "markdown")
Warning: Users should not supply a file name to the output argument if they intend to customize the table with external packages. See the 'Details' section.
LaTeX compilation requires the booktabs and siunitx packages, but siunitx can be disabled or replaced with global options. See the 'Details' section.
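The object types listed above can be tried directly from the console. The following is a minimal sketch, assuming the modelsummary package is installed; the three-column subset of mtcars is only an illustrative choice.

```r
library(modelsummary)

# Illustrative data: three numeric columns from a built-in dataset.
dat <- mtcars[, c("mpg", "hp", "wt")]

# Ask for a plain data.frame instead of a rendered table.
tab <- datasummary_skim(dat, type = "numeric", output = "data.frame")
print(tab)

# Ask for a markdown table instead.
datasummary_skim(dat, type = "numeric", output = "markdown")
```

Requesting an object type rather than a file name keeps the result in the session, which matters if the table is to be customized further with an external package.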
type: String. Variables to summarize: "all", "numeric", "categorical", "dataset"
fmt: how to format numeric values: integer, user-supplied function, or modelsummary function.
Integer: Number of decimal digits
User-supplied functions: Any function which accepts a numeric vector and returns a character vector of the same length.
modelsummary functions:
fmt = fmt_significant(2): Two significant digits (at the term-level)
fmt = fmt_sprintf("%.3f"): See ?sprintf
fmt = fmt_identity(): unformatted raw values
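The three kinds of fmt values described above can be compared side by side. This is a sketch under the assumption that modelsummary is installed; two_digits is a hypothetical helper written for this example and is not part of the package.

```r
library(modelsummary)

dat <- mtcars[, c("mpg", "hp", "wt")]

# Integer: three decimal digits.
datasummary_skim(dat, type = "numeric", fmt = 3, output = "data.frame")

# modelsummary helper: sprintf-style format string.
datasummary_skim(dat, type = "numeric", fmt = fmt_sprintf("%.2f"),
                 output = "data.frame")

# User-supplied function (hypothetical helper): takes a numeric vector and
# returns a character vector of the same length.
two_digits <- function(x) formatC(x, digits = 2, format = "f")
datasummary_skim(dat, type = "numeric", fmt = two_digits,
                 output = "data.frame")
```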
title: string. Cross-reference labels should be added with Quarto or Rmarkdown chunk options when applicable. When saving standalone LaTeX files, users can add a label such as \\label{tab:mytable} directly to the title string, while also specifying escape=FALSE.
notes: list or vector of notes to append to the bottom of the table.
align: A string with a number of characters equal to the number of columns in the table (e.g., align = "lcc"). Valid characters: l, c, r, d.
"l": left-aligned column
"c": centered column
"r": right-aligned column
"d": dot-aligned column. For LaTeX/PDF output, this option requires at least version 3.0.25 of the siunitx LaTeX package. See the LaTeX preamble help section below for commands to insert in your LaTeX preamble.
escape: boolean. TRUE escapes or substitutes LaTeX/HTML characters in all cells, captions, and notes which could prevent the file from compiling/displaying. Users can have more fine-grained control by setting escape=FALSE and using an external command such as: modelsummary(model, "latex") |> tinytable::format_tt(j = 1:5, escape = TRUE)
by: Character vector of grouping variables to compute statistics over.
fun_numeric: Named list of functions to apply to each numeric column of data. If fun_numeric includes "Histogram" or "Density", inline plots are inserted. This argument is only used when type="numeric" or "all".
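A custom fun_numeric list can reuse the summary functions that appear in the default shown in the usage signature. The sketch below assumes modelsummary is installed and simply keeps four of the default statistics; the name stats is chosen here for illustration.

```r
library(modelsummary)

dat <- mtcars[, c("mpg", "hp", "wt")]

# Named list: each name becomes a column header, each function is applied
# to every numeric column. These four functions appear in the default list.
stats <- list(Mean = Mean, SD = SD, Min = Min, Max = Max)

tab <- datasummary_skim(dat, type = "numeric", fun_numeric = stats,
                        output = "data.frame")
print(tab)
```

Leaving "Histogram" out of the list is one way to get a purely numeric table, for example when the output format cannot display inline plots.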
...: all other arguments are passed through to the table-making functions tinytable::tt, kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the output argument. This allows users to pass arguments directly to datasummary in order to affect the behavior of other functions behind the scenes.
Since version 2.0.0, modelsummary uses tinytable as its default table-drawing backend.
Learn more at: https://vincentarelbundock.github.io/tinytable/
Revert to kableExtra for one session:
options(modelsummary_factory_default = 'kableExtra')
options(modelsummary_factory_latex = 'kableExtra')
options(modelsummary_factory_html = 'kableExtra')
The behavior of modelsummary can be modified by setting global options. In particular, most arguments of the package's functions can be set using global options. For example:
options(modelsummary_output = "modelsummary_list")
options(modelsummary_statistic = '({conf.low}, {conf.high})')
options(modelsummary_stars = TRUE)
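The usage signature resolves most defaults with getOption(), so a global option acts as a session-wide default until it is reset. Here is a minimal base-R sketch of that lookup, using the modelsummary_type option from the usage; resolve_type is a hypothetical helper written for illustration and is not part of modelsummary.

```r
# Hypothetical helper: mirrors the `type` default in the usage signature.
resolve_type <- function() getOption("modelsummary_type", default = "all")

before <- resolve_type()

# options() returns the previous values, so they can be restored later.
old <- options(modelsummary_type = "numeric")
during <- resolve_type()

# Restore the earlier state of the option.
options(old)
after <- resolve_type()
```

Saving the return value of options() and restoring it afterwards keeps a temporary change from leaking into the rest of the session.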
Options not specific to given arguments are listed below.
This global option changes the style of the default column headers:
options(modelsummary_model_labels = "roman")
The supported styles are: "model", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)"
modelsummary supports 6 table-making packages: tinytable, kableExtra, gt, flextable, huxtable, and DT. Some of these packages have overlapping functionalities. To change the default backend used for a specific file format, you can use the options function:
options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_latex_tabular = 'kableExtra')
Change the look of tables in an automated and replicable way, using the modelsummary theming functionality. See the vignette: https://modelsummary.com/articles/appearance.html
modelsummary_theme_gt
modelsummary_theme_kableExtra
modelsummary_theme_huxtable
modelsummary_theme_flextable
modelsummary_theme_dataframe
modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:
options(modelsummary_get = "easystats")
options(modelsummary_get = "broom")
options(modelsummary_get = "all")
By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:
options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")
A similar option can be used to display numerical entries using MathJax in HTML tables:
options(modelsummary_format_numeric_html = "mathjax")
When creating LaTeX via the tinytable backend (default in version 2.0.0 and later), it is useful to include the following commands in the LaTeX preamble of your documents. These commands are automatically added to the preamble when compiling Rmarkdown or Quarto documents, except when the modelsummary() calls are cached.
\usepackage{tabularray}
\usepackage{float}
\usepackage{graphicx}
\usepackage[normalem]{ulem}
\UseTblrLibrary{booktabs}
\UseTblrLibrary{siunitx}
\newcommand{\tinytableTabularrayUnderline}[1]{\underline{#1}}
\newcommand{\tinytableTabularrayStrikeout}[1]{\sout{#1}}
\NewTableCommand{\tinytableDefineColor}[3]{\definecolor{#1}{#2}{#3}}
Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.
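if (FALSE) {
# Load the package that provides datasummary_skim()
library(modelsummary)

# Prepare example data: one logical and one factor column
dat <- mtcars
dat$vs <- as.logical(dat$vs)
dat$cyl <- as.factor(dat$cyl)

# Summarize all variables, then only the categorical ones
datasummary_skim(dat)
datasummary_skim(dat, type = "categorical")
}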