collapse provides the following functions to efficiently summarize and examine data:
qsu, shorthand for quick-summary, is an extremely fast summary command inspired by the (xt)summarize command in the Stata statistical software. It computes a set of 7 statistics (nobs, mean, sd, min, max, skewness and kurtosis) using a numerically stable one-pass method. Statistics can be computed weighted, by groups, and also within and between entities (for multilevel / panel data).
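As a rough illustration of the options described for qsu, the sketch below uses the mtcars data from base R and the wlddev data shipped with collapse. The column choices and the use of car weight as a weight vector are arbitrary assumptions made for the example, not recommendations.

```r
library(collapse)

# Summary statistics for a few numeric columns
qsu(mtcars[, c("mpg", "hp", "wt")])

# higher = TRUE requests skewness and kurtosis in addition to the basic statistics
s <- qsu(mtcars$mpg, higher = TRUE)
print(s)

# Grouped summary: mpg and hp within each cylinder class
qsu(mtcars, by = ~ cyl, cols = c("mpg", "hp"))

# Weighted summary (car weight is used purely for illustration)
qsu(mtcars$mpg, w = mtcars$wt)

# Panel decomposition: overall, between-country and within-country statistics
qsu(wlddev, pid = ~ iso3c, cols = c("PCGDP", "LIFEEX"))
```

Grouping and panel identifiers are passed here as one-sided formulas; when calling qsu on a single vector, the corresponding arguments take vectors or factors instead.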
descr computes a concise and detailed description of a data frame, including frequency tables for categorical variables and various statistics and quantiles for numeric variables. It is inspired by Hmisc::describe, but about 10x faster.
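A minimal sketch of descr on the iris data from base R, which mixes numeric columns with one factor. The reduced set of quantiles passed through Qprobs is an arbitrary choice for the example.

```r
library(collapse)

# Describe every column: statistics and quantiles for the numeric
# variables, a frequency table for the factor Species
d <- descr(iris)
print(d)

# Restrict the reported quantiles to the quartiles
d_q <- descr(iris, Qprobs = c(0.25, 0.5, 0.75))
print(d_q)
```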
pwcor, pwcov and pwnobs compute (weighted) pairwise correlations, covariances and observation counts on matrices and data frames. Pairwise correlations and covariances can be computed together with observation counts and p-values, and output as a 3D array (default) or a list of matrices. A major feature of pwcor and pwcov is the print method, which displays all of these statistics in a single correlation table.
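The following sketch shows how these three functions might be combined. It takes four mtcars columns and blanks out two values of hp (an artificial step, done only so that the pairwise observation counts differ across variable pairs).

```r
library(collapse)

X <- mtcars[, c("mpg", "hp", "wt", "qsec")]
X$hp[c(3, 7)] <- NA   # artificial missing values, for illustration only

# Number of complete observation pairs for each pair of columns
n <- pwnobs(X)
print(n)

# Pairwise correlations and covariances
r <- pwcor(X)
print(r)
pwcov(X)

# Correlations with observation counts and p-values in one printed table
pwcor(X, N = TRUE, P = TRUE)

# The same statistics returned as a list of matrices rather than a 3D array
res <- pwcor(X, N = TRUE, P = TRUE, array = FALSE)
str(res, max.level = 1)
```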
varying very efficiently checks for the presence of any variation in data, optionally within groups (such as panel identifiers).
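To make the distinction between overall and within-group variation concrete, here is a small sketch on a hand-made data frame. The data frame df and its column names are hypothetical and exist only for this example.

```r
library(collapse)

# Hypothetical toy panel: two entities observed twice each
df <- data.frame(
  id    = c(1, 1, 2, 2),
  const = c(5, 5, 5, 5),   # never varies
  fixed = c(1, 1, 2, 2),   # varies across ids, constant within each id
  x     = c(1, 2, 3, 4)    # varies within each id
)

# Overall variation, column by column
v_all <- varying(df)
print(v_all)

# Variation within groups defined by id
v_within <- varying(df, by = ~ id)
print(v_within)
```

In this toy data, fixed is flagged as varying overall but not within id, which is the typical pattern of a time-invariant variable in panel data.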
Function / S3 Generic | Methods | Description
--- | --- | ---
qsu | default, matrix, data.frame, pseries, pdata.frame | Fast (grouped, weighted, panel-decomposed) summary statistics
descr | No methods, for data frames or lists of vectors | Detailed statistical description of data frame
pwcor | No methods, for matrices or data frames | Pairwise correlations
pwcov | No methods, for matrices or data frames | Pairwise covariances
pwnobs | No methods, for matrices or data frames | Pairwise observation counts
varying | default, matrix, data.frame, pseries, pdata.frame, grouped_df | Fast variation check