Note: a newer version (0.19.0) of this package is available.

sjstats - Collection of Convenient Functions for Common Statistical Computations

Collection of convenient functions for common statistical computations, which are not directly provided by R's base or stats packages.

First, this package provides shortcuts for statistical measures that could otherwise only be calculated with additional effort (like standard errors, Cronbach's Alpha or root mean squared errors), or for which no functions are currently available.
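To make the "additional effort" concrete, here is a minimal base R sketch of three of the measures named above, computed by hand. It does not call sjstats. The helper name `cronbach_alpha_manual` is hypothetical, and the `mtcars` columns are used as stand-in "items" purely for illustration.

```r
# Base R only: nothing here calls sjstats.
x <- mtcars$mpg

# Standard error of the mean: sd / sqrt(n)
se_mean <- sd(x) / sqrt(length(x))

# Root mean squared error of a fitted linear model
fit <- lm(mpg ~ wt + hp, data = mtcars)
rmse <- sqrt(mean(residuals(fit)^2))

# Cronbach's Alpha for a data frame or matrix whose columns are items.
# `cronbach_alpha_manual` is a hypothetical helper, not an sjstats function.
cronbach_alpha_manual <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                      # number of items
  item_vars <- apply(items, 2, var)     # variance of each item
  total_var <- var(rowSums(items))      # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Arbitrary columns treated as "items", for illustration only
alpha <- cronbach_alpha_manual(mtcars[, c("disp", "hp", "wt")])

print(c(se_mean = se_mean, rmse = rmse, alpha = alpha))
```

Each of these takes a few lines and some care with the formula, which is the kind of repetition a one-call shortcut avoids.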

Second, these shortcut functions are generic (where appropriate) and can be applied not only to vectors but also to other objects (e.g., the Coefficient of Variation can be computed for vectors, linear models, or linear mixed models; the r2() function returns the r-squared value for lm, glm, merMod, glmmTMB, lme and other objects).
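The idea of one function name working on both vectors and models can be sketched with R's S3 dispatch. The generic `coef_var` below is a hypothetical name and not part of sjstats. The model method (RMSE divided by the mean of the response) is an assumption about a reasonable definition, not a statement of how sjstats computes it.

```r
# Hypothetical S3 generic; sjstats' own functions may be defined differently.
coef_var <- function(x, ...) UseMethod("coef_var")

# Plain vectors: standard deviation divided by the mean
coef_var.default <- function(x, ...) {
  sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)
}

# Linear models (assumed definition): RMSE divided by the mean of the response
coef_var.lm <- function(x, ...) {
  response <- model.response(model.frame(x))
  sqrt(mean(residuals(x)^2)) / mean(response)
}

cv_vector <- coef_var(c(2, 4, 6))                     # dispatches to the default method
cv_model <- coef_var(lm(mpg ~ wt, data = mtcars))     # dispatches to the lm method

print(c(vector = cv_vector, model = cv_model))
```

Supporting another model class would only require adding a further method, for example a hypothetical `coef_var.merMod`, without changing how the function is called.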

Most functions of this package are designed as summary functions, i.e. they do not transform the input vector; rather, they return a summary, which is sometimes a vector and sometimes a tidy data frame (where column names follow a common convention). The focus of most functions lies on summary statistics or fit measures for regression models, including generalized linear models, mixed effects models or Bayesian models. However, some of the functions deal with other statistical measures, like Cronbach's Alpha, Cramer's V, Phi etc.

The package includes tools for the following areas:

  • For regression and mixed models: Coefficient of Variation, Root Mean Squared Error, Residual Standard Error, Coefficient of Discrimination, R-squared and pseudo-R-squared values, standardized beta values, p-values
  • Especially for mixed models: Design effect, ICC, sample size calculation and convergence tests
  • Especially for Bayesian models: Highest Density Interval, region of practical equivalence (rope), Monte Carlo Standard Errors, ratio of number of effective samples, mediation analysis, Test for Practical Equivalence
  • Fit and accuracy measures for regression models: Overdispersion tests, accuracy of predictions, test/training-error comparisons, error rate and binned residual plots for logistic regression models
  • For anova-tables: Eta-squared, Partial Eta-squared, Omega-squared and Partial Omega-squared statistics
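As an illustration of the anova effect sizes in the list above, the following base R sketch derives Eta-squared and Partial Eta-squared from the sums of squares of an `aov` table. It does not use sjstats. The model and variable choices are arbitrary, and the sums of squares are the sequential (Type I) ones that `aov` reports.

```r
# Two-way anova on a built-in dataset (arbitrary example model)
fit <- aov(mpg ~ factor(cyl) + factor(am), data = mtcars)
tab <- summary(fit)[[1]]          # anova table as a data frame
ss <- tab[["Sum Sq"]]             # sums of squares; last row is the residuals

n_terms <- length(ss) - 1
ss_effect <- ss[seq_len(n_terms)]
ss_resid <- ss[length(ss)]

# Eta-squared: effect SS relative to the total SS
eta_sq <- ss_effect / sum(ss)

# Partial Eta-squared: effect SS relative to effect SS plus residual SS
partial_eta_sq <- ss_effect / (ss_effect + ss_resid)

names(eta_sq) <- names(partial_eta_sq) <- trimws(rownames(tab))[seq_len(n_terms)]

print(round(cbind(eta_sq, partial_eta_sq), 3))
```

With sequential sums of squares, the Eta-squared values of all terms add up to the R-squared of the corresponding linear model, which gives a quick sanity check.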

Furthermore, sjstats has functions to access information from model objects, which either support more model objects than their stats counterparts, or provide easy access to model attributes, like:

  • model_frame() to get the model frame,
  • model_family() to get information about the model family, link functions etc.,
  • link_inverse() to get the link-inverse function,
  • pred_vars() and resp_var() to get the names of the independent (predictor) or dependent (response) variables, respectively, or
  • var_names() to get the "cleaned" variable names from a model object ("cleaned" means that things like s() or log() are removed from the returned character vector of variable names).
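For comparison, this is roughly what the stats counterparts look like for a plain `glm`. The sketch uses base R and stats functions only and makes no claim about the return values of the sjstats accessors listed above. The example model is arbitrary.

```r
# Logistic regression with a transformed predictor (arbitrary example model)
fit <- glm(am ~ wt + log(hp), data = mtcars, family = binomial())

# Model frame: note that the transformed column is named "log(hp)"
mf <- model.frame(fit)
frame_names <- names(mf)

# Family and link information
fam <- family(fit)
fam_name <- fam$family        # "binomial"
link_name <- fam$link         # "logit"

# Inverse link function: maps the linear predictor back to the response scale
inv_link <- fam$linkinv

# Variable names without wrappers such as log(): first is the response
vars <- all.vars(formula(fit))
resp <- vars[1]
preds <- vars[-1]

print(list(frame_names = frame_names, family = fam_name, link = link_name,
           response = resp, predictors = preds, inv_link_at_0 = inv_link(0)))
```

These calls work for `lm` and `glm`, but other model classes store this information in different places, which is the gap the accessor functions above are meant to close.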

Other statistics:

  • Cramer's V, Cronbach's Alpha, Mean Inter-Item-Correlation, Mann-Whitney-U-Test, Item-scale reliability tests
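As a rough sketch of one of these statistics, Cramer's V can be derived by hand from a chi-squared test on a contingency table. The helper name `cramers_v_manual` is hypothetical and the code uses base R and stats only, not sjstats.

```r
# Cramer's V = sqrt(chi-squared / (n * (min(rows, cols) - 1)))
# `cramers_v_manual` is a hypothetical helper, not an sjstats function.
cramers_v_manual <- function(tab) {
  # No continuity correction; approximation warnings for small cells are silenced
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
  n <- sum(tab)
  k <- min(dim(tab))
  sqrt(unname(chi2) / (n * (k - 1)))
}

# Association between number of cylinders and transmission type
tab <- table(mtcars$cyl, mtcars$am)
v <- cramers_v_manual(tab)
print(v)
```

For a 2x2 table, the same formula reduces to the absolute value of Phi.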

Documentation

Please visit https://strengejacke.github.io/sjstats/ for documentation and vignettes.

Installation

Latest development build

To install the latest development snapshot (see latest changes below), type the following commands into the R console:

library(devtools)
devtools::install_github("strengejacke/sjstats")

Please note the package dependencies when installing from GitHub. The GitHub version of this package may depend on the latest GitHub versions of my other packages, so if you encounter any problems, you may need to install those first. Install the packages from GitHub in this order:

sjlabelled → sjmisc → sjstats → ggeffects → sjPlot

Official, stable release

To install the latest stable release from CRAN, type the following command into the R console:

install.packages("sjstats")

Citation

In case you want / have to cite my package, please use citation('sjstats') for citation information.

Package details

  • Version: 0.17.4
  • License: GPL-3
  • Monthly downloads: 26,077
  • Last published: March 15th, 2019

Functions in sjstats (0.17.4)

  • fish: Sample dataset
  • eta_sq: Effect size statistics for anova
  • mwu: Mann-Whitney-U-Test
  • svyglm.nb: Survey-weighted negative binomial generalised linear model
  • pred_accuracy: Accuracy of predictions from model fit
  • table_values: Expected and relative table values
  • .get_variance_fixed: Get fixed effects variance
  • link_inverse: Access information from model objects
  • nhanes_sample: Sample dataset from the National Health and Nutrition Examination Survey
  • .get_variance_random: Compute variance associated with a random-effects term (Johnson 2014)
  • gmd: Gini's Mean Difference
  • p_value: Get p-values from regression model objects
  • odds_to_rr: Get relative risks estimates from logistic regressions or odds ratio values
  • overdisp: Check overdispersion of GL(M)M's
  • grpmean: Summary of mean values by group
  • .get_variance_residual: Get residual (distribution specific) variance from random effects
  • deff: Design effects for two-level mixed models
  • efc: Sample dataset from the EUROFAMCARE project
  • hdi: Compute statistics for MCMC samples and Stan models
  • find_beta: Determining distribution parameters
  • icc: Intraclass-Correlation Coefficient
  • pca: Tidy summary of Principal Component Analysis
  • prop: Proportions of values in a vector
  • re_var: Random effect variances
  • inequ_trend: Compute trends in status inequalities
  • smpsize_lmm: Sample size for linear mixed models
  • reliab_test: Check internal consistency of a test or questionnaire
  • reexports: Objects exported from other packages
  • tidy_stan: Tidy summary output for stan models
  • se_ybar: Standard error of sample mean for mixed models
  • sjstats-package: Collection of Convenient Functions for Common Statistical Computations
  • phi: Measures of association for contingency tables
  • var_pop: Calculate population variance and standard deviation
  • std_beta: Standardized beta coefficients and CI of linear and mixed models
  • .get_variance_beta: Get distributional variance for beta-family
  • .get_variance_dispersion: Get dispersion-specific variance
  • is_prime: Find prime numbers
  • mean_n: Row means with min amount of valid values
  • cv: Compute model quality
  • robust: Robust standard errors for regression models
  • scale_weights: Rescale design weights for multilevel analysis
  • se: Standard Error for variables or coefficients
  • weight: Weight a variable
  • wtd_sd: Weighted statistics for tests and variables
  • cv_error: Test and training error from model cross-validation
  • auto_prior: Create default priors for brms-models
  • .badlink: Helper function, telling user if model is supported or not
  • .collapse_cond: glmmTMB returns a list of model information, one for conditional and one for zero-inflated part, so here we "unlist" it
  • boot_ci: Standard error and confidence intervals for bootstrapped estimates
  • bootstrap: Generate nonparametric bootstrap replications
  • check_assumptions: Check model assumptions
  • cod: Goodness-of-fit measures for regression models
  • converge_ok: Convergence test for mixed effects models