performance (version 0.3.0)

check_distribution: Classify the distribution of a model-family using machine learning

Description

Choosing the right distributional family for a regression model is essential for obtaining accurate estimates and standard errors. This function may help to check a model's distributional family and to see whether the model family should probably be reconsidered. Since it is difficult to predict the correct model family exactly, consider this function somewhat experimental.

Usage

check_distribution(model)

Arguments

model

Typically, a model (one that responds to residuals()). May also be a numeric vector.
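
As a rough illustration of what "responds to residuals()" means, the sketch below fits an ordinary linear model with base R and extracts its residuals as a plain numeric vector. The model formula and the mtcars data are assumptions chosen for illustration and do not come from this page. The call to check_distribution() is guarded, since it only runs when the performance and randomForest packages are installed.

```r
# Fit a simple linear model (illustrative choice; mtcars ships with base R)
fit <- lm(mpg ~ wt, data = mtcars)

# residuals() returns one numeric value per observation
res <- residuals(fit)

# Optional: pass the numeric vector itself to check_distribution(),
# but only if the required packages are available on this machine
if (requireNamespace("performance", quietly = TRUE) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  print(performance::check_distribution(res))
}
```

Any model class with a residuals() method should be usable in the same way; the numeric vector form is convenient when only the raw values are at hand.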

Details

This function uses an internal random forest model to classify the distribution of a model family. Currently, the following distributions are trained (i.e. the result of check_distribution() may be one of the following): "bernoulli", "beta", "beta-binomial", "binomial", "chi", "exponential", "F", "gamma", "lognormal", "normal", "negative binomial", "negative binomial (zero-inflated)", "pareto", "poisson", "poisson (zero-inflated)", "uniform" and "weibull".
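
To give an intuition for how a classifier might tell distributions apart, the following sketch computes a few shape summaries from a numeric vector. It is not the package's internal feature set: shape_features() is a hypothetical helper, and the chosen summaries (skewness, kurtosis, share of zeros, share of positive values) are assumptions made for illustration.

```r
# Hypothetical helper: simple shape summaries of a numeric vector.
# These are illustrative features, not those used inside check_distribution().
shape_features <- function(x) {
  x <- x[!is.na(x)]
  z <- (x - mean(x)) / sd(x)          # standardize using the sample sd
  c(
    skewness      = mean(z^3),        # asymmetry of the distribution
    kurtosis      = mean(z^4),        # tail weight
    prop_zero     = mean(x == 0),     # share of exact zeros
    prop_positive = mean(x > 0)       # share of strictly positive values
  )
}

set.seed(123)
feat_norm <- shape_features(rnorm(1000))  # symmetric sample
feat_exp  <- shape_features(rexp(1000))   # right-skewed, positive sample
```

Summaries like these differ clearly between, say, a normal and an exponential sample, which is the kind of signal a random forest can learn from.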

Note the similarity between certain distributions in terms of shape, skewness, etc.; for instance, the density drawn by plot(dnorm(1:100, 30, 3)) resembles that of other bell-shaped distributions with a comparable mean and spread. Thus, the predicted distribution may not perfectly represent the distributional family of the underlying fitted model or of the response variable.
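
The similarity between distributions can be checked numerically with base R. The sketch below compares a normal density with a Poisson density over the same grid. The Poisson comparison and the choice of sd = sqrt(30) (to match the Poisson variance) are assumptions made here for illustration and are not taken from this page.

```r
# Evaluate two densities on the same grid of values
x <- 1:100
d_norm <- dnorm(x, mean = 30, sd = sqrt(30))  # sd matched to the Poisson variance
d_pois <- dpois(x, lambda = 30)

# Largest pointwise difference between the two curves
max_gap <- max(abs(d_norm - d_pois))

# Location of the peak of each curve
peak_norm <- x[which.max(d_norm)]
peak_pois <- x[which.max(d_pois)]
```

Overlaying the curves with plot(x, d_norm, type = "l") followed by lines(x, d_pois, lty = 2) makes the resemblance visible, which is why a classifier can plausibly confuse the two families.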

There is a plot() method, which shows the probabilities of all predicted distributions, but only those with a probability greater than zero.
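
The filtering rule described above can be sketched in base R. The probabilities below are made-up values used only to show the mechanics; they are not output from check_distribution(), and the object names are hypothetical.

```r
# Hypothetical predicted probabilities (illustrative values, not real output)
probs <- c(normal = 0.62, lognormal = 0, gamma = 0.25, weibull = 0.13, uniform = 0)

# Keep only distributions with a probability greater than zero
shown <- probs[probs > 0]

# Order from most to least likely, as one might for a bar chart
shown <- sort(shown, decreasing = TRUE)
```

A call such as barplot(shown) would then draw only the three remaining distributions.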

Examples

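library(performance)
library(lme4)

# Fit a linear mixed model to the sleepstudy data shipped with lme4
model <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Classify the distribution of the fitted model
check_distribution(model)

# Plot the probabilities of the predicted distributions
plot(check_distribution(model))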
