acc_shape_or_scale: Compare observed versus expected distributions

Description

This implementation contrasts the empirical distribution of a measurement variable against an assumed distribution. The approach is adapted from the idea of rootograms (Tukey 1977), which is also applicable to count data (Kleiber and Zeileis 2016).
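The rootogram idea can be illustrated with a minimal base R sketch. It is not the package's code: the simulated Poisson counts, the variable names, and the "hanging" layout are assumptions chosen for illustration only.

```r
# Illustrative sketch of rootogram quantities for count data (assumed example,
# not the dataquieR implementation).
set.seed(1)
counts <- rpois(200, lambda = 3)  # simulated count data

# Observed frequency of every count value from 0 to the maximum
values   <- 0:max(counts)
observed <- as.vector(table(factor(counts, levels = values)))

# Expected frequencies under a Poisson model with the estimated mean
lambda_hat <- mean(counts)
expected   <- length(counts) * dpois(values, lambda = lambda_hat)

# A rootogram compares square roots of frequencies. In the "hanging" variant
# each observed bar hangs from the expected curve, so the position of its
# lower end relative to zero shows the deviation.
rootogram <- data.frame(
  value      = values,
  sqrt_obs   = sqrt(observed),
  sqrt_exp   = sqrt(expected),
  bar_bottom = sqrt(expected) - sqrt(observed)
)
print(rootogram)
```

Values of bar_bottom close to zero indicate that the observed frequencies agree with the assumed distribution.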

Indicator

Usage

acc_shape_or_scale(
  resp_vars,
  dist_col,
  guess,
  par1,
  par2,
  end_digits,
  label_col,
  study_data,
  meta_data,
  flip_mode = "noflip"
)

Value

A list with:

SummaryData: data.frame underlying the plot
SummaryPlot: ggplot2 probability distribution plot
SummaryTable: data.frame with the columns Variables and FLG_acc_ud_shape

Arguments

resp_vars: variable. The name of the continuous measurement variable.
dist_col: variable attribute. The name of the variable attribute in meta_data that provides the expected distribution of a study variable.
guess: logical. Whether to estimate the distribution parameters from the data.
par1: numeric. First parameter of the distribution, if applicable.
par2: numeric. Second parameter of the distribution, if applicable.
end_digits: logical. Internal use; check for end-digit preferences.
label_col: variable attribute. The name of the column in the metadata with labels of variables.
study_data: data.frame. The data frame that contains the measurements.
meta_data: data.frame. The data frame that contains metadata attributes of study data.
flip_mode: enum default | flip | noflip | auto. Whether the plot should be in default orientation, flipped, not flipped, or auto-flipped. Not all options are always supported. In general, this can be controlled by setting options(dataquieR.flip_mode = ...). If called from dq_report, you can also pass flip_mode to all function calls or set it for specific functions using specific_args.
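As a small sketch of the global setting mentioned for flip_mode, and assuming the option is read through R's standard options mechanism, it could be set and checked as follows. The value "auto" is one of the enum values listed above; restoring the previous value afterwards is a design choice of this example, not a requirement.

```r
# Sketch: set the flip mode globally via base R options().
# Assumption: the package reads the option named "dataquieR.flip_mode".
old <- options(dataquieR.flip_mode = "auto")

# Read the current value back
current_mode <- getOption("dataquieR.flip_mode")
print(current_mode)

# Restore the previous setting when done
options(old)
```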

ALGORITHM OF THIS IMPLEMENTATION:

This implementation is restricted to data of type float or integer.
Missing codes are removed from resp_vars (if defined in the metadata).
The user must specify the metadata column that contains the expected probability distribution (currently only normal, uniform, or gamma).
The parameters of each distribution can either be estimated from the data or specified by the user.
A histogram-like plot contrasts the empirical distribution with the expected (theoretical) distribution.
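The steps above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the missing code 99999, the simulated data, the choice of a normal distribution, the interpretation of par1 and par2 as mean and standard deviation, and the binning via pretty() are all hypothetical.

```r
# Illustrative base R sketch of the algorithm steps (not dataquieR's code).
set.seed(42)
x <- c(rnorm(500, mean = 120, sd = 15), 99999, 99999)  # 99999: assumed missing code

# Remove missing codes (hypothetical code list) and NA values
missing_codes <- c(99999)
x <- x[!(x %in% missing_codes) & !is.na(x)]

# Estimate the parameters of the assumed normal distribution from the data
par1 <- mean(x)  # assumed meaning of the first parameter: mean
par2 <- sd(x)    # assumed meaning of the second parameter: standard deviation

# Bin the data and compute observed and expected probabilities per bin
breaks        <- pretty(range(x), n = 20)
observed      <- as.vector(table(cut(x, breaks, include.lowest = TRUE)))
expected_prob <- diff(pnorm(breaks, mean = par1, sd = par2))

comparison <- data.frame(
  lower    = breaks[-length(breaks)],
  upper    = breaks[-1],
  observed = observed / length(x),
  expected = expected_prob
)
print(comparison)
```

Plotting the observed and expected columns side by side per bin gives a histogram-like comparison of the empirical and the assumed distribution.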

