distr6

What is distr6?

distr6 is a unified and clean interface that organises the probability distributions implemented in R into one R6 object-oriented package and adds distributions not yet implemented in R. The package currently provides 42 probability distributions and 11 kernels. Built from the ground up using tried and tested design patterns (as per Gamma et al. 1994), distr6 aims to make probability distributions easy to use, understand and analyse.

distr6 extends the work of Peter Ruckdeschel, Matthias Kohl et al., who created the first object-oriented (OO) interface for distributions using S4. Their distr package is currently the gold standard in R for OO distribution handling. Using R6, we aim to take this further and create a scalable interface that can continue to grow with the community. Full details of the API and class structure are available on the distr6 website.

Main Features

distr6 is not intended to replace the base R distribution functions but instead to give an alternative that focuses on distributions as objects that can be manipulated and accessed as required. The main features therefore centre on OOP practices, design patterns and API design. Of particular note:

All distributions in base R introduced as objects with methods for common statistical functions including pdf, cdf, inverse cdf, simulation, mean, variance, skewness and kurtosis

B <- Binomial$new(prob = 0.5, size = 10)
B$pdf(1:10)
#>  [1] 0.0097656250 0.0439453125 0.1171875000 0.2050781250 0.2460937500
#>  [6] 0.2050781250 0.1171875000 0.0439453125 0.0097656250 0.0009765625
B$kurtosis()
#> [1] -0.2
B$rand(5)
#> [1] 7 7 4 7 6
summary(B)
#> Binomial Probability Distribution. Parameterised with:
#>   prob = 0.5, qprob = 0.5, size = 10
#>
#>   Quick Statistics
#>  Mean:       5
#>  Variance:   2.5
#>  Skewness:   0
#>  Ex. Kurtosis:   -0.2
#>
#>  Support: {0, 1,...,9, 10}   Scientific Type: ℕ0
#>
#>  Traits: discrete; univariate
#>  Properties: symmetric; platykurtic; no skew
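
As a sanity check on the output above, the same quantities can be reproduced with base R and the closed-form binomial moments. This is only a sketch of the underlying arithmetic, not how distr6 computes them internally; the variable names are illustrative.

```r
# Base R cross-check of the Binomial(size = 10, prob = 0.5) results above.
size <- 10
prob <- 0.5
q <- 1 - prob

# Probability mass at 1, ..., 10 (compare with B$pdf(1:10)).
pmf <- dbinom(1:10, size = size, prob = prob)

# Closed-form moments of the binomial distribution.
binom_mean   <- size * prob                     # n * p
binom_var    <- size * prob * q                 # n * p * q
binom_skew   <- (q - prob) / sqrt(binom_var)    # (q - p) / sqrt(n * p * q)
binom_exkurt <- (1 - 6 * prob * q) / binom_var  # (1 - 6pq) / (n * p * q)

print(pmf)
print(c(mean = binom_mean, variance = binom_var,
        skewness = binom_skew, ex_kurtosis = binom_exkurt))
```

A negative excess kurtosis is what the summary reports as "platykurtic".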

Flexible construction of distributions for common parameterisations

Exponential$new(rate = 2)
#> Exp(rate = 2, scale = 0.5)
Exponential$new(scale = 2)
#> Exp(rate = 0.5, scale = 2)
Normal$new(mean = 0, prec = 2)
#> Norm(mean = 0, var = 0.5, sd = 0.707106781186548, prec = 2)
Normal$new(mean = 0, sd = 3)$parameters()
#>      id     value support                                 description
#> 1: mean         0       ℝ                   Mean - Location Parameter
#> 2:  var         9      ℝ+          Variance - Squared Scale Parameter
#> 3:   sd         3      ℝ+        Standard Deviation - Scale Parameter
#> 4: prec 0.1111111      ℝ+ Precision - Inverse Squared Scale Parameter
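
The alternative parameterisations shown above are linked by simple identities. The base R sketch below spells those identities out; it illustrates the relationships only and makes no claim about how distr6 stores or updates its parameters. All variable names are illustrative.

```r
# Exponential: scale is the reciprocal of rate.
exp_rate  <- 2
exp_scale <- 1 / exp_rate            # 0.5, as in Exp(rate = 2, scale = 0.5)

# Normal: var, sd and prec all describe the same spread.
norm_sd   <- 3
norm_var  <- norm_sd^2               # variance is the squared scale
norm_prec <- 1 / norm_var            # precision is the inverse variance

# Going the other way, from a precision of 2 back to a standard deviation.
prec_in      <- 2
sd_from_prec <- sqrt(1 / prec_in)

print(c(exp_scale = exp_scale, norm_var = norm_var,
        norm_prec = norm_prec, sd_from_prec = sd_from_prec))
```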

Decorators for extending functionality of distributions to more complex modelling methods

B <- Binomial$new()
decorate(B, "ExoticStatistics")
#> Binomial is now decorated with ExoticStatistics
#> Binom(prob = 0.5, qprob = 0.5, size = 10)
B$survival(2)
#> [1] 0.9453125
decorate(B, "CoreStatistics")
#> Binomial is now decorated with CoreStatistics
#> Binom(prob = 0.5, qprob = 0.5, size = 10)
B$kthmoment(6)
#> Results from numeric calculations are approximate only. Better results may be available.
#> [1] 190
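
The two decorated results above can be checked by hand in base R. The sketch assumes that the survival function is P(X > x) and that the reported moment is the sixth central moment, which is consistent with the printed values; it computes both exactly by summing over the support rather than numerically.

```r
size <- 10
prob <- 0.5

# Survival function at 2: P(X > 2) = 1 - P(X <= 2).
survival_2 <- pbinom(2, size = size, prob = prob, lower.tail = FALSE)

# Sixth central moment: sum over the support of (x - mean)^6 * P(X = x).
x   <- 0:size
pmf <- dbinom(x, size = size, prob = prob)
mu  <- sum(x * pmf)
central_moment_6 <- sum((x - mu)^6 * pmf)

print(survival_2)
print(central_moment_6)
```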

S3 compatibility to make the interface more flexible for users who are less familiar with OOP

B <- Binomial$new()
mean(B) # B$mean()
#> [1] 5
variance(B) # B$variance()
#> [1] 2.5
cdf(B, 2:5) # B$cdf(2:5)
#> [1] 0.0546875 0.1718750 0.3769531 0.6230469
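
One way such S3 wrappers can work is for each generic to forward to the method stored on the object. The sketch below uses a hypothetical `ToyDist` class built from a plain list of closures, so it runs without R6 or distr6; it shows the dispatch pattern only and is not the package's actual implementation.

```r
# Hypothetical toy object: a list of closures with a class attribute.
toy_binomial <- function(size = 10, prob = 0.5) {
  structure(
    list(
      mean     = function() size * prob,
      variance = function() size * prob * (1 - prob),
      cdf      = function(q) pbinom(q, size = size, prob = prob)
    ),
    class = "ToyDist"
  )
}

# mean() is already an S3 generic in base R, so only a method is needed.
mean.ToyDist <- function(x, ...) x$mean()

# variance() and cdf() are not base generics, so define the generics first.
variance <- function(x, ...) UseMethod("variance")
variance.ToyDist <- function(x, ...) x$variance()

cdf <- function(x, ...) UseMethod("cdf")
cdf.ToyDist <- function(x, q, ...) x$cdf(q)

B_toy <- toy_binomial()
print(mean(B_toy))      # forwards to B_toy$mean()
print(variance(B_toy))  # forwards to B_toy$variance()
print(cdf(B_toy, 2:5))  # forwards to B_toy$cdf(2:5)
```

The functional call and the object method return the same value, so users can pick whichever style they prefer.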

Wrappers including truncation, huberization and product distributions for manipulation and composition of distributions.

B <- Binomial$new()
TruncatedDistribution$new(B, lower = 2, upper = 5) #Or: truncate(B,2,5)
#> TruncBinom(Binom__prob = 0.5, Binom__qprob = 0.5,...,trunc__lower = 2, trunc__upper = 5)
N <- Normal$new()
MixtureDistribution$new(list(B,N), weights = c(0.1, 0.9))
#> Binom wX Norm
ProductDistribution$new(list(B,N))
#> Binom X Norm
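
To make the effect of these wrappers concrete, the base R sketch below computes a truncated binomial pmf and a mixture cdf by hand. It assumes truncation on the left-open interval (lower, upper]; check the TruncatedDistribution documentation for the convention distr6 actually uses. The helper `mix_cdf` is hypothetical.

```r
size  <- 10
prob  <- 0.5
lower <- 2
upper <- 5

# Truncation (assumed left-open, i.e. support {3, 4, 5}):
# renormalise the original pmf by the probability mass kept.
trunc_support <- (lower + 1):upper
kept_mass <- pbinom(upper, size, prob) - pbinom(lower, size, prob)
trunc_pmf <- dbinom(trunc_support, size, prob) / kept_mass

# Mixture of Binomial(10, 0.5) and a standard Normal with weights 0.1 and 0.9:
# the mixture cdf is the weighted sum of the component cdfs.
mix_cdf <- function(q, weights = c(0.1, 0.9)) {
  weights[1] * pbinom(q, size, prob) + weights[2] * pnorm(q)
}

print(trunc_pmf)
print(sum(trunc_pmf))  # a valid pmf sums to one
print(mix_cdf(0))
```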

Additionally, set6 is used for the symbolic representation of sets for distribution typing

Binomial$new()$traits$type
#> ℕ0
Binomial$new()$properties$support
#> {0, 1,...,9, 10}

Usage

distr6 has three primary use-cases:

  1. Upgrading base: Extending the base R distribution functions to classes so that each distribution additionally has basic statistical methods, including expectation and variance, and properties/traits such as discrete/continuous and univariate/multivariate.
  2. Statistics: Implementing decorators and adaptors to manipulate distributions, including distribution composition. Additionally, functionality for numeric calculations based on any arbitrary distribution.
  3. Modelling: Probabilistic modelling using distr6 objects as the modelling targets. Using objects as targets is an established ML paradigm, and introducing distributions as classes is the first step towards implementing probabilistic modelling.
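
As an illustration of the second use-case, the moments of an arbitrary distribution can be approximated numerically from its pdf alone. The sketch below uses base R's `integrate()` on a hypothetical density f(x) = 2x on [0, 1]; it shows the general idea of numeric calculation only and is not the method distr6 uses.

```r
# Hypothetical custom density on [0, 1]: f(x) = 2x, zero elsewhere.
custom_pdf <- function(x) ifelse(x >= 0 & x <= 1, 2 * x, 0)

# A valid density integrates to one over its support.
total_mass <- integrate(custom_pdf, lower = 0, upper = 1)$value

# Mean: integral of x * f(x).
num_mean <- integrate(function(x) x * custom_pdf(x),
                      lower = 0, upper = 1)$value

# Variance: integral of (x - mean)^2 * f(x).
num_var <- integrate(function(x) (x - num_mean)^2 * custom_pdf(x),
                     lower = 0, upper = 1)$value

print(c(total_mass = total_mass, mean = num_mean, variance = num_var))
```

For this density the exact answers are a mean of 2/3 and a variance of 1/18, which the numeric results approximate closely.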

Installation

For the latest release on CRAN, install with

install.packages("distr6")

Otherwise, for the latest stable build from GitHub (requires the remotes package)

remotes::install_github("alan-turing-institute/distr6")

Future Plans

Our plans for the next update include

  • A generalised qqplot for comparing any distributions
  • A finalised FunctionImputation decorator with different imputation strategies
  • Discrete distribution subtraction (negative convolution)
  • A wrapper for scaling distributions to a given mean and variance
  • More probability distributions
  • Any other good suggestions made between now and then!

Package Development and Contributing

distr6 is released under the MIT licence with acknowledgements to the LGPL-3 licence of distr. Any contributions to distr6 will therefore also be accepted under the MIT licence. We welcome all bug reports, issues, questions and suggestions, which can be raised on the project's GitHub repository (alan-turing-institute/distr6). Please first read through our contributing guidelines for details, including our code of conduct.

Acknowledgements

distr6 is the result of a collaboration between many people, universities and institutions across the world, without whom the speed and performance of the package would not be up to its current standard. Firstly, we acknowledge all the work of Prof. Dr. Peter Ruckdeschel and Prof. Dr. Matthias Kohl in developing the original distr family of packages. Secondly, we acknowledge their significant contributions to the planning and design of distr6, including the distribution and probability family class structures. A team of undergraduates at University College London implemented many of the probability distributions and designed the plotting interface. The team consists of Shen Chen (@ShenSeanChen), Jordan Deenichin (@jdeenichin), Chengyang Gao (@garoc371), Chloe Zhaoyuan Gu (@gzy823), Yunjie He (@RoyaHe), Xiaowen Huang (@w090613), Shuhan Liu (@shliu99), Runlong Yu (@Edwinyrl), Chijing Zeng (@britneyzeng) and Qian Zhou (@yumizhou47). We also want to thank Prof. Dr. Bernd Bischl for discussions about design choices and useful features, particularly advice on the ParameterSet class. Finally, we thank University College London and The Alan Turing Institute for hosting workshops and meetings, and for providing coffee whenever needed.

Version: 1.6.9
License: MIT + file LICENSE
Monthly downloads: 385
Last published: March 27th, 2022

Functions in distr6 (1.6.9)

  • Convolution: Distribution Convolution Wrapper
  • Cauchy: Cauchy Distribution Class
  • Beta: Beta Distribution Class
  • Bernoulli: Bernoulli Distribution Class
  • Arcsine: Arcsine Distribution Class
  • ChiSquared: Chi-Squared Distribution Class
  • ChiSquaredNoncentral: Noncentral Chi-Squared Distribution Class
  • Binomial: Binomial Distribution Class
  • Categorical: Categorical Distribution Class
  • BetaNoncentral: Noncentral Beta Distribution Class
  • DistributionDecorator: Abstract DistributionDecorator Class
  • Empirical: Empirical Distribution Class
  • EmpiricalMV: EmpiricalMV Distribution Class
  • DiscreteUniform: Discrete Uniform Distribution Class
  • Distribution: Generalised Distribution Object
  • DistributionWrapper: Abstract DistributionWrapper Class
  • Dirichlet: Dirichlet Distribution Class
  • Degenerate: Degenerate Distribution Class
  • Cosine: Cosine Kernel
  • CoreStatistics: Core Statistical Methods Decorator
  • Erlang: Erlang Distribution Class
  • Epanechnikov: Epanechnikov Kernel
  • FDistribution: 'F' Distribution Class
  • FDistributionNoncentral: Noncentral F Distribution Class
  • FunctionImputation: Imputed Pdf/Cdf/Quantile/Rand Functions Decorator
  • Frechet: Frechet Distribution Class
  • ExoticStatistics: Exotic Statistical Methods Decorator
  • Exponential: Exponential Distribution Class
  • Geometric: Geometric Distribution Class
  • Gamma: Gamma Distribution Class
  • Laplace: Laplace Distribution Class
  • Logarithmic: Logarithmic Distribution Class
  • LogisticKernel: Logistic Kernel
  • Logistic: Logistic Distribution Class
  • HuberizedDistribution: Distribution Huberization Wrapper
  • Hypergeometric: Hypergeometric Distribution Class
  • Gumbel: Gumbel Distribution Class
  • Gompertz: Gompertz Distribution Class
  • InverseGamma: Inverse Gamma Distribution Class
  • Kernel: Abstract Kernel Class
  • Pareto: Pareto Distribution Class
  • NormalKernel: Normal Kernel
  • Matdist: Matdist Distribution Class
  • MixtureDistribution: Mixture Distribution Wrapper
  • Multinomial: Multinomial Distribution Class
  • MultivariateNormal: Multivariate Normal Distribution Class
  • Loglogistic: Log-Logistic Distribution Class
  • Lognormal: Log-Normal Distribution Class
  • SDistribution: Abstract Special Distribution Class
  • ShiftedLoglogistic: Shifted Log-Logistic Distribution Class
  • Poisson: Poisson Distribution Class
  • StudentTNoncentral: Noncentral Student's T Distribution Class
  • StudentT: Student's T Distribution Class
  • ProductDistribution: Product Distribution Wrapper
  • Sigmoid: Sigmoid Kernel
  • Normal: Normal Distribution Class
  • NegativeBinomial: Negative Binomial Distribution Class
  • Quartic: Quartic Kernel
  • Triweight: Triweight Kernel
  • Tricube: Tricube Kernel
  • Triangular: Triangular Distribution Class
  • TriangularKernel: Triangular Kernel
  • Weibull: Weibull Distribution Class
  • decorate: Decorate Distributions
  • c.Matdist: Combine Matrix Distributions into a Matdist
  • dstr: Helper Functionality for Constructing Distributions
  • Wald: Wald Distribution Class
  • c.Distribution: Combine Distributions into a VectorDistribution
  • as.VectorDistribution: Coercion to Vector Distribution
  • distr6News: Show distr6 NEWS.md File
  • UniformKernel: Uniform Kernel
  • Silverman: Silverman Kernel
  • lines.Distribution: Superimpose Distribution Functions Plots for a distr6 Object
  • testParameterSet: assert/check/test/ParameterSet
  • [.VectorDistribution: Extract one or more Distributions from a VectorDistribution
  • distrSimulate: Simulate from a Distribution
  • testContinuous: assert/check/test/Continuous
  • testParameterSetList: assert/check/test/ParameterSetList
  • truncate: Truncate a Distribution
  • gprm: Helper Functionality for Getting and Setting Distribution Parameters
  • as.ProductDistribution: Coercion to Product Distribution
  • as.MixtureDistribution: Coercion to Mixture Distribution
  • generalPNorm: Generalised P-Norm
  • exkurtosisType: Kurtosis Type
  • TruncatedDistribution: Distribution Truncation Wrapper
  • Rayleigh: Rayleigh Distribution Class
  • Uniform: Uniform Distribution Class
  • plot.Matdist: Plotting Distribution Functions for a Matrix Distribution
  • plot.Distribution: Plot Distribution Functions for a distr6 Object
  • qqplot: Quantile-Quantile Plots for distr6 Objects
  • [.Matdist: Extract one or more Distributions from a Matdist
  • plot.VectorDistribution: Plotting Distribution Functions for a VectorDistribution
  • skewType: Skewness Type
  • testMixture: assert/check/test/Mixture
  • testMultivariate: assert/check/test/Multivariate
  • listDecorators: Lists Implemented Distribution Decorators
  • testNegativeSkew: assert/check/test/NegativeSkew
  • listWrappers: Lists Implemented Distribution Wrappers
  • makeUniqueDistributions: De-Duplicate Distribution Names
  • testNoSkew: assert/check/test/NoSkew
  • VectorDistribution: Vectorise Distributions
  • distr6-deprecated: Deprecated distr6 Functions and Classes
  • simulateEmpiricalDistribution: Sample Empirical Distribution Without Replacement
  • testMesokurtic: assert/check/test/Mesokurtic
  • rep.Distribution: Replicate Distribution into Vector, Mixture, or Product
  • testMatrixvariate: assert/check/test/Matrixvariate
  • as.Distribution: Coerce matrix to vector of WeightedDiscrete or Matrix Distribution
  • WeightedDiscrete: WeightedDiscrete Distribution Class
  • huberize: Huberize a Distribution
  • distr6-package: distr6: Object Oriented Distributions in R
  • length.VectorDistribution: Get Number of Distributions in Vector Distribution
  • testSymmetric: assert/check/test/Symmetric
  • mixturiseVector: Create Mixture Distribution From Multiple Vectors
  • mixMatrix: Mix Matrix Distributions into a new Matdist
  • testDistributionList: assert/check/test/DistributionList
  • listDistributions: Lists Implemented Distributions
  • testDiscrete: assert/check/test/Discrete
  • listKernels: Lists Implemented Kernels
  • testDistribution: assert/check/test/Distribution
  • testPlatykurtic: assert/check/test/Platykurtic
  • testLeptokurtic: assert/check/test/Leptokurtic
  • testPositiveSkew: assert/check/test/PositiveSkew
  • testUnivariate: assert/check/test/Univariate