Categorical: Categorical Distribution Class

Description

Mathematical and statistical functions for the Categorical distribution, which is commonly used in supervised classification.

Arguments

Value

Returns an R6 object inheriting from class SDistribution.

Distribution support

The distribution is supported on $x_1,...,x_k$.

Default Parameterisation

Cat(elements = 1, probs = 1)

Omitted Methods

N/A

Also known as

N/A

Super classes

distr6::Distribution -> distr6::SDistribution -> Categorical

Public fields

name: Full name of distribution.
short_name: Short name of distribution for printing.
description: Brief description of the distribution.

Active bindings

properties: Returns distribution properties, including skewness type and symmetry.

Methods

Public methods

Method `new()`

Creates a new instance of this R6 class.

Usage

Categorical$new(elements = NULL, probs = NULL, decorators = NULL)

Arguments

elements: (list()) Categories in the distribution, see examples.

probs: (numeric()) Probabilities of respective categories occurring.

decorators: (character()) Decorators to add to the distribution during construction.

Examples

# Note probabilities are automatically normalised (if not vectorised)
x <- Categorical$new(elements = list("Bapple", "Banana", 2), probs = c(0.2, 0.4, 1))

# Length of elements and probabilities cannot be changed after construction
x$setParameterValue(probs = c(0.1, 0.2, 0.7))

# d/p/q/r
x$pdf(c("Bapple", "Carrot", 1, 2))
x$cdf("Banana") # Assumes ordered in construction
x$quantile(0.42) # Assumes ordered in construction
x$rand(10)

# Statistics
x$mode()

summary(x)

Method `mean()`

The arithmetic mean of a (discrete) probability distribution X is the expectation $$E_X(X) = \sum p_X(x)*x$$ with an integration analogue for continuous distributions.

Usage

Categorical$mean(...)

Arguments

...: Unused.

Method `mode()`

The mode of a probability distribution is the point at which the pdf is a local maximum. A distribution can be unimodal (one maximum) or multimodal (several maxima).

Usage

Categorical$mode(which = "all")

Arguments

which: (character(1) | numeric(1)) Ignored if the distribution is unimodal. Otherwise "all" returns all modes, and a number specifies which mode to return.
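As a rough illustration of what "all modes" versus a single mode means for a categorical pmf, the base R sketch below finds every category whose probability equals the maximum. The vectors `elements` and `probs` are made-up values, and this is not the package's own implementation.

```r
# Hypothetical pmf with two equally likely modes (values are assumptions).
elements <- c("a", "b", "c", "d")
probs <- c(0.1, 0.4, 0.4, 0.1)

# All modes: every element whose probability equals the maximum.
modes <- elements[probs == max(probs)]

# A single mode, selected by its position among the modes.
second_mode <- modes[2]

print(modes)
print(second_mode)
```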

Method `variance()`

The variance of a distribution is defined by the formula $$var_X = E_X[X^2] - E_X[X]^2$$ where $E_X$ is the expectation of distribution X. If the distribution is multivariate the covariance matrix is returned.

Usage

Categorical$variance(...)

Arguments

...: Unused.
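The mean and variance formulas can be checked by hand for a small discrete distribution. The sketch below assumes numeric support points `x` with probabilities `p` (both made up for illustration); it does not show what `Categorical$variance()` returns for non-numeric elements.

```r
# Hypothetical numeric support and probabilities (assumptions for illustration).
x <- c(1, 2, 3)
p <- c(0.2, 0.3, 0.5)

# Mean: E[X] = sum of p(x) * x.
mu <- sum(p * x)

# Variance: E[X^2] - E[X]^2.
var_x <- sum(p * x^2) - mu^2

print(c(mean = mu, variance = var_x))
```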

Method `skewness()`

The skewness of a distribution is defined by the third standardised moment, $$sk_X = E_X\left[\left(\frac{x - \mu}{\sigma}\right)^3\right]$$ where $E_X$ is the expectation of distribution X, $\mu$ is the mean of the distribution and $\sigma$ is the standard deviation of the distribution.

Usage

Categorical$skewness(...)

Arguments

...: Unused.

Method `kurtosis()`

The kurtosis of a distribution is defined by the fourth standardised moment, $$k_X = E_X\left[\left(\frac{x - \mu}{\sigma}\right)^4\right]$$ where $E_X$ is the expectation of distribution X, $\mu$ is the mean of the distribution and $\sigma$ is the standard deviation of the distribution. Excess kurtosis is kurtosis - 3.

Usage

Categorical$kurtosis(excess = TRUE, ...)

Arguments

excess: (logical(1)) If TRUE (default) excess kurtosis is returned.

...: Unused.
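The standardised-moment definitions of skewness and kurtosis can be worked through numerically. This is a minimal base R sketch, assuming numeric support points `x` and probabilities `p` chosen only for illustration; it is not the package's implementation.

```r
# Hypothetical numeric support and probabilities (assumptions for illustration).
x <- c(1, 2, 3)
p <- c(0.2, 0.3, 0.5)

mu <- sum(p * x)                      # mean
sigma <- sqrt(sum(p * (x - mu)^2))    # standard deviation
z <- (x - mu) / sigma                 # standardised support points

skew <- sum(p * z^3)                  # third standardised moment
kurt <- sum(p * z^4)                  # fourth standardised moment
excess_kurt <- kurt - 3               # excess kurtosis

print(c(skewness = skew, kurtosis = kurt, excess = excess_kurt))
```

The distribution puts most of its mass on the largest point, so the skewness comes out negative.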

Method `entropy()`

The entropy of a (discrete) distribution is defined by $$- \sum f_X(x) \log(f_X(x))$$ where $f_X$ is the pdf of distribution X, with an integration analogue for continuous distributions.

Usage

Categorical$entropy(base = 2, ...)

Arguments

base: (integer(1)) Base of the entropy logarithm, default = 2 (Shannon entropy).

...: Unused.
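For a categorical pmf the entropy sum runs over the category probabilities. The base R sketch below uses a helper named `entropy_of`, which is a hypothetical name introduced here, and normalises the unnormalised weights from the constructor example first. It illustrates the formula rather than the package's internal code.

```r
# Hypothetical helper: entropy of a probability vector in a given log base.
entropy_of <- function(p, base = 2) {
  p <- p[p > 0]                      # categories with zero probability contribute nothing
  -sum(p * log(p, base = base))
}

# Weights from the constructor example, normalised to sum to one.
weights <- c(0.2, 0.4, 1)
probs <- weights / sum(weights)

h_bits <- entropy_of(probs)               # base 2
h_nats <- entropy_of(probs, base = exp(1)) # natural log

print(c(bits = h_bits, nats = h_nats))
```

A uniform distribution over four categories gives exactly 2 bits with this helper, which is a quick sanity check on the base argument.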

Method `mgf()`

The moment generating function is defined by $$mgf_X(t) = E_X[exp(xt)]$$ where X is the distribution and $E_X$ is the expectation of the distribution X.

Usage

Categorical$mgf(t, ...)

Arguments

t: (integer(1)) Integer at which to evaluate the function.

...: Unused.

Method `cf()`

The characteristic function is defined by $$cf_X(t) = E_X[exp(xti)]$$ where X is the distribution and $E_X$ is the expectation of the distribution X.

Usage

Categorical$cf(t, ...)

Arguments

t: (integer(1)) Integer at which to evaluate the function.

...: Unused.

Method `pgf()`

The probability generating function is defined by $$pgf_X(z) = E_X[z^x]$$ where X is the distribution and $E_X$ is the expectation of the distribution X.

Usage

Categorical$pgf(z, ...)

Arguments

z: (integer(1)) Integer at which to evaluate the probability generating function.

...: Unused.
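For a discrete distribution the three generating functions reduce to finite sums over the support. The sketch below assumes a made-up numeric support `x` with probabilities `p`, and uses the standard definitions $E_X[exp(xt)]$, $E_X[exp(xti)]$ and $E_X[z^x]$. The function names `mgf_at`, `cf_at` and `pgf_at` are hypothetical and not part of the package.

```r
# Hypothetical non-negative integer support and probabilities (assumptions).
x <- c(0, 1, 2)
p <- c(0.5, 0.3, 0.2)

# Moment generating function: E[exp(t * X)].
mgf_at <- function(t) sum(p * exp(t * x))

# Characteristic function: E[exp(i * t * X)], complex-valued.
cf_at <- function(t) sum(p * exp(1i * t * x))

# Probability generating function: E[z^X].
pgf_at <- function(z) sum(p * z^x)

print(mgf_at(0.5))
print(cf_at(0.5))
print(pgf_at(0.5))
```

Each function equals 1 at its natural origin (`t = 0` for the mgf and cf, `z = 1` for the pgf), and `pgf_at(0)` recovers the probability of the point 0.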

Method `clone()`

The objects of this class are cloneable with this method.

Usage

Categorical$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Details

The Categorical distribution, parameterised with a support set $x_1,...,x_k$ and respective probabilities $p_1,...,p_k$, is defined by the pmf $$f(x_i) = p_i$$ for $i = 1,\ldots,k$, where $\sum p_i = 1$.

Sampling from this distribution is performed with the sample function, with the elements given as the support set and the probabilities taken from the probs parameter. The cdf and quantile assume that the elements are supplied in an indexed order (otherwise the results are meaningless).

The number of points in the distribution cannot be changed after construction.
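The behaviour described above can be imitated in base R to make the ordering assumption concrete. This is only a sketch under stated assumptions: the helper names `pmf_at`, `cdf_at` and `quantile_at` are hypothetical, the elements are stored as character strings for simplicity, and the package's own implementation may differ in detail.

```r
set.seed(1)  # only so the random draws below are reproducible

# Elements in construction order, with unnormalised weights (illustrative values).
elements <- c("Bapple", "Banana", "2")
weights <- c(0.2, 0.4, 1)
probs <- weights / sum(weights)      # normalise so the probabilities sum to one

# Cumulative probabilities follow the construction order of the elements.
cum <- cumsum(probs)
cum[length(cum)] <- 1                # guard against floating point rounding

# pmf: probability of a listed element, zero for anything outside the support.
pmf_at <- function(v) {
  idx <- match(v, elements)
  ifelse(is.na(idx), 0, probs[idx])
}

# cdf: cumulative probability up to and including the element.
cdf_at <- function(v) cum[match(v, elements)]

# quantile: first element whose cumulative probability reaches q.
quantile_at <- function(q) elements[which(cum >= q)[1]]

# Sampling with base R's sample(), using the elements and probabilities.
draws <- sample(elements, size = 10, replace = TRUE, prob = probs)

print(pmf_at(c("Bapple", "Carrot")))
print(cdf_at("Banana"))
print(quantile_at(0.42))
print(draws)
```

Reordering `elements` changes `cdf_at` and `quantile_at` but not `pmf_at` or the sampling distribution, which is why the cdf and quantile are only meaningful for an intended ordering.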

References

McLaughlin, M. P. (2001). A compendium of common probability distributions (pp. 2014-01). Michael P. McLaughlin.

Examples

## ------------------------------------------------
## Method `Categorical$new`
## ------------------------------------------------

# Note probabilities are automatically normalised (if not vectorised)
x <- Categorical$new(elements = list("Bapple", "Banana", 2), probs = c(0.2, 0.4, 1))

# Length of elements and probabilities cannot be changed after construction
x$setParameterValue(probs = c(0.1, 0.2, 0.7))

# d/p/q/r
x$pdf(c("Bapple", "Carrot", 1, 2))
x$cdf("Banana") # Assumes ordered in construction
x$quantile(0.42) # Assumes ordered in construction
x$rand(10)

# Statistics
x$mode()

summary(x)