p_m1: Non-principal branch probability

Description

Computes the probability that (at least) one (out of n) observation(s) of the latent variable $U$ lies in the non-principal branch region. The 'm1' in p_m1 stands for 'minus 1', i.e, the non-principal branch.

See Goerg (2011) and Details for mathematical derivations.

Usage

p_m1(gamma, beta, distname, n = 1, use.mean.variance = TRUE)

Value

non-negative float; the probability $p_{-1}$ for n observations.

Arguments

gamma: scalar; skewness parameter.
beta: numeric vector (deprecated); parameter $\boldsymbol \beta$ of the input distribution. See check_beta on how to specify beta for each distribution.
distname: character; name of input distribution; see get_distnames.
n: number of RVs/observations.
use.mean.variance: logical; if TRUE it uses mean and variance implied by $\boldsymbol \beta$ to do the transformation (Goerg 2011). If FALSE, it uses the alternative definition from Goerg (2016) with location and scale parameter.

Details

The probability that one observation of the latent RV U lies in the non-principal region equals at most $$ p_{-1}(\gamma, n=1) = P\left(U < -\frac{1}{|\gamma|}\right), $$ where $U$ is the zero-mean, unit variance version of the input $X \sim F_X(x \mid \boldsymbol \beta)$ -- see References.

For $N$ independent RVs $U_1, \ldots, U_N$, the probability that at least one data point came from the non-principal region equals $$ p_{-1}(\gamma, n=N) = P\left(U_i < -\frac{1}{|\gamma|} \; for \; at \; least \; one \; i \right) $$

This equals (assuming independence) $$ P\left(U_i < -\frac{1}{|\gamma|} \; for \; at \; least \; one \; i \right) = 1 - P\left(U_i \geq -\frac{1}{|\gamma|}, \forall i \right) = 1 - \prod_{i=1}^{N} P\left(U_i \geq -\frac{1}{|\gamma|} \right) $$ $$ = 1 - \prod_{i=1}^{N} \left(1 - p_{-1}(\gamma, n=1) \right) = 1 - (1-p_{-1}(\gamma, n=1))^N. $$

For improved numerical stability the cdf of a geometric RV (pgeom) is used to evaluate the last expression. Nevertheless, numerical problems can occur for $|\gamma| < 0.03$ (returns 0 due to rounding errors).

Note that $1 - (1-p_{-1}(\gamma, n=1))^N$ reduces to $p_{-1}(\gamma)$ for $N=1$.

Examples

Run this code


beta.01 <- c(mu = 0, sigma = 1)
# for n=1 observation
p_m1(0, beta = beta.01, distname = "normal") # identical to 0
# in theory != 0; but machine precision too low
p_m1(0.01, beta = beta.01, distname = "normal") 
p_m1(0.05, beta = beta.01, distname = "normal") # extremely small
p_m1(0.1, beta = beta.01, distname = "normal") # != 0, but very small
# 1 out of 4 samples is a non-principal input;
p_m1(1.5, beta = beta.01, distname = "normal") 
# however, gamma=1.5 is not common in practice

# for n=100 observations
p_m1(0, n=100, beta = beta.01, distname = "normal") # == 0
p_m1(0.1, n=100, beta = beta.01, distname = "normal") # still small
p_m1(0.3, n=100, beta = beta.01, distname = "normal") # a bit more likely
p_m1(1.5, n=100, beta = beta.01, distname = "normal") 
# Here we can be almost 100% sure (rounding errors) that at least one
# y_i was caused by an input in the non-principal branch.

Run the code above in your browser using DataLab