pnt: Non-central t Probability Distribution - Algorithms and Approximations

Description

Compute different approximations for the non-central t-Distribution cumulative probability distribution function.

Usage



pntR      (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           use.pnorm = (df > 4e5 ||
                        ncp^2 > 2*log(2)*1021), # .Machine$double.min.exp = -1022
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)
pntR1     (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           use.pnorm = (df > 4e5 ||
                        ncp^2 > 2*log(2)*1021),
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)
pntP94    (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)
pntP94.1  (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
                                          itrmax = 1000, errmax = 1e-12, verbose = TRUE)
pnt3150   (t, df, ncp, lower.tail = TRUE, log.p = FALSE, M = 1000, verbose = TRUE)
pnt3150.1 (t, df, ncp, lower.tail = TRUE, log.p = FALSE, M = 1000, verbose = TRUE)
pntLrg    (t, df, ncp, lower.tail = TRUE, log.p = FALSE)
pntJW39   (t, df, ncp, lower.tail = TRUE, log.p = FALSE)
pntJW39.0 (t, df, ncp, lower.tail = TRUE, log.p = FALSE)




pntVW13 (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           keepS = FALSE, verbose = FALSE)
pntGST23_T6  (t, df, ncp, lower.tail = TRUE, log.p = FALSE,
              y1.tol = 1e-8, Mterms = 20, alt = FALSE, verbose = TRUE)
pntGST23_T6.1(t, df, ncp, lower.tail = TRUE, log.p = FALSE,
              y1.tol = 1e-8, Mterms = 20, alt = FALSE, verbose = TRUE)
## *Non*-asymptotic, (at least partly much) better version of R's Lenth(1998) algorithm
pntGST23_1(t, df, ncp, lower.tail = TRUE, log.p = FALSE,
           j0max = 1e4, # for now
           IxpqFUN = Ixpq,
           alt = FALSE, verbose = TRUE, ...)

Value

a number for pntJKBf1() and .pntJKBch1().

a numeric vector of the same length as the maximum of the lengths of

x, df, ncp for pntJKBf() and .pntJKBch().

Arguments

t

vector of quantiles (called q in pt(..)).

df

degrees of freedom (\(> 0\), maybe non-integer). df = Inf is allowed.

ncp

non-centrality parameter \(\delta \ge 0\); If omitted, use the central t distribution.

log, log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities are \(P[X \le x]\), otherwise, \(P[X > x]\).

use.pnorm

logical indicating if the pnorm() approximation of Abramowitz and Stegun (26.7.10) should be used, which is available as pntLrg().

The default corresponds to R pt()'s own behaviour (which is suboptimal).

itrmax

number of iterations / terms.

errmax: convergence bound for the iterations.

verbose: logical or integer determining the amount of diagnostic print out to the console.
M: positive integer specifying the number of terms to use in the series.
keepS: logical indicating if the function should return a list with component cdf and other informational elements, or just the CDF values directly (by default).
y1.tol: positive tolerance for warning if \(y:= t^2/(t^2 + df)\) is too close to 1 (as the formulas use \(1/(1-y)\)).
Mterms: number of summation terms for pntGST23_T6().

j0max: experimental: large integer limiting the summation terms in pntGST23_1() .
IxpqFUN: the (scaled) incomplete beta function \(I_x(p,q)\) to be used; currently, it defaults to the Ixpq function derived from Nico Temme's Maple code for “Table 1” in Gil et al. (2023).
alt: logical specifying if and how log-scale should be used. Experimental and not-yet-tested.
...: further arguments passed to IxpqFUN().

Author

Martin Maechler

Details

pntR1():

a pure R version of the (C level) code of R's own pt(), additionally giving more flexibility (via arguments use.pnorm, itrmax, errmax whose defaults here have been hard-coded in R's C code called by pt()).

This implements an improved version of the AS 243 algorithm from Lenth(1989);

R's help on non-central pt() says:: This computes the lower tail only, so the upper tail suffers from cancellation and a warning will be given when this is likely to be significant.
and (in ‘Note:’): The code for non-zero ncp is principally intended to be used for moderate values of ncp: it will not be highly accurate, especially in the tails, for large values.

pntR():

the Vectorize()d version of pntR1().

pntP94(), pntP94.1():

New versions of pntR1(), pntR(); using the Posten (1994) algorithm. pntP94() is the Vectorize()d version of pntP94.1().

pnt3150(), pnt3150.1():

Simple inefficient but hopefully correct version of pntP94..() This is really a direct implementation of formula (31.50), p.532 of Johnson, Kotz and Balakrishnan (1995)

pntLrg():

provides the pnorm() approximation (to the non-central \(t\)) from Abramowitz and Stegun (26.7.10), p.949; which should be employed only for large df and/or ncp.

pntJW39.0():

use the Jennett & Welch (1939) approximation see Johnson et al. (1995), p. 520, after (31.26a). This is still fast for huge ncp but has wrong asymptotic tail for \(|t| \to \infty\). Crucially needs \(b=\)b_chi(df).

pntJW39():

is an improved version of pntJW39.0(), using \(1-b =\)b_chi(df, one.minus=TRUE) to avoid cancellation when computing \(1 - b^2\).

%% \item{\code{pntChShP94()}:}{ .. } %% \item{\code{pntChShP94.1()}:}{ .. }

pntGST23_T6():

(and pntGST23_T6.1() for informational purposes only) use the Gil et al.(2023)'s approximation of their Theorem 6.

pntGST23_1():

implements Gil et al.(2023)'s direct pbeta() based formula (1), which is very close to Lenth's algorithm.

pntVW13():

use MM's R translation of Viktor Witkowský (2013)'s matlab implementation.

References

Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distributions Vol~2, 2nd ed.; Wiley; chapter 31, Section 5 Distribution Function, p.514 ff

Lenth, R. V. (1989). Algorithm AS 243 --- Cumulative distribution function of the non-central \(t\) distribution, JRSS C (Applied Statistics) 38, 185--189.

Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover; formula (26.7.10), p.949

Posten, Harry O. (1994) A new algorithm for the noncentral t distribution function, Journal of Statistical Computation and Simulation 51, 79--87; tools:::Rd_expr_doi("10.1080/00949659408811623").

-- not yet implemented --
Chattamvelli, R. and Shanmugam, R. (1994) An enhanced algorithm for noncentral t-distribution, Journal of Statistical Computation and Simulation 49, 77--83. tools:::Rd_expr_doi("10.1080/00949659408811561")

-- not yet implemented --
Akahira, Masafumi. (1995). A higher order approximation to a percentage point of the noncentral t distribution, Communications in Statistics - Simulation and Computation 24:3, 595--605; tools:::Rd_expr_doi("10.1080/03610919508813261")

Michael Perakis and Evdokia Xekalaki (2003) On a Comparison of the Efficacy of Various Approximations of the Critical Values for Tests on the Process Capability Indices CPL, CPU, and Cpk, Communications in Statistics - Simulation and Computation 32, 1249--1264; tools:::Rd_expr_doi("10.1081/SAC-120023888")

Witkovský, Viktor (2013) A Note on Computing Extreme Tail Probabilities of the Noncentral T Distribution with Large Noncentrality Parameter, Acta Universitatis Palackianae Olomucensis, Facultas Rerum Naturalium, Mathematica 52(2), 131--143.

Gil A., Segura J., and Temme N.M. (2023) New asymptotic representations of the noncentral t-distribution, Stud Appl Math. 151, 857--882; tools:::Rd_expr_doi("10.1111/sapm.12609") ; acronym “GST23”.

Examples

Run this code

tt <- seq(0, 10, len = 21)
ncp <- seq(0, 6, len = 31)
pt3R   <- outer(tt, ncp, pt, , df = 3)
pt3JKB <- outer(tt, ncp, pntR, df = 3)# currently verbose
stopifnot(all.equal(pt3R, pt3JKB, tolerance = 4e-15))# 64-bit Lnx: 2.78e-16


## Gil et al.(2023) -- Table 1 p.869
str(GST23_tab1 <- read.table(header=TRUE, text = "
 x     pnt_x_delta              Rel.accuracy   l_y   j_max
 5     0.7890745035061528e-20    0.20e-13    0.29178   254
 8     0.1902963697413609e-07    0.40e-12    0.13863   294
11     0.4649258368179092e-03    0.12e-09    0.07845   310
14     0.2912746016055676e-01    0.11e-07    0.04993   317
17     0.1858422833307925e-00    0.41e-06    0.03441   321
20     0.4434882973203470e-00    0.82e-05    0.02510   323"))

x1 <- c(5,8,11,14,17,20)
(p1  <- pt  (x1, df=10.3, ncp=20))
(p1R <- pntR(x1, df=10.3, ncp=20)) # verbose=TRUE  is default
all.equal(p1, p1R, tolerance=0) # 4.355452e-15 {on x86_64} as have *no* LDOUBLE on R level
stopifnot(all.equal(p1, p1R))
## NB: According to Gil et al., the first value (x=5) is really wrong
## p1.23 <- .. Gil et al., Table 1:
p1.23.11 <- pntGST23_T6(x1, df=10.3, ncp=20, Mterms = 11)
p1.23.20 <- pntGST23_T6(x1, df=10.3, ncp=20, Mterms = 20, verbose=TRUE)
                                        # ==> Mterms = 11 is good only for x=5
p1.23.50 <- pntGST23_T6(x1, df=10.3, ncp=20, Mterms = 50, verbose=TRUE)

x <- 4:40 ; df <- 10.3
ncp <- 20
p1     <- pt        (x, df=df, ncp=ncp)
pG1    <- pntGST23_1(x, df=df, ncp=ncp)
pG1.bR <- pntGST23_1(x, df=df, ncp=ncp,
                     IxpqFUN = \(x, l_x=.5-x+.5, p, q) Ixpq(x,l_x, p,q))
pG1.BR <- pntGST23_1(x, df=df, ncp=ncp,
                     IxpqFUN = \(x, l_x, p, q)   pbeta(x, p,q))
cbind(x, p1, pG1, pG1.bR, pG1.BR)
all.equal(pG1, p1,     tolerance=0) # 1.034 e-12
all.equal(pG1, pG1.bR, tolerance=0) # 2.497031 e-13
all.equal(pG1, pG1.BR, tolerance=0) # 2.924698 e-13
all.equal(pG1.BR,pG1.bR,tolerance=0)# 1.68644  e-13
stopifnot(exprs = {
    all.equal(pG1, p1,     tolerance = 4e-12)
    all.equal(pG1, pG1.bR, tolerance = 1e-12)
    all.equal(pG1, pG1.BR, tolerance = 1e-12)
  })

ncp <- 40 ## is  > 37.62 = "critical" for Lenth' algorithm

### --------- pntVW13() --------------------------------------------------
## length 1 arguments:
str(rr <- pntVW13(t = 1, df = 2, ncp = 3, verbose=TRUE, keepS=TRUE))
all.equal(rr$cdf, pt(1,2,3), tol = 0)#  "Mean relative difference: 4.956769e-12"
stopifnot( all.equal(rr$cdf, pt(1,2,3)) )

str(rr <- pntVW13(t = 1:19, df = 2,    ncp = 3,    verbose=TRUE, keepS=TRUE))
str(r2 <- pntVW13(t = 1,    df = 2:20, ncp = 3,    verbose=TRUE, keepS=TRUE))
str(r3 <- pntVW13(t = 1,    df = 2:20, ncp = 3:21, verbose=TRUE, keepS=TRUE))

pt1.10.5_T <- 4.34725285650591657e-5 # Ex. 7 of Witkovsky(2013)
pt1.10.5 <- pntVW13(1, 10, 5)
all.equal(pt1.10.5_T, pt1.10.5, tol = 0)# TRUE! (Lnx Fedora 40; 2024-07-04);
			# 3.117e-16 (Macbuilder R 4.4.0, macOS Ventura 13.3.1)
stopifnot(exprs = {
    identical(rr$cdf, r1 <- pntVW13(t = 1:19, df = 2, ncp = 3))
    identical(r1[1], pntVW13(1, 2, 3))
    identical(r1[7], pntVW13(7, 2, 3))
    all.equal(pt1.10.5_T, pt1.10.5, tol = 9e-16)# NB even tol=0 (64 Lnx)
})
## However, R' pt() is only equal for the very first
cbind(t = 1:19, pntVW = r1, pt = pt(1:19, 2,3))

Run the code above in your browser using DataLab