dSnbinom: Distribution of the sum independent negative binomial random variables.

Description

Distribution of the sum of random variable with negative binomial distributions.
Technically the sum of random variable with negative binomial distributions is a convolution of negative binomial random variables.
dSnbinom returns the density for the sum of random variable with negative binomial distributions.
pSnbinom returns the distribution function for the sum of random variable with negative binomial distributions.
qSnbinom returns the quantile function for the sum of random variable with negative binomial distributions.
rSnbinom returns random numbers for the sum of random variable with negative binomial distributions.
If all prob values are the same, exact probabilities are estimated.
Estimate using Vellaisamy&Upadhye method uses parallel computing depending on value of parallel. The number of cores in usage can be defined using options(mc.cores = c) with c being the number of cores to be used. By default it will use all the available cores. Forking will be used in Unix system and no forking on Windows systems.
When Furman method is in use, it will return the progress of Pr(S = x) during recursion in an attribute if verbose is TRUE (see examples).

Usage

dSnbinom(
  x = stop("You must provide at least one x value"),
  size = NULL,
  prob = NULL,
  mu = NULL,
  log = FALSE,
  tol = NULL,
  method = "Furman",
  normalize = TRUE,
  max.iter = NULL,
  mean = NULL,
  sd = NULL,
  n.random = 1e+06,
  parallel = FALSE,
  verbose = FALSE
)
pSnbinom(
  q = stop("At least one quantile must be provided"),
  size = NULL,
  prob = NULL,
  mu = NULL,
  lower.tail = TRUE,
  log.p = FALSE,
  tol = NULL,
  method = "Furman",
  normalize = TRUE
)
qSnbinom(
  p = stop("At least one probability must be provided"),
  size = stop("size parameter is mandatory"),
  prob = NULL,
  mu = NULL,
  lower.tail = TRUE,
  log.p = FALSE,
  tol = NULL,
  method = "Furman"
)
rSnbinom(n = 1, size = NULL, prob = NULL, mu = NULL)

Value

dSnbinom gives the density, pSnbinom gives the distribution function, qSnbinom gives the quantile function, and rSnbinom generates random deviates.

Arguments

x: vector of (non-negative integer) quantiles.
size: target for number of successful trials, or dispersion parameter (the shape parameter of the gamma mixing distribution). Must be strictly positive, need not be integer.
prob: probability of success in each trial. 0 < prob <= 1.
mu: alternative parametrization via mean.
log, log.p: logical; if TRUE, probabilities p are given as log(p).
tol: Tolerance for recurrence for Furman (2007) method. If NULL, will use a saddlepoint estimation.
method: Can be Furman (default), Vellaisamy&Upadhye or exact, approximate.normal, approximate.negativebinomial, approximate.RandomObservations, or saddlepoint.
normalize: If TRUE (default) will normalize the saddlepoint approximation estimate.
max.iter: Number of maximum iterations for Furman method. Can be NULL.
mean: Mean of the distribution for approximate.normal method. If NULL, the theoretical mean will be used.
sd: Standard deviation of the distribution for approximate.normal method. If NULL, the theoretical sd will be used.
n.random: Number of random numbers used to estimate parameters of distribution for approximate.RandomObservations method.
parallel: logical; if FALSE (default), parallel computing is not used for Vellaisamy&Upadhye methods.
verbose: Give more information on the method.
q: vector of quantiles.
lower.tail: logical; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x].
p: vector of probabilities.
n: number of observations.

Functions

dSnbinom(): Density for the sum of random variable with negative binomial distributions.
pSnbinom(): Distribution function for the sum of random variable with negative binomial distributions.
qSnbinom(): Quantile function for the sum of random variable with negative binomial distributions.
rSnbinom(): Random numbers for the sum of random variable with negative binomial distributions.

Author

Marc Girondot marc.girondot@gmail.com and Jon Barry jon.barry@cefas.gov.uk

Details

Distribution of the Sum of Independent Negative Binomial Random Variables.

References

Furman, E., 2007. On the convolution of the negative binomial random variables. Statistics & Probability Letters 77, 169-172.

Vellaisamy, P. & Upadhye, N.S. 2009. On the sums of compound negative binomial and gamma random variables. Journal of Applied Probability, 46, 272-283.

Girondot M, Barry J. 2023. Computation of the distribution of the sum of independent negative binomial random variables. Mathematical and Computational Applications 2023, 28, 63, doi:10.3390/mca28030063

Examples

Run this code

if (FALSE) {
library(HelpersMG)
alpha <- c(1, 2, 5, 1, 2)
p <- c(0.1, 0.12, 0.13, 0.14, 0.14)
# By default, the Furman method with tol=1E-40 is used
dSnbinom(20, size=alpha, prob=p)
# Note the attribute is the dynamics of convergence of Pr(X=x)
attributes(dSnbinom(20, size=alpha, prob=p, verbose=TRUE))$Pk[, 1]

mutest <- c(0.01, 0.02, 0.03)
sizetest <- 2
x <- 20
# Exact probability
dSnbinom(x, size=sizetest, mu=mutest, method="vellaisamy&upadhye")
dSnbinom(x, size=sizetest, mu=mutest, method="vellaisamy&upadhye", log=TRUE)
# With Furman method and tol=1E-12, when probability 
# is very low, it will be biased
dSnbinom(x, size=sizetest, mu=mutest, method="Furman", tol=1E-12)
# The solution is to use a tolerance lower than the estimate
dSnbinom(x, size=sizetest, mu=mutest, method="Furman", tol=1E-45)
# Here the estimate used a first estimation by saddlepoint approximation
dSnbinom(x, size=sizetest, mu=mutest, method="Furman", tol=NULL)
# Or a huge number of iterations; but it is not the best solution
dSnbinom(x, size=sizetest, mu=mutest, method="Furman", 
         tol=1E-12, max.iter=10000)
# With the saddle point approximation method
dSnbinom(x, size=sizetest, mu=mutest, method="saddlepoint", log=FALSE)
dSnbinom(x, size=sizetest, mu=mutest, method="saddlepoint", log=TRUE)

# Another example
sizetest <- c(1, 1, 0.1)
mutest <- c(2, 1, 10)
x <- 5
(exact <- dSnbinom(x=x, size=sizetest, mu=mutest, method="Vellaisamy&Upadhye"))
(sp <- dSnbinom(x=x, size=sizetest, mu=mutest, method="saddlepoint"))
paste0("Saddlepoint approximation: Error of ", specify_decimal(100*abs(sp-exact)/exact, 2), "%")
(furman <- dSnbinom(x=x, size=sizetest, mu=mutest, method="Furman"))
paste0("Inversion of mgf: Error of ", specify_decimal(100*abs(furman-exact)/exact, 2), "%")
(na <- dSnbinom(x=x, size=sizetest, mu=mutest, method="approximate.normal")) 
paste0("Gaussian approximation: Error of ", specify_decimal(100*abs(na-exact)/exact, 2), "%")
(nb <- dSnbinom(x=x, size=sizetest, mu=mutest, method="approximate.negativebinomial"))
paste0("NB approximation: Error of ", specify_decimal(100*abs(nb-exact)/exact, 2), "%")

plot(0:20, dSnbinom(0:20, size=sizetest, mu=mutest, method="furman"), bty="n", type="h", 
     xlab="x", ylab="Density", ylim=c(0, 0.2), las=1)
points(x=0:20, y=dSnbinom(0:20, size=sizetest, mu=mutest, 
                           method="saddlepoint"), pch=1, col="blue")
points(x=0:20, y=dSnbinom(0:20, size=sizetest, mu=mutest, 
                           method="approximate.negativebinomial"), 
                           col="red")
points(x=0:20, y=dSnbinom(0:20, size=sizetest, mu=mutest, 
                           method="approximate.normal"), 
                           col="green")

# Test with a single distribution
dSnbinom(20, size=1, mu=20)
# when only one distribution is available, it is the same as dnbinom()
dnbinom(20, size=1, mu=20)

# If a parameter is supplied as only one value, it is supposed to be constant
dSnbinom(20, size=1, mu=c(14, 15, 10))
dSnbinom(20, size=c(1, 1, 1), mu=c(14, 15, 10))

# The functions are vectorized:
plot(0:200, dSnbinom(0:200, size=alpha, prob=p, method="furman"), bty="n", type="h", 
     xlab="x", ylab="Density")
points(0:200, dSnbinom(0:200, size=alpha, prob=p, method="saddlepoint"), 
     col="red", pch=3)
     
# Comparison with simulated distribution using rep replicates
alpha <- c(2.1, 2.05, 2)
mu <- c(10, 30, 20)
rep <- 100000
distEmpirique <- rSnbinom(rep, size=alpha, mu=mu)
tabledistEmpirique <- rep(0, 301)
names(tabledistEmpirique) <- as.character(0:300)
tabledistEmpirique[names(table(distEmpirique))] <- table(distEmpirique)/rep

plot(0:300, dSnbinom(0:300, size=alpha, mu=mu, method="furman"), type="h", bty="n", 
   xlab="x", ylab="Density", ylim=c(0,0.02))
plot_add(0:(length(tabledistEmpirique)-1), tabledistEmpirique, type="l", col="red")
legend(x=200, y=0.02, legend=c("Empirical", "Theoretical"), 
   text.col=c("red", "black"), bty="n")


# Example from Vellaisamy, P. & Upadhye, N.S. (2009) - Table 1
# Note that computing time for k = 7 using exact method is very long
k <- 2:7
x <- c(3, 5, 8, 10, 15)
table1_Vellaisamy <- matrix(NA, ncol=length(x), nrow=length(k))
rownames(table1_Vellaisamy) <- paste0("n = ", as.character(k))
colnames(table1_Vellaisamy) <- paste0("x = ", as.character(x))
table1_approximateObservations <- table1_Vellaisamy
table1_Furman3 <- table1_Vellaisamy
table1_Furman6 <- table1_Vellaisamy
table1_Furman9 <- table1_Vellaisamy
table1_Furman12 <- table1_Vellaisamy
table1_Furman40 <- table1_Vellaisamy
table1_Furman40 <- table1_Vellaisamy
table1_FurmanAuto <- table1_Vellaisamy
table1_FurmanAuto_iter <- table1_Vellaisamy
table1_Vellaisamy_parallel <- table1_Vellaisamy
table1_Approximate_Normal <- table1_Vellaisamy
table1_saddlepoint <- table1_Vellaisamy

st_Furman3 <- rep(NA, length(k))
st_Furman6 <- rep(NA, length(k))
st_Furman9 <- rep(NA, length(k))
st_Furman12 <- rep(NA, length(k))
st_Furman40 <- rep(NA, length(k))
st_FurmanAuto <- rep(NA, length(k))
st_approximateObservations <- rep(NA, length(k))
st_Vellaisamy <- rep(NA, length(k))
st_Vellaisamy_parallel <- rep(NA, length(k))
st_Approximate_Normal <- rep(NA, length(k))
st_saddlepoint <- rep(NA, length(k))

for (n in k) {
    print(n)
    alpha <- 1:n
    p <- (1:n)/10
    st_Vellaisamy[which(n == k)] <- 
        system.time({
        table1_Vellaisamy[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="Vellaisamy&Upadhye", log=FALSE, verbose=FALSE)
        })[1]
    st_Vellaisamy_parallel[which(n == k)] <- 
        system.time({
        table1_Vellaisamy_parallel[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     parallel=TRUE, 
                                     method="Vellaisamy&Upadhye", log=FALSE, verbose=FALSE)
        })[1]
    st_approximateObservations[which(n == k)] <- 
        system.time({
        table1_approximateObservations[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="approximate.RandomObservations", log=FALSE, 
                                     verbose=FALSE)
        })[1]
    st_Furman3[which(n == k)] <- 
        system.time({
            table1_Furman3[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="Furman", tol=1E-3, log=FALSE, 
                                     verbose=FALSE)
        })[1]
    st_Furman6[which(n == k)] <- 
        system.time({
            table1_Furman6[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="Furman", tol=1E-6, log=FALSE, 
                                     verbose=FALSE)
        })[1]
    st_Furman9[which(n == k)] <- 
        system.time({
            table1_Furman9[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="Furman", tol=1E-9, log=FALSE, 
                                     verbose=FALSE)
        })[1]
    st_Furman12[which(n == k)] <- 
        system.time({
            table1_Furman12[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="Furman", tol=1E-12, log=FALSE, 
                                     verbose=FALSE)
        })[1]
    st_Furman40[which(n == k)] <- 
        system.time({
            table1_Furman40[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="Furman", tol=1E-40, log=FALSE, 
                                     verbose=FALSE)
        })[1]

    st_FurmanAuto[which(n == k)] <- 
        system.time({
            table1_FurmanAuto[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="Furman", tol=NULL, log=FALSE, 
                                     verbose=FALSE)
        })[1]

    st_Approximate_Normal[which(n == k)] <- 
        system.time({
            table1_Approximate_Normal[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="approximate.normal", tol=1E-12, log=FALSE, 
                                     verbose=FALSE)
        })[1]
        st_saddlepoint[which(n == k)] <- 
        system.time({
            table1_saddlepoint[which(n == k), ] <- dSnbinom(x=x, prob=p, size=alpha, 
                                     method="saddlepoint", tol=1E-12, log=FALSE, 
                                     verbose=FALSE)
        })[1]

for (xc in x) {
 essai <- dSnbinom(x=xc, prob=p, size=alpha, method="Furman", tol=NULL, log=FALSE, verbose=TRUE)
 table1_FurmanAuto_iter[which(n == k), which(xc == x)] <- nrow(attributes(essai)[[1]])
}
}

cbind(table1_Vellaisamy, st_Vellaisamy)
cbind(table1_Vellaisamy_parallel, st_Vellaisamy_parallel)
cbind(table1_Furman3, st_Furman3)
cbind(table1_Furman6, st_Furman6)
cbind(table1_Furman9, st_Furman9)
cbind(table1_Furman12, st_Furman12)
cbind(table1_Furman40, st_Furman40)
cbind(table1_FurmanAuto, st_FurmanAuto)
cbind(table1_approximateObservations, st_approximateObservations)
cbind(table1_Approximate_Normal, st_Approximate_Normal)
cbind(table1_saddlepoint, st_saddlepoint)


# Test of different methods
n <- 9
x <- 17
alpha <- 1:n
p <- (1:n)/10

# Parallel computing is not always performant
# Here it is very performant
system.time({print(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=TRUE))})
system.time({print(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=FALSE))})
         
# Test of different methods
n <- 7
x <- 8
alpha <- 1:n
p <- (1:n)/10

# Parallel computing is not always performant
# Here it is approximately the same time of execution
system.time({print(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=TRUE))})
system.time({print(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=FALSE))})
         
# Test of different methods
n <- 7
x <- 15
alpha <- 1:n
p <- (1:n)/10

# Parallel computing is sometimes very performant
# Here parallel computing is 7 times faster (with a 8 cores computer) 
#             for vellaisamy&upadhye method
system.time(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=TRUE))
system.time(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=FALSE))
         
# Test of different methods
n <- 2
x <- 3
alpha <- 1:n
p <- (1:n)/10

# Parallel computing is sometimes very performant
# Here parallel computing is 7 times faster (with a 8 cores computer) 
#             for vellaisamy&upadhye method
system.time(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=TRUE))
system.time(dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, 
         verbose=TRUE, parallel=FALSE))
         
# Test for different tolerant values
n <- 7
x <- 8
alpha <- 1:n
p <- (1:n)/10
dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, verbose=TRUE)
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, tol=1E-3, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, tol=1E-6, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, tol=1E-9, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Saddlepoint", log=FALSE, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="approximate.RandomObservations", 
         log=FALSE, verbose=TRUE))

# Test for criteria of convergence
Pr_exact <- dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", 
                   log=FALSE, verbose=TRUE)
Pr_Furman <- dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, 
                 verbose=TRUE)
Pr_exact;as.numeric(Pr_Furman)
plot(1:length(attributes(Pr_Furman)$Pk), 
     log10(abs(attributes(Pr_Furman)$Pk-Pr_exact)), type="l", xlab="Iterations", 
     ylab="Abs log10", bty="n")
lines(1:(length(attributes(Pr_Furman)$Pk)-1), 
      log10(abs(diff(attributes(Pr_Furman)$Pk))), col="red")
legend("bottomleft", legend=c("Log10 Convergence to true value", "Log10 Rate of change"), 
       col=c("black", "red"), 
       lty=1)
       
n <- 7
x <- 6
alpha <- 1:n
p <- (1:n)/10
Pr_exact <- dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", 
                   log=FALSE, verbose=TRUE)
Pr_Furman <- dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, 
                 verbose=TRUE)
Pr_saddlepoint <- dSnbinom(x=x, prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE)                  
                 
# pdf("figure.pdf", width=7, height=7, pointsize=14)                 
ylab <- as.expression(bquote(.("P(S=6) x 10")^"6"))
layout(1:2)
par(mar=c(3, 4, 1, 1))
plot(1:length(attributes(Pr_Furman)$Pk), 
     attributes(Pr_Furman)$Pk*1E6, type="l", xlab="", 
     ylab="", bty="n", las=1, xlim=c(0, 80))
mtext(text=ylab, side=2, line=2.5)
segments(x0=25, x1=80, y0=Pr_exact*1E6, y1=Pr_exact*1E6, lty=3)
par(xpd=TRUE)
text(x=0, y=6, labels="Exact probability", pos=4)
txt <- "        Approximate probability\n    based on Furman (2007)\nrecursive iterations"
text(x=20, y=2, labels=txt, pos=4)
text(x=75, y=5, labels="A", cex=2)
par(mar=c(4, 4, 1, 1))
ylab <- as.expression(bquote("log"["10"]*""*"(P"["k+1"]*""*" - P"["k"]*""*")"))
plot(1:(length(attributes(Pr_Furman)$Pk)-1), 
      log10(diff(attributes(Pr_Furman)$Pk)), col="black", xlim=c(0, 80), type ="l", 
      bty="n", las=1, xlab="Iterations", ylab="")
mtext(text=ylab, side=2, line=2.5)
 peak <- (1:(length(attributes(Pr_Furman)$Pk)-1))[which.max(
            log10(abs(diff(attributes(Pr_Furman)$Pk))))]
segments(x0=peak, x1=peak, y0=-12, y1=-5, lty=2)
text(x=0, y=-6, labels="Positive trend", pos=4)
text(x=30, y=-6, labels="Negative trend", pos=4)
segments(x0=0, x1=43, y0=-12, y1=-12, lty=4)
segments(x0=62, x1=80, y0=-12, y1=-12, lty=4)
text(x=45, y=-12, labels="Tolerance", pos=4)
text(x=75, y=-7, labels="B", cex=2)
# dev.off()

# pdf("figure 2.pdf", width=7, height=7, pointsize=14)                 
ylab <- as.expression(bquote(.("P(S=6) x 10")^"6"))
layout(1:2)
par(mar=c(3, 4, 1, 1))
plot(1:length(attributes(Pr_Furman)$Pk), 
     attributes(Pr_Furman)$Pk*1E6, type="l", xlab="", 
     ylab="", bty="n", las=1, xlim=c(0, 80))
mtext(text=ylab, side=2, line=2.5)
segments(x0=25, x1=80, y0=Pr_exact*1E6, y1=Pr_exact*1E6, lty=3)
par(xpd=TRUE)
text(x=0, y=6, labels="Exact probability", pos=4)
txt <- "        Approximate probability\n    based on Furman (2007)\nrecursive iterations"
text(x=20, y=2, labels=txt, pos=4)
text(x=75, y=5, labels="A", cex=2)
par(mar=c(4, 4, 1, 1))
ylab <- as.expression(bquote("(P"["k+1"]*""*" - P"["k"]*""*") x 10"^"7"))
plot(1:(length(attributes(Pr_Furman)$Pk)-1), 
      diff(attributes(Pr_Furman)$Pk)*1E7, 
      col="black", xlim=c(0, 80), type ="l", 
      bty="n", las=1, xlab="Iterations", ylab="")
mtext(text=ylab, side=2, line=2.5)
 peak <- (1:(length(attributes(Pr_Furman)$Pk)-1))[which.max(diff(attributes(Pr_Furman)$Pk))]
segments(x0=peak, x1=peak, y0=0, y1=3.5, lty=2)
text(x=-2, y=3.5, labels="Positive trend", pos=4)
text(x=30, y=3.5, labels="Negative trend", pos=4)
segments(x0=0, x1=22, y0=1E-12, y1=1E-12, lty=4)
segments(x0=40, x1=80, y0=1E-12, y1=1E-12, lty=4)
text(x=22, y=1E-12+0.2, labels="Tolerance", pos=4)
text(x=75, y=3, labels="B", cex=2)
# dev.off()

# Test of different methods
n <- 2
x <- 15
alpha <- 1:n
p <- (1:n)/10
dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", log=FALSE, verbose=TRUE)
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, tol=1E-3, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, tol=1E-6, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, tol=1E-9, verbose=TRUE))
as.numeric(dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, verbose=TRUE))
dSnbinom(x=x, prob=p, size=alpha, method="approximate.RandomObservations", 
         log=FALSE, verbose=TRUE)


n <- 50
x <- 300
alpha <- (1:n)/100
p <- (1:n)/1000
# Produce an error
Pr_exact <- dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", 
                   log=FALSE, verbose=TRUE)
Pr_Furman <- dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, 
                 verbose=FALSE)
Pr_ApproximateNormal <- dSnbinom(x=x, prob=p, size=alpha, method="approximate.normal", 
                        log=FALSE, 
                        verbose=TRUE)
Pr_ApproximateRandom <- dSnbinom(x=x, prob=p, size=alpha, method="approximate.RandomObservations", 
                        log=FALSE, n.random=1E6, 
                        verbose=TRUE)
                        
n <- 500
x <- 3000
alpha <- (1:n)/100
p <- (1:n)/1000
# Produce an error
Pr_exact <- dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", 
                   log=FALSE, verbose=TRUE)
Pr_Furman <- dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, 
                 verbose=FALSE)
Pr_ApproximateNormal <- dSnbinom(x=x, prob=p, size=alpha, method="approximate.normal", 
                        log=FALSE, 
                        verbose=TRUE)
Pr_ApproximateNegativeBinomial <- dSnbinom(x=x, prob=p, size=alpha, 
                        method="approximate.negativebinomial", 
                        log=FALSE, 
                        verbose=TRUE)
Pr_ApproximateRandom <- dSnbinom(x=x, prob=p, size=alpha, 
                        method="approximate.RandomObservations", 
                        log=FALSE, n.random=1E6, 
                        verbose=TRUE)
Pr_ApproximateSaddlepoint <- dSnbinom(x=x, prob=p, size=alpha, 
                        method="saddlepoint", 
                        log=FALSE, 
                        verbose=TRUE)
             
             
             
layout(matrix(1:4, ncol=2, byrow=TRUE))
par(mar=c(3, 4.5, 1, 1))
alpha <- seq(from=10, to=100, length.out=3)
p <- seq(from=0.5, to=0.9, length.out=3)

p_nb <- dSnbinom(0:100, prob=p, size=alpha, method="vellaisamy&upadhye", verbose=TRUE)
p_Furman <- dSnbinom(0:100, prob=p, size=alpha, method="Furman", verbose=FALSE)
p_normal <- dSnbinom(0:100, prob=p, size=alpha, method="approximate.normal", verbose=TRUE)
p_aNB <- dSnbinom(0:100, prob=p, size=alpha, method="approximate.negativebinomial", verbose=TRUE)
p_SA <- dSnbinom(0:100, prob=p, size=alpha, method="saddlepoint", verbose=TRUE)

lab_PSnx <- bquote(italic("P(S"* "" [n] * "=x)"))

plot(1, 1, las=1, bty="n", col="grey", xlab="", 
       xlim=c(10, 70), ylim=c(0, 0.05), 
       ylab=lab_PSnx, type="n")
par(xpd=FALSE)
segments(x0=(0:100), x1=(0:100), 
         y0=0, y1=as.numeric(p_nb), col="black")
         
plot(x=p_nb, y=p_normal, pch=4, cex=0.5, las=1, bty="n", 
     xlab=bquote(italic("P" * "" [exact] * "(S"* "" [n] * "=x)")), 
     ylab=bquote(italic("P" * "" [approximate] * "(S"* "" [n] * "=x)")),  
     xlim=c(0, 0.05), ylim=c(0, 0.05))
points(x=p_nb, y=p_aNB, pch=5, cex=0.5)
points(x=p_nb, y=p_SA, pch=6, cex=0.5)
points(x=p_Furman, y=p_SA, pch=19, cex=0.5)

n <- 2
x <- 15
alpha <- 1:n
p <- (1:n)/10

p_nb <- dSnbinom(0:80, prob=p, size=alpha, method="vellaisamy&upadhye", verbose=TRUE)
p_Furman <- dSnbinom(0:80, prob=p, size=alpha, method="Furman", verbose=FALSE)
p_normal <- dSnbinom(0:80, prob=p, size=alpha, method="approximate.normal", verbose=TRUE)
p_aNB <- dSnbinom(0:80, prob=p, size=alpha, method="approximate.negativebinomial", verbose=TRUE)
p_SA <- dSnbinom(0:80, prob=p, size=alpha, method="saddlepoint", verbose=TRUE)


par(mar=c(4, 4.5, 1, 1))
plot(1, 1, las=1, bty="n", col="grey", xlab="x", 
       xlim=c(0, 60), ylim=c(0, 0.05), 
       ylab=lab_PSnx, type="n")
par(xpd=FALSE)
segments(x0=(0:80), x1=(0:80), 
         y0=0, y1=as.numeric(p_nb), col="black")
         
plot(x=p_nb, y=p_normal, pch=4, cex=0.5, las=1, bty="n", 
     xlab=bquote(italic("P" * "" [exact] * "(S"* "" [n] * "=x)")), 
     ylab=bquote(italic("P" * "" [approximate] * "(S"* "" [n] * "=x)")), 
     xlim=c(0, 0.05), ylim=c(0, 0.05))
points(x=p_nb, y=p_aNB, pch=5, cex=0.5)
points(x=p_nb, y=p_SA, pch=6, cex=0.5)
points(x=p_Furman, y=p_SA, pch=19, cex=0.5)

# pdf("figure 1.pdf", width=7, height=7, pointsize=14)    

layout(1:2)
par(mar=c(3, 4, 1, 1))
alpha <- seq(from=10, to=100, length.out=3)
p <- seq(from=0.5, to=0.9, length.out=3)

p_nb <- dSnbinom(0:100, prob=p, size=alpha, method="vellaisamy&upadhye", verbose=TRUE)
p_Furman <- dSnbinom(0:100, prob=p, size=alpha, method="Furman", verbose=FALSE)
p_normal <- dSnbinom(0:100, prob=p, size=alpha, method="approximate.normal", verbose=TRUE)
p_aNB <- dSnbinom(0:100, prob=p, size=alpha, method="approximate.negativebinomial", verbose=TRUE)
p_SA <- dSnbinom(0:100, prob=p, size=alpha, method="saddlepoint", verbose=TRUE)

lab_PSnx <- bquote(italic("P(S"* "" [n] * "=x)"))

plot(1, 1, las=1, bty="n", col="grey", xlab="", 
       xlim=c(10, 70), ylim=c(0, 0.09), 
       ylab="", type="n", yaxt="n")
axis(2, at=seq(from=0, to=0.05, by=0.01), las=1)
mtext(lab_PSnx, side = 2, adj=0.3, line=3)
par(xpd=FALSE)
segments(x0=(0:100), x1=(0:100), 
         y0=0, y1=as.numeric(p_nb), col="black")
errr <- (abs((100*(p_Furman-p_nb)/p_nb)))/1000+0.055
errr <- ifelse(is.infinite(errr), NA, errr)
lines(x=(0:100), y=errr, lty=5, col="red", lwd=2)
errr <- (abs((100*(p_normal-p_nb)/p_nb)))/1000+0.055
lines(x=(0:100), y=errr, lty=2, col="blue", lwd=2)
errr <- (abs((100*(p_aNB-p_nb)/p_nb)))/1000+0.055
lines(x=(0:100), y=errr, lty=3, col="purple", lwd=2)
errr <- (abs((100*(p_SA-p_nb)/p_nb)))/1000+0.055
lines(x=(0:100), y=errr, lty=4, col="green", lwd=2)
axis(2, at=seq(from=0, to=40, by=10)/1000+0.055, las=1, 
     labels=as.character(seq(from=0, to=40, by=10)))
mtext("|% error|", side = 2, adj=0.9, line=3)
par(xpd=TRUE)
legend(x=30, y=0.1, legend=c("Inversion of mgf", "Saddlepoint", "Normal", "Negative binomial"), 
       lty=c(5, 4, 2, 3), bty="n", cex=0.8, col=c("red", "green", "blue", "purple"), lwd=2)
legend(x=10, y=0.05, legend=c("Exact"), lty=c(1), bty="n", cex=0.8)
par(xpd=TRUE)
text(x=ScalePreviousPlot(x = 0.95, y = 0.1)$x, 
     y=ScalePreviousPlot(x = 0.95, y = 0.1)$y, labels="A", cex=2)
     
     
# When normal approximation will fail
n <- 2
x <- 15
alpha <- 1:n
p <- (1:n)/10

p_nb <- dSnbinom(0:80, prob=p, size=alpha, method="vellaisamy&upadhye", verbose=TRUE)
p_Furman <- dSnbinom(0:80, prob=p, size=alpha, method="Furman", verbose=FALSE)
p_normal <- dSnbinom(0:80, prob=p, size=alpha, method="approximate.normal", verbose=TRUE)
p_aNB <- dSnbinom(0:80, prob=p, size=alpha, method="approximate.negativebinomial", verbose=TRUE)
p_SA <- dSnbinom(0:80, prob=p, size=alpha, method="saddlepoint", verbose=TRUE)

par(mar=c(4, 4, 1, 1))
plot(1, 1, las=1, bty="n", col="grey", xlab="x", 
       xlim=c(0, 60), ylim=c(0, 0.09), 
       ylab="", type="n", yaxt="n")
axis(2, at=seq(from=0, to=0.05, by=0.01), las=1)
mtext(lab_PSnx, side = 2, adj=0.3, line=3)
par(xpd=FALSE)
segments(x0=(0:80), x1=(0:80), 
         y0=0, y1=as.numeric(p_nb), col="black")
errr <- (abs((100*(p_Furman-p_nb)/p_nb)))/1000+0.055
errr <- ifelse(is.infinite(errr), NA, errr)
lines(x=(0:80), y=errr, lty=5, col="red", lwd=2)
errr <- (abs((100*(p_normal-p_nb)/p_nb)))/1000+0.055
lines(x=(0:80), y=errr, lty=2, col="blue", lwd=2)
errr <- (abs((100*(p_aNB-p_nb)/p_nb)))/1000+0.055
lines(x=(0:80), y=errr, lty=3, col="purple", lwd=2)
errr <- (abs((100*(p_SA-p_nb)/p_nb)))/1000+0.055
lines(x=(0:80), y=errr, lty=4, col="green", lwd=2)
axis(2, at=seq(from=0, to=40, by=10)/1000+0.055, las=1, 
     labels=as.character(seq(from=0, to=40, by=10)))
mtext("|% error|", side = 2, adj=0.9, line=3)
legend(x=30, y=0.055, 
       legend=c("Exact", "Inversion of mgf", "Saddlepoint", "Normal", "Negative binomial"), 
       lty=c(1, 5, 4, 2, 3), bty="n", cex=0.8, col=c("black", "red", "green", "blue", "purple"), 
       lwd=c(1, 2, 2, 2, 2))
par(xpd=TRUE)
text(x=ScalePreviousPlot(x = 0.95, y = 0.1)$x, 
     y=ScalePreviousPlot(x = 0.95, y = 0.1)$y, labels="B", cex=2)
     

# dev.off()


# Test for criteria of convergence
Pr_exact <- dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", 
                   log=FALSE, verbose=TRUE)
Pr_Furman <- dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, 
                 verbose=TRUE)
Pr_exact;as.numeric(Pr_Furman)
plot(1:length(attributes(Pr_Furman)$Pk), 
     log10(abs(attributes(Pr_Furman)$Pk-Pr_exact)), type="l", xlab="Iterations", 
     ylab="Abs log10", bty="n")
lines(1:(length(attributes(Pr_Furman)$Pk)-1), 
      log10(abs(diff(attributes(Pr_Furman)$Pk))), col="red")
legend("bottomleft", legend=c("Log10 Convergence to true value", "Log10 Rate of change"), 
       col=c("black", "red"), 
       lty=1)


# Test of different methods
alpha <- c(2.05, 2)
mu <- c(10, 30)
test <- rSnbinom(n=100000, size=alpha, mu=mu)
plot(0:200, table(test)[as.character(0:200)]/sum(table(test), na.rm=TRUE), 
     bty="n", type="h", xlab="x", ylab="Density")
lines(x=0:200, dSnbinom(0:200, size=alpha, mu=mu, log=FALSE, method="Furman"), col="blue")
lines(x=0:200, y=dSnbinom(0:200, size=alpha, mu=mu, log=FALSE, 
                          method="vellaisamy&upadhye"), col="red")
lines(x=0:200, y=dSnbinom(0:200, size=alpha, mu=mu, log=FALSE, 
                          method="approximate.randomobservations"), col="green")

# Test for criteria of convergence for x = 50
x <- 50
# Test for criteria of convergence
Pr_exact <- dSnbinom(x=x, prob=p, size=alpha, method="vellaisamy&upadhye", 
                   log=FALSE, verbose=TRUE)
Pr_Furman <- dSnbinom(x=x, prob=p, size=alpha, method="Furman", log=FALSE, tol=1E-12, 
                 verbose=TRUE)
Pr_exact;as.numeric(Pr_Furman)
plot(1:length(attributes(Pr_Furman)$Pk), 
     log10(abs(attributes(Pr_Furman)$Pk-Pr_exact)), type="l", xlab="Iterations", 
     ylab="Abs log10", bty="n")
lines(1:(length(attributes(Pr_Furman)$Pk)-1), 
      log10(abs(diff(attributes(Pr_Furman)$Pk))), col="red")
legend("bottomleft", legend=c("Log10 Convergence to true value", "Log10 Rate of change"), 
       col=c("black", "red"), 
       lty=1)

# Another example more complicated
set.seed(2)
mutest <- c(56, 6.75, 1)
ktest <- c(50, 50, 50)
nr <- 100000
test <- rSnbinom(nr, size=ktest, mu=mutest)
system.time({pr_vellaisamy <- dSnbinom(x=0:150, size=ktest, mu=mutest, 
                          method = "vellaisamy&upadhye", verbose=FALSE, parallel=FALSE)})
# Parallel computing is not efficient
system.time({pr_vellaisamy <- dSnbinom(x=0:150, size=ktest, mu=mutest, 
                          method = "vellaisamy&upadhye", verbose=FALSE, parallel=TRUE)})
system.time({pr_furman <- dSnbinom(x=0:150, size=ktest, mu=mutest, prob=NULL, 
                      method = "furman", verbose=FALSE, log=FALSE)})
pr_approximateObservations <- dSnbinom(0:150, size=ktest, mu=mutest, 
                                       method = "approximate.randomobservations")

plot(table(test), xlab="N", ylab="Density", las=1, bty="n", ylim=c(0, 4000), xlim=c(0, 150))
lines(0:150, pr_vellaisamy*nr, col="red")
lines(0:150, pr_furman*nr, col="blue")
lines(0:150, pr_approximateObservations*nr, col="green")

dSnbinom(x=42, size=ktest, mu=mutest, prob=NULL, 
         method = "vellaisamy&upadhye", verbose=TRUE)
as.numeric(dSnbinom(x=42, size=ktest, mu=mutest, prob=NULL, 
           method = "Furman", verbose=TRUE))
dSnbinom(x=42, size=ktest, mu=mutest, prob=NULL, 
         method = "approximate.randomobservations", verbose=TRUE)

x <- 100
# Test for criteria of convergence
Pr_exact <- dSnbinom(x=x, size=ktest, mu=mutest, method="vellaisamy&upadhye", 
                   log=FALSE, verbose=TRUE)
Pr_Furman <- dSnbinom(x=x, size=ktest, mu=mutest, method="Furman", log=FALSE, 
                 verbose=TRUE)
Pr_exact;as.numeric(Pr_Furman)
plot(1:length(attributes(Pr_Furman)$Pk), 
     log10(abs(attributes(Pr_Furman)$Pk-Pr_exact)), type="l", xlab="Iterations", 
     ylab="Abs log10", bty="n", ylim=c(-100, 0))
lines(1:(length(attributes(Pr_Furman)$Pk)-1), 
      log10(abs(diff(attributes(Pr_Furman)$Pk))), col="red")
legend("bottomright", legend=c("Log10 Convergence to true value", "Log10 Rate of change"), 
       col=c("black", "red"), 
       lty=1)
       
     
# example to fit a distribution
data <- rnbinom(1000, size=1, mu=10)
hist(data)
ag <- rep(1:100, 10)
r <- aggregate(data, by=list(ag), FUN=sum)
hist(r[,2])

parx <- c(size=1, mu=10)

dSnbinomx <- function(x, par) {
  -sum(dSnbinom(x=x[,2], mu=rep(par["mu"], 10), size=par["size"], log=TRUE))
}

fit_mu_size <- optim(par = parx, fn=dSnbinomx, x=r, method="BFGS", control=c(trace=TRUE))
fit_mu_size$par

alpha <- c(2.1, 2.05, 2)
mu <- c(10, 30, 20)
p <- pSnbinom(q=10, size=alpha, mu=mu, lower.tail = TRUE)

alpha <- c(2.1, 2.05, 2)
mu <- c(10, 30, 20)
q <- qSnbinom(p=0.1, size=alpha, mu=mu, lower.tail = TRUE)

alpha <- c(2.1, 2.05, 2)
mu <- c(10, 30, 20)
rep <- 100000
distEmpirique <- rSnbinom(n=rep, size=alpha, mu=mu)
tabledistEmpirique <- rep(0, 301)
names(tabledistEmpirique) <- as.character(0:300)
tabledistEmpirique[names(table(distEmpirique))] <- table(distEmpirique)/rep

plot(0:300, dSnbinom(0:300, size=alpha, mu=mu), type="h", bty="n", 
   xlab="x", ylab="Density", ylim=c(0,0.02))
plot_add(0:300, tabledistEmpirique, type="l", col="red")
legend(x=200, y=0.02, legend=c("Empirical", "Theoretical"), 
   text.col=c("red", "black"), bty="n")
   
   
# Test if saddlepoint approximation must be normalized
# Yes, it must be
n <- 7
alpha <- 1:n
p <- (1:n)/10
dSnbinom(x=10, prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=10, prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE, normalize=FALSE)
                 
# Test for saddlepoint when x=0
n <- 7
alpha <- 1:n
p <- (1:n)/10
dSnbinom(x=0, prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=1, prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=c(0, 1), prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=c(0, 1), prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=FALSE)
                 
# Test when prob are all the same
p <- rep(0.2, 7)
n <- 7
alpha <- 1:n
dSnbinom(x=0:10, prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=0:10, prob=p, size=alpha, method="furman", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=0:10, prob=p, size=alpha, method="exact", log=FALSE,  
                 verbose=TRUE)
                 
# Test when n=1
p <- 0.2
n <- 1
alpha <- 1:n
dSnbinom(x=0:10, prob=p, size=alpha, method="saddlepoint", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=0:10, prob=p, size=alpha, method="furman", log=FALSE,  
                 verbose=TRUE)
dSnbinom(x=0:10, prob=p, size=alpha, method="exact", log=FALSE,  
                 verbose=TRUE)
                 
}

Run the code above in your browser using DataLab