The function ZIPF() defines the Zipf distribution; see Johnson et al. (2005), section 11.2.20, pp. 527-528. The Zipf distribution is a one-parameter distribution with a long tail (a discrete version of the Pareto distribution). ZIPF() creates a gamlss.family object to be used in GAMLSS fitting. The functions dZIPF, pZIPF, qZIPF and rZIPF define the density, distribution function, quantile function and random generation for the Zipf distribution, ZIPF(). The function zetaP() defines the zeta function; it is based on the zeta function in the VGAM package of Thomas Yee, see Yee (2017).
The Zipf distribution is defined on \(y=1,2,3,\ldots,\infty\); the zero-adjusted Zipf distribution also permits zero, so it is defined on \(y=0,1,2,\ldots,\infty\). The function ZAZIPF() defines the zero-adjusted Zipf distribution and creates a gamlss.family object to be used in GAMLSS fitting. The functions dZAZIPF, pZAZIPF, qZAZIPF and rZAZIPF define the density, distribution function, quantile function and random generation for the zero-adjusted Zipf distribution, ZAZIPF().
ZIPF(mu.link = "log")
dZIPF(x, mu = 1, log = FALSE)
pZIPF(q, mu = 1, lower.tail = TRUE, log.p = FALSE)
qZIPF(p, mu = 1, lower.tail = TRUE, log.p = FALSE,
max.value = 10000)
rZIPF(n, mu = 1, max.value = 10000)
zetaP(x)
ZAZIPF(mu.link = "log", sigma.link = "logit")
dZAZIPF(x, mu = 0.5, sigma = 0.1, log = FALSE)
pZAZIPF(q, mu = 0.5, sigma = 0.1, lower.tail = TRUE,
log.p = FALSE)
qZAZIPF(p, mu = 0.5, sigma = 0.1, lower.tail = TRUE,
log.p = FALSE, max.value = 10000)
rZAZIPF(n, mu = 0.5, sigma = 0.1, max.value = 10000)
The function ZIPF() returns a gamlss.family object which can be used to fit a Zipf distribution in the gamlss() function.
mu.link: the link function for the parameter mu, with default log
x, q: vectors of (non-negative integer) quantiles
p: vector of probabilities
mu: vector of positive parameter values
log, log.p: logical; if TRUE, probabilities p are given as log(p)
lower.tail: logical; if TRUE (default), probabilities are P[X <= x], otherwise P[X > x]
n: number of random values to return
max.value: a constant, set to the default value of 10000. It is used in the q function, which searches numerically for the quantile, and sets how far the algorithm should look for q. For Zipf data this value may need to be increased, at a considerable computational cost.
sigma.link: the link function for the parameter sigma, with default logit
sigma: a vector of probabilities of zero
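The role of max.value can be illustrated with a minimal base R sketch of a bounded quantile search. This is an illustration under stated assumptions, not the package's implementation: q_search() and zipf_pmf_mu1() are hypothetical names, the probability function is hard-coded for mu = 1 (where the normalising constant is \(\zeta(2)=\pi^2/6\)), and returning the cap when the search fails is a choice made for this sketch only.

```r
# Hypothetical helper: Zipf probability function for mu = 1, using zeta(2) = pi^2 / 6.
zipf_pmf_mu1 <- function(y) y^(-2) / (pi^2 / 6)

# Hypothetical helper: smallest y in 1..max.value with P(Y <= y) >= p.
# If the cumulative probability never reaches p, the cap itself is returned.
q_search <- function(p, pmf, max.value = 10000) {
  cdf <- cumsum(pmf(seq_len(max.value)))  # P(Y <= 1), ..., P(Y <= max.value)
  vapply(p, function(prob) {
    hit <- which(cdf >= prob)
    if (length(hit) > 0) hit[1] else as.integer(max.value)
  }, integer(1))
}

q_search(c(0.5, 0.9), zipf_pmf_mu1)

# Far in the tail the default cap is hit before the target probability is reached;
# a larger cap finds the quantile but needs a much longer cumulative sum.
q_search(0.99999, zipf_pmf_mu1)
q_search(0.99999, zipf_pmf_mu1, max.value = 1e6)
```

Because the tail of the distribution decays slowly, probabilities very close to 1 can need a cap far above 10000, which is where the extra computational cost comes from.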
Mikis Stasinopoulos and Bob Rigby
The probability function of the Zipf distribution, ZIPF, is:
$$f(y|\mu)=\frac{y^{-(\mu+1)}}{\zeta(\mu+1)}$$
for \(y=1,2,\ldots,\infty\) and \(\mu>0\), where \(\zeta(b) = \sum_{i=1}^{\infty} i^{-b}\) is the (Riemann) zeta function.
The distribution has mean \(\zeta(\mu)/\zeta(\mu+1)\), which is finite for \(\mu>1\), and variance \(\{\zeta(\mu+1)\zeta(\mu-1)-[\zeta(\mu)]^2\}/[\zeta(\mu+1)]^2\), which is finite for \(\mu>2\); see pp. 479-480 of Rigby et al. (2019).
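The formulas above can be checked numerically with a short base R sketch. It does not use zetaP() or any package code: zeta_approx() and zipf_pmf() are hypothetical helpers written for this illustration, and the zeta function is approximated by a truncated sum with a simple tail correction, which is an assumption of the sketch rather than the method used by the package.

```r
# Hypothetical helper: zeta(s) for s > 1 as a truncated sum plus a tail correction,
# sum_{i=1}^{N} i^(-s) + N^(1-s)/(s-1) - N^(-s)/2.
zeta_approx <- function(s, N = 1000) {
  stopifnot(all(s > 1))
  i <- seq_len(N)
  vapply(s, function(si) sum(i^(-si)) + N^(1 - si) / (si - 1) - N^(-si) / 2,
         numeric(1))
}

# Hypothetical helper: Zipf probability function f(y | mu) = y^(-(mu+1)) / zeta(mu+1).
zipf_pmf <- function(y, mu) y^(-(mu + 1)) / zeta_approx(mu + 1)

mu <- 2.5  # mu > 2, so both the mean and the variance are finite

# Mean and variance from the closed-form expressions.
zipf_mean <- zeta_approx(mu) / zeta_approx(mu + 1)
zipf_var  <- (zeta_approx(mu + 1) * zeta_approx(mu - 1) - zeta_approx(mu)^2) /
  zeta_approx(mu + 1)^2

# Brute-force checks over a long but finite range of y.
y <- 1:1e6
total_prob <- sum(zipf_pmf(y, mu))      # should be very close to 1
brute_mean <- sum(y * zipf_pmf(y, mu))  # should be very close to zipf_mean
```

The variance is not checked by brute force here because its defining sum converges too slowly over a finite range to be a useful comparison.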
For more details about the zero-adjusted Zipf distribution, ZAZIPF, see pp. 496-498 of Rigby et al. (2019).
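As a sketch of what the zero adjustment amounts to, assuming the usual zero-adjusted construction in which \(\sigma\) is the probability of observing zero (as the description of the sigma argument states) and the remaining probability is spread over the Zipf distribution, the probability function would be:
$$P(Y=0|\mu,\sigma)=\sigma, \qquad P(Y=y|\mu,\sigma)=(1-\sigma)\frac{y^{-(\mu+1)}}{\zeta(\mu+1)} \quad \text{for } y=1,2,\ldots$$
with \(0<\sigma<1\) and \(\mu>0\). Under this assumption the mean is \((1-\sigma)\zeta(\mu)/\zeta(\mu+1)\) for \(\mu>1\); the cited pages remain the authoritative statement.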
N. L. Johnson, A. W. Kemp, and S. Kotz. (2005) Univariate Discrete Distributions. Wiley, New York, 3rd edition.
Thomas W. Yee (2017). VGAM: Vector Generalized Linear and Additive Models. R package version 1.0-3. https://CRAN.R-project.org/package=VGAM
Rigby, R. A. and Stasinopoulos, D. M. (2005). Generalized additive models for location, scale and shape (with discussion). Appl. Statist., 54, part 3, pp. 507-554.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019). Distributions for modeling location, scale, and shape: Using GAMLSS in R. Chapman and Hall/CRC. doi:10.1201/9780429298547. An older version can be found at https://www.gamlss.com/.
Stasinopoulos, D. M. and Rigby, R. A. (2007). Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007. doi:10.18637/jss.v023.i07.
Stasinopoulos, D. M., Rigby, R. A., Heller, G., Voudouris, V., and De Bastiani, F. (2017). Flexible Regression and Smoothing: Using GAMLSS in R. Chapman and Hall/CRC. doi:10.1201/b21973 (see also https://www.gamlss.com/).
PO, LG, GEOM, YULE
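library(gamlss.dist)  # provides the ZIPF and ZAZIPF functions
# ZIPF: probability function, distribution function, quantile function, random sample
par(mfrow = c(2, 2))
y <- seq(1, 20, 1)
plot(y, dZIPF(y), type = "h")
q <- seq(1, 20, 1)
plot(q, pZIPF(q), type = "h")
p <- seq(0.0001, 0.999, 0.05)
plot(p, qZIPF(p), type = "s")
dat <- rZIPF(100)
hist(dat)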
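# ZAZIPF: the same four plots for the zero-adjusted distribution
y <- seq(0, 20, 1)
plot(y, dZAZIPF(y, mu = 0.9, sigma = 0.1), type = "h")
q <- seq(0, 20, 1)  # start at 0, which ZAZIPF permits
plot(q, pZAZIPF(q, mu = 0.9, sigma = 0.1), type = "h")
p <- seq(0.0001, 0.999, 0.05)
plot(p, qZAZIPF(p, mu = 0.9, sigma = 0.1), type = "s")
dat <- rZAZIPF(100, mu = 0.9, sigma = 0.1)
hist(dat)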