pareto1: Pareto and Truncated Pareto Distribution Family Functions

Description

Estimates one of the parameters of the Pareto(I) distribution by maximum likelihood estimation. Also includes the upper truncated Pareto(I) distribution.

Usage

pareto1(lshape = "loge", earg=list(), location=NULL)
tpareto1(lower, upper, lshape = "loge", earg=list(), ishape=NULL,
         method.init=1)

Arguments

lshape

Parameter link function applied to the parameter $k$. See Links for more choices. A log link is the default because $k$ is positive.

earg

List. Extra argument for the link. See earg in Links for general information.

lower, upper

Numeric. Lower and upper limits for the truncated Pareto distribution. Each must be positive and of length 1. They are called $\alpha$ and $U$ below.

ishape

Numeric. Optional initial value for the shape parameter. A NULL means a value is obtained internally. If failure to converge occurs try specifying a value, e.g., 1 or 2.

location

Numeric. The parameter $\alpha$ below. If the user inputs a number then it is assumed known with this value. The default means it is estimated by maximum likelihood estimation, which means min(y) where y is the response v

method.init

An integer with value 1 or 2 which specifies the initialization method. If failure to converge occurs try the other value, or else specify a value for ishape.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Warning

The usual or unbounded Pareto distribution has two parameters (called $\alpha$ and $k$ here) but the family function pareto1 estimates only $k$ using iteratively reweighted least squares. The MLE of the $\alpha$ parameter lies on the boundary and is min(y) where y is the response. Consequently, using the default argument values, the standard errors are incorrect when one does a summary on the fitted object. If the user inputs a value for alpha then it is assumed known with this value and then summary on the fitted object should be correct. Numerical problems may occur for small $k$, e.g., $k < 1$.

Details

A random variable $Y$ has a Pareto distribution if $$P[Y>y] = C / y^{k}$$ for some positive $k$ and $C$. This model is important in many applications due to the power law probability tail, especially for large values of $y$.

The Pareto distribution, which is used a lot in economics, has a probability density function that can be written $$f(y) = k \alpha^k / y^{k+1}$$ for $0 < \alpha < y$ and $k>0$. The $\alpha$ is known as the location parameter, and $k$ is known as the shape parameter. The mean of $Y$ is $\alpha k/(k-1)$ provided $k>1$. Its variance is $\alpha^2 k /((k-1)^2 (k-2))$ provided $k>2$.

The upper truncated Pareto distribution has a probability density function that can be written $$f(y) = k \alpha^k / [y^{k+1} (1-(\alpha/U)^k)]$$ for $0 < \alpha < y < U < \infty$ and $k>0$. Possibly, better names for $k$ are the index and tail parameters. Here, $\alpha$ and $U$ are known. The mean of $Y$ is $k \alpha^k (U^{1-k}-\alpha^{1-k}) / [(1-k)(1-(\alpha/U)^k)]$.

References

Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions, New York: Wiley-Interscience, Third edition.

Aban, I. B., Meerschaert, M. M. and Panorska, A. K. (2006) Parameter estimation for the truncated Pareto distribution, Journal of the American Statistical Association, 101(473), 270--277.

Examples

Run this code

alpha = 2; k = exp(3)
pdat = data.frame(y = rpareto(n=1000, location=alpha, shape=k))
fit = vglm(y ~ 1, pareto1, pdat, trace=TRUE)
fit@extra   # The estimate of alpha is here
head(fitted(fit))
with(pdat, mean(y))
coef(fit, matrix=TRUE)
summary(fit)     # Standard errors are incorrect!!

# Here, alpha is assumed known
fit2 = vglm(y ~ 1, pareto1(location=alpha), pdat, trace=TRUE, crit="c")
fit2@extra   # alpha stored here
head(fitted(fit2))
coef(fit2, matrix=TRUE)
summary(fit2)    # Standard errors are ok

# Upper truncated Pareto distribution
lower = 2; upper = 8; k = exp(2)
pdat3 = data.frame(y = rtpareto(n=100, lower=lower, upper=upper, shape=k))
fit3 = vglm(y ~ 1, tpareto1(lower, upper), pdat3, trace=TRUE, cri="c")
coef(fit3, matrix=TRUE)
c(fit3@misc$lower, fit3@misc$upper)

Run the code above in your browser using DataLab