For all functions except ltappareto
, arguments lambda
and theta
can either be scalars or vectors of the same length as x
, p
, or q
. If a scalar, then this value is assumed to hold over all cases. If a vector, then the values are assumed to have a one to one relationship with the values in x
, p
, or q
. The argument a
is a scalar.
In the case of ltappareto
, all data
are assumed to be drawn from the same distribution and hence lambda
, theta
and a
are all scalars.
Let \(Y\) be an exponential random variable with parameter \(\lambda > 0\). Then the distribution function of \(Y\) is
$$
F_Y(y) = \Pr\{Y < y \} = 1 - \exp(-\lambda y),
$$
and the density function is
$$
f_Y(y) = \lambda \exp(-\lambda y).
$$
Further, the mean and variance of the distribution of \(Y\) is \(1/\lambda\) and \(1/\lambda^2\), respectively.
Now transform \(Y\) as
$$
X = a \exp(Y),
$$
where \(a>0\). Then \(X\) is a Pareto random variable with shape parameter \(\lambda\) and distribution function
$$
F_X(x) = \Pr\{X < x \} = 1 - \left( \frac{a}{x} \right)^\lambda,
$$
where \(a \le x < \infty\), and density function
$$
f_X(x) = \frac{\lambda}{a} \left( \frac{a}{x} \right)^{\lambda+1}.
$$
We simulate the Pareto deviates by generating exponential deviates, and then transforming as described above.
As above, let \(X\) be Pareto with shape parameter \(\lambda\), and define \(W - a\) to be exponential with parameter \(1/\theta\), i.e.
$$
\Pr\{X > x\} = \left( \frac{a}{x} \right)^\lambda
$$
and
$$
\Pr\{W > w\} = \exp\left( \frac{a - w}{\theta} \right),
$$
where \(a \le w < \infty\). Say we sample one independent value from each of the distributions \(X\) and \(W\), then
$$
\Pr\{X > z\ \&\ W > z\} = \Pr\{X > z\} \Pr\{ W > z\} = \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).
$$
We say that \(Z\) has a tapered Pareto distribution if it has the above distribution, i.e.
$$
F_Z(z) = \Pr\{Z < z\} = 1- \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).
$$
The above relationship shows that a tapered Pareto deviate can be simulated by generating independent values of \(X\) and \(W\), and then letting \(Z = \min(X, W)\). This minimum has the effect of “tapering” the tail of the Pareto distribution.
The tapered Pareto variable \(Z\) has density
$$
f_Z(z) = \left( \frac{\lambda}{z} + \frac{1}{\theta} \right) \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).
$$
Given a sample of data \(z_1, z_2, \cdots, z_n\), we write the log-likelihood as
$$
\log L = \sum_{i=1}^n \log f_Z(z_i).
$$
Hence the gradients are calculated as
$$
\frac{\partial \log L}{\partial \lambda} = \theta \sum_{i=1}^n \frac{1}{\lambda \theta + z_i} - \sum_{i=1}^n \log(z_i/a)
$$
and
$$
\frac{\partial \log L}{\partial \theta} = \frac{-1}{\theta} \sum_{i=1}^n \frac{z_i}{\lambda \theta + z_i} - \frac{1}{\theta^2} \sum_{i=1}^n (a - z_i).
$$
Further, the Hessian is calculated using
$$
\frac{\partial^2 \log L}{\partial \lambda^2} = -\theta^2 \sum_{i=1}^n \frac{1}{(\lambda \theta + z_i)^2},
$$
$$
\frac{\partial^2 \log L}{\partial \theta^2} = \frac{1}{\theta^2} \sum_{i=1}^n \frac{z_i(2\lambda\theta + z_i)}{(\lambda \theta + z_i)^2} - \frac{2}{\theta^3} \sum_{i=1}^n (a - z_i),
$$
and
$$
\frac{\partial^2 \log L}{\partial \theta \, \partial \lambda} = \frac{\partial^2 \log L}{\partial \lambda \, \partial \theta} = \sum_{i=1}^n \frac{z_i}{(\lambda \theta + z_i)^2}.
$$
See the section “Seismological Context” (below), which outlines its application in Seismology.