icmaLogCon: Computes a Log-Concave Probability Density Estimate via an Iterative Convex Minorant Algorithm

Description

Given a vector of observations ${\bold{x}_n} = (x_1, \ldots, x_n)$ whose entries need not be distinct, icmaLogCon first computes vectors ${\bold{x}_m} = (x_1, \ldots, x_m)$ and ${\bold{w}} = (w_1, \ldots, w_m)$, where $w_i$ is the weight of $x_i$ and $\sum_{i=1}^m w_i = 1$. Then, icmaLogCon computes a concave, piecewise linear function $\widehat \phi_m$ on $[x_1, x_m]$ with knots only in $\{x_1, \ldots, x_m\}$ such that
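As a rough illustration of the first step, the sketch below collapses tied observations into sorted distinct values with relative-frequency weights. It uses base R only and assumes the default xgrid = NULL; `collapse_ties` is a hypothetical helper written for this illustration, not the package's preProcess.

```r
# Hypothetical helper: collapse a sample with ties into sorted distinct
# values x_1 < ... < x_m and weights w_i = (number of copies of x_i) / n.
collapse_ties <- function(xn) {
  x <- sort(unique(xn))
  counts <- tabulate(match(xn, x), nbins = length(x))
  list(x = x, w = counts / length(xn))
}

xn  <- c(2.5, 1.0, 2.5, 4.0, 1.0, 2.5)
res <- collapse_ties(xn)
res$x        # distinct values in increasing order
res$w        # relative frequencies
sum(res$w)   # weights sum to one
```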

$$L(\phi) = \sum_{i=1}^m w_i \phi(x_i) - \int_{-\infty}^\infty \exp(\phi(t)) dt$$

is maximal. To make the pool-adjacent-violators algorithm applicable, computations are performed in the parametrization ${\bold{\eta}}$, where $\eta_1 = \phi_1$ and $\eta_i$ is the slope of $\phi$ on $[x_{i-1}, x_i]$ for $i \ge 2$, so that

$${\bold{\phi}}({\bold{\eta}}) = \Bigl(\eta_1, \Bigl(\eta_1 + \sum_{j=2}^i (x_j-x_{j-1})\eta_j\Bigr)_{i=2}^m \Bigr).$$
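The change of variables can be sketched in base R. The sketch assumes that $\eta_1 = \phi_1$ and that $\eta_i$, for $i \ge 2$, is the slope of $\phi$ between $x_{i-1}$ and $x_i$; this reading of the formula, and the helper names `phi_from_eta` and `eta_from_phi`, are assumptions made for illustration.

```r
# Hypothetical helpers for the two directions of the reparametrization.
phi_from_eta <- function(x, eta) {
  # phi_1 = eta_1; phi_i = eta_1 + sum_{j=2}^i (x_j - x_{j-1}) * eta_j
  c(eta[1], eta[1] + cumsum(diff(x) * eta[-1]))
}
eta_from_phi <- function(x, phi) {
  # eta_1 = phi_1; eta_i = slope of phi on [x_{i-1}, x_i]
  c(phi[1], diff(phi) / diff(x))
}

x   <- c(0, 1, 3, 4)
phi <- c(-2, -1, -1.5, -3)       # concave on this grid
eta <- eta_from_phi(x, phi)

eta                              # first entry is phi_1, the rest are slopes
all(diff(eta[-1]) <= 0)          # slopes are nonincreasing
phi_from_eta(x, eta)             # recovers phi
```

Under this reading, concavity of $\phi$ is the same as the slopes $\eta_2, \ldots, \eta_m$ being nonincreasing, which is the kind of order constraint a pool-adjacent-violators step can enforce.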

To find the maximum of $L$, a variant of the iterative convex minorant algorithm (ICMA) based on the pool-adjacent-violators algorithm is used.
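To give a feel for what a pool-adjacent-violators step does, the following sketch projects a noisy sequence onto the nonincreasing sequences using base R's `isoreg`, which fits a least-squares isotonic regression. It is unweighted and uses made-up numbers; it only illustrates the pooling of adjacent violators and is not the iteration carried out by icmaLogCon.

```r
# Noisy, roughly decreasing sequence (made-up values for illustration).
y <- c(3.0, 3.4, 2.1, 2.6, 1.0, 1.2, 0.3)

# isoreg() fits a nondecreasing sequence; negating the input and the fit
# yields the nonincreasing (antitonic) least-squares fit instead.
fit <- -isoreg(-y)$yf

fit                   # adjacent violators are replaced by their average
all(diff(fit) <= 0)   # the fit is nonincreasing
```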

Usage

icmaLogCon(x, xgrid = NULL, eps = 10^-8, T1 = 2000, 
    robustif = TRUE, print = FALSE)

Value

x: Vector of observations $x_1, \ldots, x_m$ that was used to estimate the density.
w: The vector of weights that were used. Depends on the chosen setting for xgrid.
f: Vector with entries $\widehat f_m(x_i).$
xn: Vector with initial observations $x_1, \ldots, x_n$.
Loglik: The value $L(\widehat \phi_m)$ of the log-likelihood function $L$ at the maximum $\widehat \phi_m.$
Iterations: Number of iterations performed.
sig: The standard deviation of the initial sample $x_1, \ldots, x_n$.
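For a piecewise linear $\phi$ the functional $L$ can be evaluated in closed form, which may help when checking a reported Loglik value by hand. The sketch below assumes $\phi = -\infty$ outside $[x_1, x_m]$, so that the integral reduces to that interval; `loglik_pl` is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper: L(phi) = sum_i w_i * phi_i - integral of exp(phi),
# for phi piecewise linear with knots at x and equal to -Inf outside
# [x_1, x_m] (an assumption of this sketch).
loglik_pl <- function(x, w, phi) {
  r <- phi[-length(phi)]          # phi at left end of each segment
  s <- phi[-1]                    # phi at right end of each segment
  d <- s - r
  # Average of exp(phi) over a segment: (e^s - e^r) / (s - r),
  # or e^r when phi is flat on the segment.
  seg <- ifelse(abs(d) < 1e-10, exp(r), (exp(s) - exp(r)) / d)
  sum(w * phi) - sum(diff(x) * seg)
}

# Log of the uniform density on [0, 1]: phi = 0, so L = 0 - 1 = -1.
loglik_pl(x = c(0, 0.5, 1), w = rep(1 / 3, 3), phi = rep(0, 3))
```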

Arguments

x: Vector of independent and identically distributed observations, not necessarily distinct.
xgrid: Governs the generation of weights for observations. See preProcess for details.
eps: A small positive real number used as stopping tolerance. Iterations are halted if the directional derivative of ${\bold{\eta}} \mapsto L({\bold{\eta}})$ in the direction of the new candidate is $\le \varepsilon$.
T1: Maximal number of iterations to perform.
robustif: robustif = TRUE performs the robustification and Hermite interpolation procedure detailed in Rufibach (2006, 2007), robustif = FALSE does not. In the latter case, convergence of the algorithm is no longer guaranteed.
print: print = TRUE outputs the log-likelihood in every iteration, print = FALSE does not. In the R GUI for Windows, switch off buffered output (CTRL+W) so that the values are displayed as they are computed.

Author

Kaspar Rufibach, kaspar.rufibach@gmail.com,
http://www.kasparrufibach.ch

Lutz Duembgen, duembgen@stat.unibe.ch,
https://www.imsv.unibe.ch/about_us/staff/prof_dr_duembgen_lutz/index_eng.html

References

Rufibach, K. (2006) Log-concave Density Estimation and Bump Hunting for i.i.d. Observations. PhD Thesis, University of Bern, Switzerland and Georg-August University of Goettingen, Germany.
Available at https://slsp-ube.primo.exlibrisgroup.com/permalink/41SLSP_UBE/17e6d97/alma99116730175505511.

Rufibach, K. (2007) Computing maximum likelihood estimators of a log-concave density function. J. Stat. Comput. Simul. 77, 561--574.

Examples


set.seed(1977)
x <- rgamma(200, 2, 1)
if (FALSE) {
res <- icmaLogCon(x, T1 = 2000, robustif = TRUE, print = TRUE)

## plot the estimated density and log-density at the support points res$x,
## using the components documented under Value
par(mfrow = c(2, 1), mar = c(3, 2, 1, 2))
plot(res$x, res$f, type = 'l'); rug(x)
plot(res$x, log(res$f), type = 'l'); rug(x)
}
