Hpi: Plug-in bandwidth selector

Description

Plug-in bandwidth for 1- to 6-dimensional data.

Usage

Hpi(x, nstage=2, pilot="samse", pre="sphere", Hstart,
    binned=FALSE, bgridsize, amise=FALSE, kfold=1)
Hpi.diag(x, nstage=2, pilot="samse", pre="scale", Hstart,
    binned=FALSE, bgridsize, amise=FALSE, kfold=1)
hpi(x, nstage=2, binned=TRUE, bgridsize)

Arguments

x

vector or matrix of data values

nstage

number of stages in the plug-in bandwidth selector (1 or 2)

pilot

"amse" = AMSE pilot bandwidths, "samse" = single SAMSE pilot bandwidth, "unconstr" = unconstrained pilot bandwidth

pre

"scale" = pre-scaling, "sphere" = pre-sphering

Hstart

initial bandwidth matrix, used in numerical optimisation

binned

flag for binned kernel estimation. Default is FALSE.

bgridsize

vector of binning grid sizes

amise

flag to return the minimal scaled PI value

kfold

value for k-fold bandwidth selection. See details below.

Value

Plug-in bandwidth. If amise=TRUE then the minimal scaled PI value is returned too.

Details

hpi is the univariate plug-in selector of Wand & Jones (1994), i.e. it is exactly the same as KernSmooth's dpik. Hpi is a multivariate generalisation of this. Use Hpi for full bandwidth matrices and Hpi.diag for diagonal bandwidth matrices.

For AMSE pilot bandwidths, see Wand & Jones (1994). For SAMSE pilot bandwidths, see Duong & Hazelton (2003). The latter modifies the former to remove any possible problems with non-positive definiteness. Unconstrained pilot bandwidths are available for d = 1, ..., 5 (but are extremely computationally intensive in the higher dimensions). See Chacon & Duong (2010).

For d = 1, 2, 3, 4 and binned=TRUE, estimates are computed over a binning grid defined by bgridsize. Otherwise they are computed exactly. For details on the pre-transformations in pre, see pre.sphere and pre.scale.

If Hstart is not given then it defaults to k*var(x) where $k=\left[\frac{4}{n(d+2)}\right]^{2/(d+4)}$, n = sample size, d = dimension of data.

For large samples, k-fold bandwidth selection can significantly reduce computation time. The full data sample is partitioned into k sub-samples and a bandwidth matrix is computed for each of these sub-samples. The bandwidths are averaged and re-weighted to serve as a proxy for the full sample selector. (k-fold selection is temporarily disabled.)
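The default Hstart formula given above can be reproduced by hand, which may help when supplying a custom starting matrix. The following is a minimal sketch in base R only; the helper name default_Hstart is hypothetical and is not part of the package, and the simulated data are purely illustrative.

```r
# Hypothetical helper (not a package function): computes k * var(x)
# with k = [4 / (n (d + 2))]^(2 / (d + 4)), as stated in Details.
default_Hstart <- function(x) {
  x <- as.matrix(x)          # treat a vector as a one-column matrix
  n <- nrow(x)               # sample size
  d <- ncol(x)               # dimension of the data
  k <- (4 / (n * (d + 2)))^(2 / (d + 4))
  k * var(x)                 # d x d starting bandwidth matrix
}

# Illustrative data: 100 bivariate standard normal observations
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
H0 <- default_Hstart(x)
```

A matrix built this way could presumably be passed as the Hstart argument, for example after rescaling it, if the numerical optimisation needs a different starting point.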

References

Chacon, J.E. & Duong, T. (2010) Multivariate plug-in bandwidth selection with unconstrained pilot matrices. Test, 19, 375-398.

Duong, T. & Hazelton, M.L. (2003) Plug-in bandwidth matrices for bivariate kernel density estimation. Journal of Nonparametric Statistics, 15, 17-30.

Sheather, S.J. & Jones, M.C. (1991) A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society, Series B, 53, 683-690.

Wand, M.P. & Jones, M.C. (1994) Multivariate plugin bandwidth selection. Computational Statistics, 9, 97-116.

Examples

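library(ks)
data(unicef)
Hpi(unicef)                      # full bandwidth matrix, default SAMSE pilot
Hpi(unicef, pilot="unconstr")    # full bandwidth matrix, unconstrained pilot
Hpi.diag(unicef, binned=TRUE)    # diagonal bandwidth matrix, binned estimation
hpi(unicef[,1])                  # univariate selector on the first variable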