kcde: Kernel cumulative distribution/survival function estimate

Description

Kernel cumulative distribution/survival function estimate for 1- to 3-dimensional data.

Usage

kcde(x, H, h, gridsize, gridtype, xmin, xmax, supp=3.7, eval.points,
  binned=FALSE, bgridsize, positive=FALSE, adj.positive, w, verbose=FALSE,
  tail.flag="lower.tail")
Hpi.kcde(x, nstage=2, pilot, Hstart, binned=FALSE, bgridsize, amise=FALSE,
  verbose=FALSE, optim.fun="nlm")
Hpi.diag.kcde(x, nstage=2, pilot, Hstart, binned=FALSE, bgridsize, amise=FALSE,
  verbose=FALSE, optim.fun="nlm")
hpi.kcde(x, nstage=2, binned=TRUE, amise=FALSE)
# S3 method for kcde
predict(object, ..., x)

Arguments

x

matrix of data values

H,h

bandwidth matrix/scalar bandwidth. If these are missing, then Hpi.kcde or hpi.kcde is called by default.

gridsize

vector of number of grid points

gridtype

not yet implemented

xmin,xmax

vector of minimum/maximum values for grid

supp

effective support for standard normal

eval.points

vector or matrix of points at which estimate is evaluated

binned

flag for binned estimation. Default is FALSE.

bgridsize

vector of binning grid sizes

positive

flag if 1-d data are positive. Default is FALSE.

adj.positive

adjustment applied to positive 1-d data

w

weights; not yet implemented

verbose

flag to print out progress information. Default is FALSE.

tail.flag

"lower.tail" = cumulative distribution, "upper.tail" = survival function

nstage

number of stages in the plug-in bandwidth selector (1 or 2)

pilot

"dscalar" = single pilot bandwidth (default for Hpi.diag.kcde); "dunconstr" = single unconstrained pilot bandwidth (default for Hpi.kcde)

Hstart

initial bandwidth matrix, used in numerical optimisation

amise

flag to return the minimal scaled PI value

optim.fun

optimiser function: one of nlm or optim

object

object of class kcde

...

other parameters

Value

A kernel cumulative distribution estimate is an object of class kcde which is a list with fields:

x

data points - same as input

eval.points

vector or list of points at which the estimate is evaluated

estimate

cumulative distribution/survival function estimate at eval.points

h

scalar bandwidth (1-d only)

H

bandwidth matrix

gridtype

"linear"

gridded

flag for estimation on a grid

binned

flag for binned estimation

names

variable names

w

weights

tail

"lower.tail"=cumulative distribution, "upper.tail"=survival function

Details

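If tail.flag="lower.tail" then the cumulative distribution function \(\mathrm{Pr}(\mathbf{X}\leq\mathbf{x})\) is estimated; if tail.flag="upper.tail", the survival function \(\mathrm{Pr}(\mathbf{X}>\mathbf{x})\) is estimated instead. For d>1, in general \(\mathrm{Pr}(\mathbf{X}\leq\mathbf{x}) \neq 1 - \mathrm{Pr}(\mathbf{X}>\mathbf{x})\), because some probability lies in the regions where only some components of \(\mathbf{X}\) exceed those of \(\mathbf{x}\).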
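
The inequality for d>1 can be checked with plain empirical proportions. The following is a minimal base R sketch, not the kcde estimator itself: it uses no kernel smoothing, and the evaluation point (the sample mean of two iris columns) is an arbitrary choice made for illustration.

```r
# Empirical (unsmoothed) tail probabilities; illustrative only, no ks functions used
x  <- as.matrix(iris[, 1:2])
pt <- colMeans(x)                                   # arbitrary evaluation point

# d = 2: joint lower and upper orthant proportions
lower <- mean(x[, 1] <= pt[1] & x[, 2] <= pt[2])    # Pr(X1 <= x1, X2 <= x2)
upper <- mean(x[, 1] >  pt[1] & x[, 2] >  pt[2])    # Pr(X1 >  x1, X2 >  x2)
mixed <- 1 - lower - upper                          # mass in the two remaining quadrants

print(c(lower = lower, upper = upper, mixed = mixed))

# d = 1: the two tails are complementary
lower1 <- mean(x[, 1] <= pt[1])
upper1 <- mean(x[, 1] >  pt[1])
print(lower1 + upper1)
```

Whenever the `mixed` proportion is positive, the lower-tail and upper-tail probabilities do not sum to one, which is why the survival function is estimated separately rather than derived from the cumulative distribution.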

If the bandwidth H is missing in kcde, then the default bandwidth is the plug-in selector Hpi.kcde. Likewise, a missing h defaults to hpi.kcde. No pre-scaling/pre-sphering is used since Hpi.kcde is not invariant to translation/dilation.
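
The bandwidth can also be computed explicitly and passed in, which is convenient when the same bandwidth is reused or when a diagonal matrix is preferred. The sketch below assumes the ks package is installed and uses the iris data only as an example; the calls follow the signatures listed under Usage, with all other arguments left at their defaults.

```r
library(ks)

x <- as.matrix(iris[, 1:2])

# 2-d: plug-in bandwidth matrices, unconstrained and diagonal
H  <- Hpi.kcde(x)
Hd <- Hpi.diag.kcde(x)

Fhat      <- kcde(x, H = H)       # estimate using the unconstrained bandwidth
Fhat.diag <- kcde(x, H = Hd)      # estimate using the diagonal bandwidth

# 1-d: scalar plug-in bandwidth
h     <- hpi.kcde(x[, 1])
Fhat1 <- kcde(x[, 1], h = h)
```

The supplied bandwidth is stored in the H (or h) field of the returned object.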

The effective support, binning, grid size, grid range and positive parameters are the same as for kde.

References

Duong, T. (2016) Non-parametric smoothed estimation of multivariate cumulative distribution and survival functions, and receiver operating characteristic curves. Journal of the Korean Statistical Society. 45, 33-50.

Examples


# NOT RUN {
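library(ks)    # provides kcde() and its predict method
library(MASS)
data(iris)
# Estimate the bivariate CDF of the first two iris columns; the bandwidth defaults to Hpi.kcde
Fhat <- kcde(iris[,1:2])
# Evaluate the estimate at the observed data points
predict(Fhat, x=iris[,1:2])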

## See other examples in ? plot.kcde
# }
