decon (version 1.3-4)

DeconCdf: Estimating cumulative distribution function from data with measurement error

Description

Computes an estimate of the cumulative distribution function (CDF) from data contaminated with measurement error. The measurement errors can be either homoscedastic or heteroscedastic.
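
As a rough illustration of this setting, the sketch below simulates contaminated observations w = x + u in base R under both error structures. The variable names, sample size, and error scales are illustrative assumptions and not part of the package.

```r
# Illustrative simulation of data observed with measurement error.
# All names and constants here are assumptions chosen for the sketch.
set.seed(1)
n <- 200
x <- rnorm(n)                      # unobserved true values

# Homoscedastic case: one error standard deviation for all observations
sig_homo <- 0.5
w_homo <- x + rnorm(n, sd = sig_homo)

# Heteroscedastic case: one error standard deviation per observation
sig_hetero <- runif(n, min = 0.2, max = 0.8)
w_hetero <- x + rnorm(n, sd = sig_hetero)

# Contaminated samples are typically more spread out than the true values,
# which is why a CDF estimated naively from them tends to be too flat.
c(true = sd(x), homo = sd(w_homo), hetero = sd(w_hetero))
```

In this sketch, w_homo with sig_homo (a single value) or w_hetero with sig_hetero (a vector of the same length) play the roles of the y and sig arguments.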

Usage

DeconCdf(y,sig,x,error="normal",bw="dboot1",adjust=1,
	n=512,from,to,cut=3,na.rm=FALSE,grid=100,ub=2,...)

Arguments

y

The observed data. It is a vector of length at least 3.

sig

The standard deviations \(\sigma\) of the measurement errors. For homoscedastic errors, \(sig\) is a single value. For heteroscedastic errors, \(sig\) is a vector of standard deviations with the same length as \(y\).

x

A user-defined grid of points at which the CDF is evaluated. The FFT method is not applicable if x is given.

error

Error distribution types: (1) 'normal' for normal errors; (2) 'laplacian' for Laplacian errors; (3) 'snormal' for a special case of small normal errors.

bw

Specifies the bandwidth. It can be a single pre-determined numeric value, or the name of a bandwidth selector: 'dnrd' for the rule-of-thumb plug-in bandwidth suggested by Fan (1991); 'dmise' for the plug-in bandwidth obtained by minimizing the MISE; 'dboot1' for the bootstrap bandwidth selector without resampling (Delaigle and Gijbels, 2004), which minimizes a bootstrap estimate of the MISE; 'dboot2' for the smoothed bootstrap bandwidth selector with resampling.

adjust

adjusts the range where the CDF is to be evaluated. By default, \(adjust=1\).

n

number of points where the CDF is to be evaluated.

from

the starting point where the CDF is to be evaluated.

to

the ending point where the CDF is to be evaluated.

cut

used to adjust the starting and ending points where the CDF is to be evaluated.

na.rm

logical; set to FALSE by default, in which case no NA values are allowed.

grid

the number of grid points used to search for the optimal bandwidth when a bandwidth selector is specified in bw. The default is "grid=100".

ub

the upper bound of the search for the optimal bandwidth. The default is "ub=2".

...

control
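
The arguments n, from, to, and cut together describe an evaluation grid. The sketch below builds such a grid in base R, assuming the same convention as stats::density, where the default end points lie cut bandwidths beyond the range of the data. Whether DeconCdf uses exactly this rule is an assumption here, and the helper make_grid is hypothetical and not part of decon.

```r
# Hypothetical helper, not part of decon: builds an evaluation grid using the
# stats::density convention (end points cut * bw beyond the data range).
make_grid <- function(y, bw, n = 512, cut = 3,
                      from = min(y) - cut * bw,
                      to = max(y) + cut * bw) {
  seq(from, to, length.out = n)
}

y <- c(-1.2, 0.4, 0.9, 2.5)        # toy observations
grid_default <- make_grid(y, bw = 0.5)
range(grid_default)                # -2.7 and 4.0

# A custom grid like this one could instead be supplied through the x argument
grid_custom <- seq(-2, 3, by = 0.1)
```

A larger cut widens the grid so that both tails of the estimated CDF are covered, while n controls how finely the interval is sampled.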

Value

An object of class "Decon".

Details

FFT is currently not supported for CDF computing.

References

Delaigle, A. and Gijbels, I. (2004). Bootstrap bandwidth selection in kernel density estimation from a contaminated sample. Annals of the Institute of Statistical Mathematics, 56(1), 19-47.

Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. The Annals of Statistics, 19, 1257-1272.

Fan, J. (1992). Deconvolution with supersmooth distributions. The Canadian Journal of Statistics, 20, 155-169.

Hall, P. and Lahiri, S.N. (2008). Estimation of distributions, moments and quantiles in deconvolution problems. The Annals of Statistics, 36(5), 2110-2134.

Stefanski, L.A. and Carroll, R.J. (1990). Deconvoluting kernel density estimators. Statistics, 21, 169-184.

Wang, X.F., Fan, Z. and Wang, B. (2010). Estimating smooth distribution function in the presence of heterogeneous measurement errors. Computational Statistics and Data Analysis, 54, 25-36.

Wang, X.F. and Wang, B. (2011). Deconvolution estimation in measurement error models: The R package decon. Journal of Statistical Software, 39(10), 1-24.

See Also

DeconPdf, DeconNpr.

Examples

# NOT RUN {
#####################
## the R function to estimate the smooth distribution function
#SDF <- function (x, bw = bw.nrd0(x), n = 512, lim=1){
#        dx <- lim*sd(x)/20 
#        xgrid <- seq(min(x)-dx, max(x)+dx, length = n)
#        Fhat <- sapply(x, function(x) pnorm((xgrid-x)/bw)) 
#        return(list(x = xgrid, y = rowMeans(Fhat)))
#    }

library(decon)

## Case study: homoscedastic normal errors
n2 <- 100
x2 <- c(rnorm(n2/2,-3,1),rnorm(n2/2,3,1))
sig2 <- .8
u2 <- rnorm(n2, sd=sig2)
w2 <- x2+u2
# estimate the bandwidth with the bootstrap method with resampling
bw2 <- bw.dboot2(w2,sig=sig2, error="normal")
# estimate the distribution function with measurement error
F2 <-  DeconCdf(w2,sig2,error='normal',bw=bw2)
plot(F2,  col="red", lwd=3, lty=2, xlab="x", ylab="F(x)", main="")

#lines(SDF(x2), lwd=3, lty=1)
#lines(SDF(w2), col="blue", lwd=3, lty=3)

# }
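
The commented-out SDF helper in the example is a kernel-smoothed distribution function that ignores measurement error. An uncommented version, run on simulated data in base R only, is sketched below so that the estimate from the error-free values can be compared with the naive estimate from the contaminated values. The seed and the object names F_true and F_naive are additions for illustration.

```r
# Smooth distribution function estimate that ignores measurement error:
# the average of normal CDFs centred at each observation.
SDF <- function(x, bw = bw.nrd0(x), n = 512, lim = 1) {
  dx <- lim * sd(x) / 20
  xgrid <- seq(min(x) - dx, max(x) + dx, length.out = n)
  # One column per observation, one row per grid point
  Fhat <- sapply(x, function(xi) pnorm((xgrid - xi) / bw))
  list(x = xgrid, y = rowMeans(Fhat))
}

set.seed(123)                                  # seed added for reproducibility
n2 <- 100
x2 <- c(rnorm(n2 / 2, -3, 1), rnorm(n2 / 2, 3, 1))
w2 <- x2 + rnorm(n2, sd = 0.8)

F_true <- SDF(x2)    # based on the error-free values
F_naive <- SDF(w2)   # based on the contaminated values

plot(F_true, type = "l", lwd = 3, xlab = "x", ylab = "F(x)")
lines(F_naive, col = "blue", lwd = 3, lty = 3)
```

The deconvolution estimate returned by DeconCdf can be overlaid on the same axes to see how far it moves from the naive curve towards the error-free one.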