lprobust implements local polynomial regression point estimators, with robust bias-corrected confidence intervals and inference procedures developed in Calonico, Cattaneo and Farrell (2018). See also Calonico, Cattaneo and Farrell (2020) for related optimality results.
It also implements other estimation and inference procedures available in the literature. See Wand and Jones (1995) and Fan and Gijbels (1996) for background references.
Companion commands: lpbwselect for local polynomial data-driven bandwidth selection, and nprobust.plot for plotting results.
A detailed introduction to this command is given in Calonico, Cattaneo and Farrell (2019). For more details, and related Stata and R packages useful for empirical analysis, visit https://nppackages.github.io/.
lprobust(y, x, eval = NULL, neval = NULL, p = NULL, deriv = NULL,
h = NULL, b = NULL, rho = 1, kernel = "epa", bwselect = NULL,
bwcheck = 21, bwregul = 1, imsegrid = 30, vce = "nn", covgrid = FALSE,
cluster = NULL, nnmatch = 3, level = 95, interior = FALSE, subset = NULL)
y
dependent variable.
x
independent variable.
eval
vector of evaluation point(s). By default it uses 30 equally spaced points over the support of x.
neval
number of quantile-spaced evaluation points on the support of x. Default is neval = 30.
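The two entries above mention equally spaced and quantile-spaced evaluation points. The following base R sketch only illustrates the difference between the two kinds of grid. The probability levels used for the quantile grid are an assumption made for illustration, and the code is not taken from the package.

```r
# Two ways to place evaluation points on the support of x (base R only).
set.seed(1)
x <- rexp(500)     # skewed data, so the two grids differ visibly
n_points <- 30     # mirrors the documented default of 30 points

# Equally spaced grid between the sample minimum and maximum.
grid_equal <- seq(min(x), max(x), length.out = n_points)

# Quantile-spaced grid: quantiles at evenly spaced probability levels.
# The choice of levels (0 to 1) is an assumption made for this sketch.
probs <- seq(0, 1, length.out = n_points)
grid_quantile <- as.numeric(quantile(x, probs = probs))

# The quantile grid concentrates points where the data are dense.
summary(grid_equal)
summary(grid_quantile)
```

Either vector could be passed as eval; with skewed data the quantile grid avoids placing many evaluation points in regions with few observations.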
p
polynomial order used to construct the point estimator. Default is p = 1 (local linear regression).
deriv
derivative order of the regression function to be estimated. Default is deriv = 0 (regression function).
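To make the roles of p and deriv concrete, here is a minimal base R sketch of a textbook local polynomial estimator at a single point, in the spirit of Fan and Gijbels (1996). The helper local_poly is hypothetical and is not the lprobust implementation: it uses a fixed Epanechnikov kernel and performs no bias correction or variance estimation.

```r
# Local polynomial estimate of the deriv-th derivative of E[y | x] at x0.
# Hypothetical helper for illustration only; not the lprobust implementation.
local_poly <- function(y, x, x0, h, p = 1, deriv = 0) {
  stopifnot(deriv <= p, h > 0)
  u <- (x - x0) / h
  w <- ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)  # Epanechnikov weights
  keep <- w > 0
  # Design matrix with columns (x - x0)^0, ..., (x - x0)^p.
  X <- outer(x[keep] - x0, 0:p, `^`)
  W <- w[keep]
  # Weighted least squares: solve (X'WX) beta = X'Wy.
  beta <- solve(crossprod(X, W * X), crossprod(X, W * y[keep]))
  # The deriv-th derivative at x0 is deriv! times the matching coefficient.
  factorial(deriv) * beta[deriv + 1]
}

set.seed(1)
x <- runif(500)
y <- sin(4 * x) + rnorm(500, sd = 0.2)
local_poly(y, x, x0 = 0.5, h = 0.2)                    # level, local linear
local_poly(y, x, x0 = 0.5, h = 0.2, p = 2, deriv = 1)  # slope, local quadratic
```

The sketch requires deriv to be no larger than p, since a polynomial of order p carries no information about higher derivatives.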
h
main bandwidth used to construct the local polynomial point estimator. Can be either a scalar (same bandwidth for all evaluation points) or a vector of the same dimension as eval. If not specified, the bandwidth h is computed by the companion command lpbwselect.
b
bias bandwidth used to construct the bias-correction estimator. Can be either a scalar (same bandwidth for all evaluation points) or a vector of the same dimension as eval. By default it is set equal to h. If rho is set to zero, b is computed by the companion command lpbwselect.
rho
sets b = h/rho. Default is rho = 1.
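The relationship between h, b and rho described above is simple arithmetic, sketched here in base R. The helper bias_bandwidth is hypothetical and only mirrors the documented rule; for rho = 0 it returns NA because the documentation says b is then selected by lpbwselect rather than derived from h.

```r
# Hypothetical helper mirroring the documented rule b = h / rho.
bias_bandwidth <- function(h, rho = 1) {
  stopifnot(all(h > 0), rho >= 0)
  if (rho == 0) {
    # Documented behaviour: b is chosen by lpbwselect, not derived from h.
    return(rep(NA_real_, length(h)))
  }
  h / rho
}

bias_bandwidth(0.2)             # rho = 1, so b equals h
bias_bandwidth(0.2, rho = 0.5)  # b is twice h
bias_bandwidth(0.2, rho = 0)    # left to data-driven selection
```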
kernel
kernel function used to construct local polynomial estimators. Options are epa for the Epanechnikov kernel, tri for the triangular kernel, uni for the uniform kernel, and gau for the Gaussian kernel. Default is kernel = "epa".
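For reference, the four kernels named above have the following standard textbook forms. This base R sketch is an illustration under the usual normalization (each kernel integrates to one); the helper kernel_weight is hypothetical, and the package may scale or truncate its kernels differently.

```r
# Standard forms of the four kernels, evaluated at u = (x - x0) / h.
# Hypothetical helper; the normalization used inside the package may differ.
kernel_weight <- function(u, kernel = c("epa", "tri", "uni", "gau")) {
  kernel <- match.arg(kernel)
  inside <- abs(u) <= 1  # compact support for epa, tri and uni
  switch(kernel,
    epa = 0.75 * (1 - u^2) * inside,
    tri = (1 - abs(u)) * inside,
    uni = 0.5 * inside,
    gau = dnorm(u)
  )
}

u <- seq(-1.5, 1.5, by = 0.5)
sapply(c("epa", "tri", "uni", "gau"), function(k) kernel_weight(u, k))
```

Only the Gaussian kernel assigns positive weight to observations more than one bandwidth away from the evaluation point.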
bwselect
bandwidth selection procedure to be used via lpbwselect. By default it computes h and sets b = h/rho (with rho = 1 by default). It computes both h and b if rho is set equal to zero. Options are:
mse-dpi
second-generation DPI implementation of MSE-optimal bandwidth. Default option if only one evaluation point is chosen.
mse-rot
ROT implementation of MSE-optimal bandwidth.
imse-dpi
second-generation DPI implementation of IMSE-optimal bandwidth (computed using a grid of evaluation points). Default option if more than one evaluation point is chosen.
imse-rot
ROT implementation of IMSE-optimal bandwidth (computed using a grid of evaluation points).
ce-dpi
second-generation DPI implementation of CE-optimal bandwidth.
ce-rot
ROT implementation of CE-optimal bandwidth.
all
reports all available bandwidth selection procedures.
Note: MSE = Mean Squared Error; IMSE = Integrated Mean Squared Error; CE = Coverage Error; DPI = Direct Plug-in; ROT = Rule-of-Thumb. For details on implementation see Calonico, Cattaneo and Farrell (2019).
bwcheck
if a positive integer is provided, then the selected bandwidth is enlarged so that at least bwcheck effective observations are available at each evaluation point. Default is bwcheck = 21.
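One way to read the bwcheck rule is sketched below in base R. It assumes that "effective observations" means observations within one bandwidth of the evaluation point, which is an interpretation made for this sketch; the helper enlarge_bandwidth is hypothetical and is not the package's code.

```r
# Enlarge h so that at least bwcheck observations satisfy |x - x0| <= h.
# Hypothetical helper; the definition of "effective observations" is assumed.
enlarge_bandwidth <- function(x, x0, h, bwcheck = 21) {
  stopifnot(length(x) >= bwcheck, h > 0)
  # Distance from x0 to its bwcheck-th nearest observation.
  h_min <- sort(abs(x - x0))[bwcheck]
  max(h, h_min)
}

x <- seq(0, 1, length.out = 101)
h_new <- enlarge_bandwidth(x, x0 = 0, h = 0.01)
sum(abs(x - 0) <= h_new)  # number of observations inside the enlarged window
```

A bandwidth that already covers enough observations is returned unchanged, so the check only matters in sparse regions or near the boundary of the support.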
bwregul
specifies the scaling factor for the regularization term added to the denominator of the bandwidth selectors. Setting bwregul = 0 removes the regularization term from the bandwidth selectors. Default is bwregul = 1.
imsegrid
number of evaluation points used to compute the IMSE bandwidth selector. Default is imsegrid = 30.
vce
procedure used to compute the variance-covariance matrix estimator. Options are:
nn
heteroskedasticity-robust nearest neighbor variance estimator with nnmatch the (minimum) number of neighbors to be used. Default choice.
hc0
heteroskedasticity-robust plug-in residuals variance estimator without weights.
hc1
heteroskedasticity-robust plug-in residuals variance estimator with hc1 weights.
hc2
heteroskedasticity-robust plug-in residuals variance estimator with hc2 weights.
hc3
heteroskedasticity-robust plug-in residuals variance estimator with hc3 weights.
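The hc0 to hc3 labels follow the usual heteroskedasticity-consistent weighting schemes. The base R sketch below shows those weights for an ordinary (global) least squares fit, purely to illustrate how the four options differ. It is an assumption that the same weights carry over to the kernel-weighted local fits; the sketch does not reproduce the package's variance estimator.

```r
# HC0-HC3 sandwich standard errors for a global linear fit (illustration only).
set.seed(42)
n <- 200
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 + x)  # heteroskedastic errors

fit <- lm(y ~ x)
X <- model.matrix(fit)
e <- residuals(fit)
lev <- hatvalues(fit)  # leverage values h_ii
k <- ncol(X)

# Multipliers applied to the squared residuals under each scheme.
hc_weights <- list(
  hc0 = rep(1, n),
  hc1 = rep(n / (n - k), n),
  hc2 = 1 / (1 - lev),
  hc3 = 1 / (1 - lev)^2
)

bread <- solve(crossprod(X))
hc_se <- sapply(hc_weights, function(w) {
  meat <- crossprod(X, (w * e^2) * X)  # X' diag(w * e^2) X
  sqrt(diag(bread %*% meat %*% bread))
})
hc_se  # rows: coefficients, columns: weighting schemes
```

Because the hc2 and hc3 multipliers are at least one, their standard errors are never smaller than the hc0 ones, with hc3 the most conservative of the three.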
covgrid
if TRUE, it computes two covariance matrices (cov.us and cov.rb) for classical and robust covariances across point estimators over the grid of evaluation points.
cluster
indicates the cluster ID variable used for cluster-robust variance estimation with degrees-of-freedom weights. By default it is combined with vce = "nn" for cluster-robust nearest neighbor variance estimation. Another option is plug-in residuals combined with vce = "hc1".
nnmatch
to be combined with vce = "nn" for the heteroskedasticity-robust nearest neighbor variance estimator, with nnmatch indicating the minimum number of neighbors to be used. Default is nnmatch = 3.
level
confidence level used for confidence intervals. Default is level = 95.
interior
if TRUE, all evaluation points are assumed to be interior points. This option affects only data-driven bandwidth selection via lpbwselect. Default is interior = FALSE.
subset
optional rule specifying a subset of observations to be used.
Estimate
A matrix containing eval (grid points), h, b (bandwidths), N (effective sample sizes), tau.us (point estimates with p-th order local polynomial), tau.bc (bias-corrected point estimates with (p+1)-th order local polynomial), se.us (standard error corresponding to tau.us), and se.rb (robust standard error).
opt
A list containing options passed to the function.
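The returned point estimates and standard errors combine into confidence intervals in the usual way. The base R sketch below uses made-up numbers standing in for one row of the returned matrix and forms a conventional interval (tau.us with se.us) and a robust bias-corrected interval (tau.bc with se.rb) using normal critical values, following the construction described in Calonico, Cattaneo and Farrell (2018). All numeric values are hypothetical.

```r
# Hypothetical values standing in for one row of the returned matrix.
tau_us <- 0.60  # conventional point estimate
tau_bc <- 0.75  # bias-corrected point estimate
se_us  <- 0.10  # conventional standard error
se_rb  <- 0.12  # robust standard error

level <- 95
z <- qnorm(1 - (1 - level / 100) / 2)  # two-sided normal critical value

ci_us <- tau_us + c(-1, 1) * z * se_us  # conventional interval
ci_rb <- tau_bc + c(-1, 1) * z * se_rb  # robust bias-corrected interval

rbind(conventional = ci_us, robust_bc = ci_rb)
```

The robust interval is centered at the bias-corrected estimate and uses the standard error that accounts for the variability of the bias estimate, so it is recentered and typically wider than the conventional one.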
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association, 113(522): 767-779. doi:10.1080/01621459.2017.1285776.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2019. nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference. Journal of Statistical Software, 91(8): 1-33. http://dx.doi.org/10.18637/jss.v091.i08.
Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2020. Coverage Error Optimal Confidence Intervals for Local Polynomial Regression. Working Paper.
Fan, J., and I. Gijbels. 1996. Local Polynomial Modelling and Its Applications. London: Chapman and Hall.
Wand, M., and M. Jones. 1995. Kernel Smoothing. Florida: Chapman & Hall/CRC.
# NOT RUN {
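library(nprobust)
x <- runif(500)
y <- sin(4 * x) + rnorm(500)
# Local linear fit with the default bandwidth selection and variance estimator
est <- lprobust(y, x)
summary(est)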
# }
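A further call that sets several of the documented arguments explicitly might look like the sketch below. It uses only arguments that appear in the usage signature; the particular values (three evaluation points, a local quadratic fit, the triangular kernel, hc2 variance weights) are arbitrary choices for illustration, and the call is wrapped in a check so that it is skipped when the nprobust package is not installed.

```r
# Evaluation points chosen arbitrarily for this illustration.
eval_points <- c(0.25, 0.5, 0.75)

if (requireNamespace("nprobust", quietly = TRUE)) {
  library(nprobust)
  set.seed(123)
  x <- runif(500)
  y <- sin(4 * x) + rnorm(500)
  # Local quadratic fit at three points, MSE-optimal DPI bandwidths,
  # triangular kernel, and hc2 plug-in residual variance estimator.
  est <- lprobust(y, x, eval = eval_points, p = 2, kernel = "tri",
                  bwselect = "mse-dpi", vce = "hc2")
  summary(est)
}
```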