higherMomentsIV: Fitting Linear Models with Endogenous Regressors using Lewbel's Higher Moments Approach

Description

Fits linear models with one endogenous regressor using internal instruments constructed with the approach described in Lewbel (1997). This statistical technique addresses the endogeneity problem without requiring external instrumental variables, although the implementation allows external instruments to be included if available. An important assumption for identification is that the endogenous variable has a skewed distribution.
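Because identification relies on the endogenous variable being skewed, it can help to look at its sample skewness before fitting. The following is a minimal base R sketch; `sample_skewness` is a hypothetical helper, not a function of the package, and the simulated vectors are only illustrative.

```r
# Hypothetical helper (not part of the package): sample skewness,
# i.e. the third central moment divided by the 1.5th power of the second.
sample_skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

set.seed(1)
p_skewed    <- rexp(5000)   # exponential draws: right-skewed
p_symmetric <- rnorm(5000)  # normal draws: symmetric

sample_skewness(p_skewed)     # clearly positive
sample_skewness(p_symmetric)  # close to zero
```

A value near zero for the endogenous regressor suggests that the instruments relying on skewness may be weak.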

Usage

higherMomentsIV(y, X, P, G = NULL, IIV = c("g", "gp", "gy", "yp", "p2",
  "y2"), EIV = NULL, data = NULL)

Arguments

y

the vector or matrix containing the dependent variable.

X

the data frame or matrix containing the exogenous regressors of the model.

P

the endogenous variables of the model as columns of a matrix or data frame.

G

the functional form of G. It can take four values: x2, x3, lnx or 1/x. The last two forms are only applicable when the values of the exogenous variables are greater than 1 or different from 0, respectively.

IIV

stands for "internal instrumental variable". It can take six values: g, gp, gy, yp, p2 or y2. It tells the function which internal instruments to construct from the data. See "Details" for further explanations.

EIV

stands for "external instrumental variable". It is an optional argument that lets the user specify any external variable(s) to be used as instrument(s).

data

optional data frame or list containing the variables in the model.
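The four functional forms accepted by G, together with the conditions on the exogenous variables stated above, can be sketched in base R as follows. `apply_G` is a hypothetical helper written for illustration; it is not the package's internal code, and the error messages are assumptions.

```r
# Hypothetical helper (not part of the package): applies one of the four
# documented functional forms element-wise to the exogenous regressors.
apply_G <- function(X, G = c("x2", "x3", "lnx", "1/x")) {
  G <- match.arg(G)
  X <- as.matrix(X)
  switch(G,
    "x2"  = X^2,
    "x3"  = X^3,
    "lnx" = {
      # documented condition: values greater than 1
      if (any(X <= 1)) stop("G = 'lnx' needs all values of X greater than 1")
      log(X)
    },
    "1/x" = {
      # documented condition: values different from 0
      if (any(X == 0)) stop("G = '1/x' needs all values of X different from 0")
      1 / X
    }
  )
}

X <- cbind(X1 = c(2, 3, 4), X2 = c(1.5, 2.5, 5))
apply_G(X, "x2")   # squares every entry
apply_G(X, "lnx")  # allowed here because every entry exceeds 1
```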

Value

Returns an object of class ivreg, with the following components:

coefficients

parameter estimates.

residuals

a vector of residuals.

fitted.values

a vector of predicted means.

n

number of observations.

df.residual

residual degrees of freedom for the fitted model.

cov.unscaled

unscaled covariance matrix for coefficients.

sigma

residual standard error.

call

the original function call.

formula

the model formula.

terms

a list with elements "regressors" and "instruments" containing the terms objects for the respective components.

levels

levels of the categorical regressors.

contrasts

the contrasts used for categorical regressors.

x

a list with elements "regressors", "instruments", "projected", containing the model matrices from the respective components. "projected" is the matrix of regressors projected on the image of the instruments.

Details

Consider the model below:

$$ Y_{t} = \beta_{0}+ \gamma^{'}X_{t} + \alpha P_{t}+\epsilon_{t} \hspace{0.3cm} (1) $$

$$ P_{t} = Z_{t}+\nu_{t} \hspace{2.5 cm} (2)$$

The observed data consist of $Y_{t}$, $X_{t}$ and $P_{t}$, while $Z_{t}$, $\epsilon_{t}$, and $\nu_{t}$ are unobserved. The endogeneity problem arises from the correlation of $P_{t}$ with the structural error, $\epsilon_{t}$, since $E(\epsilon \nu)\neq 0$. The structural and measurement errors are required to have mean zero, but no other restriction is imposed on their distributions.

Let $\bar{S}$ be the sample mean of a variable $S_{t}$ and $G_{t} = G(X_{t})$ for any given function $G$ that has finite third own and cross moments. Lewbel (1997) proves that the following instruments can be constructed and used with 2SLS to obtain consistent estimates:

$$ q_{1t}=(G_{t} - \bar{G}) \hspace{1.6 cm}(3a)$$

$$ q_{2t}=(G_{t} - \bar{G})(P_{t}-\bar{P}) \hspace{0.3cm} (3b) $$

$$ q_{3t}=(G_{t} - \bar{G})(Y_{t}-\bar{Y}) \hspace{0.3cm} (3c)$$

$$ q_{4t}=(Y_{t} - \bar{Y})(P_{t}-\bar{P}) \hspace{0.3cm} (3d)$$

$$ q_{5t}=(P_{t}-\bar{P})^{2} \hspace{1.5 cm} (3e) $$

$$ q_{6t}=(Y_{t} - \bar{Y})^{2}\hspace{1.5 cm} (3f)$$

The instruments in equations 3e and 3f can be used only when the measurement and the structural errors are symmetrically distributed. The instruments in equations 3a to 3d do not require any distributional assumptions for the errors. Because the regressors $X$ are themselves included as instruments, $G(X)$ in equation 3a should not be linear in $X$.

Let lowercase letters denote deviations from the sample mean: $s_{t} = S_{t}-\bar{S}$. In this notation the six values of IIV (g, gp, gy, yp, p2 and y2) correspond to the instruments in equations 3a to 3f. Using the instruments in equations 3a to 3f together with 1 and $X_{t}$, two-stage least squares provides consistent estimates of the parameters in equation 1 under the assumptions set out in Lewbel (1997).
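The construction above can be illustrated with a small base R sketch on simulated data. Everything in it is an assumption made for illustration (the data-generating process, the parameter values, and the helpers `center` and `tsls`); it is not the package's implementation, only a sketch of building the six instruments and using one of them in a hand-written 2SLS.

```r
set.seed(123)
n <- 20000

# Simulated data (illustrative values only)
x     <- rnorm(n)              # exogenous regressor X_t
w     <- rexp(n) - 1           # skewed, mean-zero part of Z_t
z     <- 1 + 0.5 * x + w       # unobserved Z_t
nu    <- rnorm(n)              # measurement error nu_t
e     <- rnorm(n)
alpha <- 1
p     <- z + nu                # observed endogenous regressor, equation (2)
y     <- 2 + x + alpha * z + e # implies eps_t = e - alpha * nu_t in equation (1)

# Deviation from the sample mean
center <- function(s) s - mean(s)
g <- center(x^2)               # G(X) = X^2, non-linear in X

# Instruments (3a) to (3f), named like the values of IIV
q <- cbind(
  g  = g,
  gp = g * center(p),
  gy = g * center(y),
  yp = center(y) * center(p),
  p2 = center(p)^2,
  y2 = center(y)^2
)

# Hand-written 2SLS: project the regressors R on the instruments W,
# then solve (R_hat' R) b = R_hat' y
tsls <- function(y, R, W) {
  R_hat <- W %*% solve(crossprod(W), crossprod(W, R))
  b <- solve(crossprod(R_hat, R), crossprod(R_hat, y))
  setNames(as.vector(b), colnames(R))
}

R <- cbind(intercept = 1, x = x, p = p)
W <- cbind(1, x, q[, "yp"])    # 1, X_t and the instrument from (3d)

b_2sls <- tsls(y, R, W)
b_ols  <- coef(lm(y ~ x + p))

b_2sls["p"]  # should be close to alpha
b_ols["p"]   # attenuated towards zero by the measurement error
```

In this simulated setup the errors are symmetric, so the instruments from 3e and 3f would also be admissible. With skewed errors only those from 3a to 3d should be used.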

References

Lewbel, A. (1997). Constructing Instruments for Regressions with Measurement Error when No Additional Data Are Available, with An Application to Patents and R&D. Econometrica, 65(5), 1201-1213.

Examples


# load data
data(dataHigherMoments)
y <- dataHigherMoments[, 1]
X <- cbind(dataHigherMoments[, 2], dataHigherMoments[, 3])
colnames(X) <- c("X1", "X2")
P <- dataHigherMoments[, 4]

# call higherMomentsIV with internal instrument yp = (Y - mean(Y))(P - mean(P))
h <- higherMomentsIV(y, X, P, G = "x2", IIV = "yp")

# build an additional instrument p2 = (P - mean(P))^2 using the internalIV() function
eiv <- internalIV(y, X, P, G = "x2", IIV = "p2")

# use the additional variable as external instrument in higherMomentsIV()
h1 <- higherMomentsIV(y, X, P, G = "x2", IIV = "yp", EIV = eiv)
summary(h1)

# get the robust standard errors using robust.se() function from package ivpack
# library(ivpack)
# sder <- robust.se(h1)
