errorsarlm: Spatial simultaneous autoregressive error model estimation

Description

Maximum likelihood estimation of spatial simultaneous autoregressive error models of the form:

$$y = X \beta + u, u = \lambda W u + \varepsilon$$

where $\lambda$ is found by optimize() first, and $\beta$ and other parameters by generalized least squares subsequently. With one of the sparse matrix methods, larger numbers of observations can be handled, but the interval= argument may need be set when the weights are not row-standardised. When etype is “emixed”, a so-called spatial Durbin error model is fitted, while lmSLX fits an lm model augmented with the spatially lagged RHS variables, including the lagged intercept when the spatial weights are not row-standardised. create_WX creates spatially lagged RHS variables, and is exposed for use in model fitting functions.

Usage

errorsarlm(formula, data=list(), listw, na.action, weights=NULL,
 Durbin, etype, method="eigen", quiet=NULL, zero.policy=NULL,
 interval = NULL, tol.solve=1.0e-10, trs=NULL, control=list())
spBreg_err(formula, data = list(), listw, na.action, Durbin, etype,
    zero.policy=NULL, control=list())
lmSLX(formula, data = list(), listw, na.action, weights=NULL,
 Durbin=TRUE, zero.policy=NULL)
create_WX(x, listw, zero.policy=NULL, prefix="")

Value

A list object of class sarlm

type

"error"

lambda

simultaneous autoregressive error coefficient

coefficients

GLS coefficient estimates

rest.se

GLS coefficient standard errors (are equal to asymptotic standard errors)

log likelihood value at computed optimum

GLS residual variance

SSE

sum of squared GLS errors

parameters

number of parameters estimated

logLik_lm.model

Log likelihood of the linear model for $\lambda=0$

AIC_lm.model

AIC of the linear model for $\lambda=0$

% \item{lm.model}{the \code{lm} object returned when estimating for \eqn{\lambda=0}{lambda=0}}

coef_lm.model

coefficients of the linear model for $\lambda=0$

tarX

model matrix of the GLS model

tary

response of the GLS model

response of the linear model for $\lambda=0$

model matrix of the linear model for $\lambda=0$

method

the method used to calculate the Jacobian

call

the call used to create this object

residuals

GLS residuals

% \item{lm.target}{the \code{lm} object returned for the GLS fit}

opt

object returned from numerical optimisation

fitted.values

Difference between residuals and response variable

ase

TRUE if method=eigen

% \item{formula}{model formula}

se.fit

Not used yet

lambda.se

if ase=TRUE, the asymptotic standard error of $\lambda$

LMtest

NULL for this model

aliased

if not NULL, details of aliased variables

LLNullLlm

Log-likelihood of the null linear model

Hcov

Spatial DGP covariance matrix for Hausman test if available

interval

line search interval

fdHess

finite difference Hessian

optimHess

optim or fdHess used

insert

logical; is TRUE, asymptotic values inserted in fdHess where feasible

timings

processing timings

f_calls

number of calls to the log likelihood function during optimization

hf_calls

number of calls to the log likelihood function during numerical Hessian computation

intern_classic

a data frame of detval matrix row choices used by the SE toolbox classic method

zero.policy

zero.policy for this model

na.action

(possibly) named vector of excluded or omitted observations if non-default na.action argument used

weights

weights used in model fitting

emixedImps

for “emixed” models, a list of three impact matrixes (impacts and standard errors) for direct, indirect and total impacts; total impacts calculated using gmodels::estimable

The internal sar.error.* functions return the value of the log likelihood function at \lambdalambda.

The lmSLX function returns an lm object with a mixedImps list of three impact matrixes (impacts and standard errors) for direct, indirect and total impacts; total impacts calculated using gmodels::estimable.

Control arguments

tol.opt:: the desired accuracy of the optimization - passed to optimize() (default=square root of double precision machine tolerance, a larger root may be used needed, see help(boston) for an example)
returnHcov:: default TRUE, return the Vo matrix for a spatial Hausman test
pWOrder:: default 250, if returnHcov=TRUE and the method is not “eigen”, pass this order to powerWeights as the power series maximum limit
fdHess:: default NULL, then set to (method != "eigen") internally; use fdHess to compute an approximate Hessian using finite differences when using sparse matrix methods; used to make a coefficient covariance matrix when the number of observations is large; may be turned off to save resources if need be
optimHess:: default FALSE, use fdHess from nlme, if TRUE, use optim to calculate Hessian at optimum
optimHessMethod:: default “optimHess”, may be “nlm” or one of the optim methods
LAPACK:: default FALSE; logical value passed to qr in the SSE log likelihood function
compiled_sse:: default FALSE; logical value used in the log likelihood function to choose compiled code for computing SSE
Imult:: default 2; used for preparing the Cholesky decompositions for updating in the Jacobian function
super:: if NULL (default), set to FALSE to use a simplicial decomposition for the sparse Cholesky decomposition and method “Matrix_J”, set to as.logical(NA) for method “Matrix”, if TRUE, use a supernodal decomposition
cheb_q:: default 5; highest power of the approximating polynomial for the Chebyshev approximation
MC_p:: default 16; number of random variates
MC_m:: default 30; number of products of random variates matrix and spatial weights matrix
spamPivot:: default “MMD”, alternative “RCM”
in_coef: default 0.1, coefficient value for initial Cholesky decomposition in “spam_update”
type: default “MC”, used with method “moments”; alternatives “mult” and “moments”, for use if trs is missing, trW
correct: default TRUE, used with method “moments” to compute the Smirnov/Anselin correction term
trunc: default TRUE, used with method “moments” to truncate the Smirnov/Anselin correction term
SE_method: default “LU”, may be “MC”
nrho: default 200, as in SE toolbox; the size of the first stage lndet grid; it may be reduced to for example 40
interpn: default 2000, as in SE toolbox; the size of the second stage lndet grid
small_asy: default TRUE; if the method is not “eigen”, use asymmetric covariances rather than numerical Hessian ones if n <= small
small: default 1500; threshold number of observations for asymmetric covariances when the method is not “eigen”
SElndet: default NULL, may be used to pass a pre-computed SE toolbox style matrix of coefficients and their lndet values to the "SE_classic" and "SE_whichMin" methods
LU_order: default FALSE; used in “LU_prepermutate”, note warnings given for lu method
pre_eig: default NULL; may be used to pass a pre-computed vector of eigenvalues

Extra Bayesian control arguments

ldet_method

default “SE_classic”; equivalent to the method argument in lagsarlm

interval

default c(-1, 1); used unmodified or set internally by jacobianSetup

ndraw

default 2500L; integer total number of draws

nomit

default 500L; integer total number of omitted burn-in draws

thin

default 1L; integer thinning proportion

verbose

default FALSE; inverse of quiet argument in lagsarlm

detval

default NULL; not yet in use, precomputed matrix of log determinants

prior

a list with the following components:

lambdaMH: default FALSE; if TRUE use Metropolis or FALSE griddy Gibbs
Tbeta: default NULL; values of the betas variance-covariance matrix, set to diag(k)*1e+12 if NULL
c_beta: default NULL; values of the betas set to 0 if NULL
rho: default 0.5; value of the autoregressive coefficient
sige: default 1; value of the residual variance
nu: default 0; informative Gamma(nu,d0) prior on sige
d0: default 0; informative Gamma(nu,d0) prior on sige
a1: default 1.01; parameter for beta(a1,a2) prior on rho
a2: default 1.01; parameter for beta(a1,a2) prior on rho
cc: default 0.2; initial tuning parameter for M-H sampling
gG_sige: default TRUE; include sige in lambda griddy Gibbs update

Details

The asymptotic standard error of $\lambda$ is only computed when method=eigen, because the full matrix operations involved would be costly for large n typically associated with the choice of method="spam" or "Matrix". The same applies to the coefficient covariance matrix. Taken as the asymptotic matrix from the literature, it is typically badly scaled, being block-diagonal, and with the elements involving $\lambda$ being very small, while other parts of the matrix can be very large (often many orders of magnitude in difference). It often happens that the tol.solve argument needs to be set to a smaller value than the default, or the RHS variables can be centred or reduced in range.

Note that the fitted() function for the output object assumes that the response variable may be reconstructed as the sum of the trend, the signal, and the noise (residuals). Since the values of the response variable are known, their spatial lags are used to calculate signal components (Cressie 1993, p. 564). This differs from other software, including GeoDa, which does not use knowledge of the response variable in making predictions for the fitting data.

References

Cliff, A. D., Ord, J. K. 1981 Spatial processes, Pion; Ord, J. K. 1975 Estimation methods for models of spatial interaction, Journal of the American Statistical Association, 70, 120-126; Anselin, L. 1988 Spatial econometrics: methods and models. (Dordrecht: Kluwer); Anselin, L. 1995 SpaceStat, a software program for the analysis of spatial data, version 1.80. Regional Research Institute, West Virginia University, Morgantown, WV; Anselin L, Bera AK (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah A, Giles DEA (eds) Handbook of applied economic statistics. Marcel Dekker, New York, pp. 237-289; Cressie, N. A. C. 1993 Statistics for spatial data, Wiley, New York; LeSage J and RK Pace (2009) Introduction to Spatial Econometrics. CRC Press, Boca Raton.

Roger Bivand, Gianfranco Piras (2015). Comparing Implementations of Estimation Methods for Spatial Econometrics. Journal of Statistical Software, 63(18), 1-36. https://www.jstatsoft.org/v63/i18/.

Bivand, R. S., Hauke, J., and Kossowski, T. (2013). Computing the Jacobian in Gaussian spatial autoregressive models: An illustrated comparison of available methods. Geographical Analysis, 45(2), 150-179.

Examples

Run this code

# NOT RUN {
data(oldcol)
lw <- nb2listw(COL.nb, style="W")
ev <- eigenw(similar.listw(lw))
COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, quiet=FALSE, control=list(pre_eig=ev))
summary(COL.errW.eig)
COL.errW.eig_ev <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, control=list(pre_eig=ev))
all.equal(coefficients(COL.errW.eig), coefficients(COL.errW.eig_ev))
COL.errB.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="B"))
summary(COL.errB.eig)
W <- as(nb2listw(COL.nb), "CsparseMatrix")
trMatc <- trW(W, type="mult")
COL.errW.M <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="Matrix", quiet=FALSE, trs=trMatc)
summary(COL.errW.M)
COL.SDEM.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, etype="emixed", control=list(pre_eig=ev))
summary(COL.SDEM.eig)
COL.SDEM.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, Durbin=TRUE, control=list(pre_eig=ev))
summary(COL.SDEM.eig)
COL.SDEM.eig <- errorsarlm(CRIME ~ DISCBD + INC + HOVAL, data=COL.OLD,
 lw, Durbin=~INC, control=list(pre_eig=ev))
summary(COL.SDEM.eig)
summary(impacts(COL.SDEM.eig))
COL.SLX <- lmSLX(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw)
summary(COL.SLX)
summary(impacts(COL.SLX))
COL.SLX <- lmSLX(CRIME ~ INC + HOVAL + I(HOVAL^2), data=COL.OLD, listw=lw, Durbin=TRUE)
summary(impacts(COL.SLX))
summary(COL.SLX)
COL.SLX <- lmSLX(CRIME ~ INC + HOVAL + I(HOVAL^2), data=COL.OLD, listw=lw, Durbin=~INC)
summary(impacts(COL.SLX))
summary(COL.SLX)
COL.SLX <- lmSLX(CRIME ~ INC, data=COL.OLD, listw=lw)
summary(COL.SLX)
summary(impacts(COL.SLX))
# }
# NOT RUN {
crds <- cbind(COL.OLD$X, COL.OLD$Y)
mdist <- sqrt(sum(diff(apply(crds, 2, range))^2))
dnb <- dnearneigh(crds, 0, mdist)
dists <- nbdists(dnb, crds)
f <- function(x, form, data, dnb, dists, verbose) {
  glst <- lapply(dists, function(d) 1/(d^x))
  lw <- nb2listw(dnb, glist=glst, style="B")
  res <- logLik(lmSLX(form=form, data=data, listw=lw))
  if (verbose) cat("power:", x, "logLik:", res, "\n")
  res
}
opt <- optimize(f, interval=c(0.1, 4), form=CRIME ~ INC + HOVAL,
 data=COL.OLD, dnb=dnb, dists=dists, verbose=TRUE, maximum=TRUE)
glst <- lapply(dists, function(d) 1/(d^opt$maximum))
lw <- nb2listw(dnb, glist=glst, style="B")
SLX <- lmSLX(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw)
summary(SLX)
summary(impacts(SLX))
# }
# NOT RUN {
NA.COL.OLD <- COL.OLD
NA.COL.OLD$CRIME[20:25] <- NA
COL.err.NA <- errorsarlm(CRIME ~ INC + HOVAL, data=NA.COL.OLD,
 nb2listw(COL.nb), na.action=na.exclude)
COL.err.NA$na.action
COL.err.NA
resid(COL.err.NA)
# }
# NOT RUN {
lw <- nb2listw(COL.nb, style="W")
print(system.time(ev <- eigenw(similar.listw(lw))))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="eigen", control=list(pre_eig=ev))))
ocoef <- coefficients(COL.errW.eig)
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="eigen", control=list(pre_eig=ev, LAPACK=FALSE))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="eigen", control=list(pre_eig=ev, compiled_sse=TRUE))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="Matrix_J", control=list(super=TRUE))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="Matrix_J", control=list(super=FALSE))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="Matrix_J", control=list(super=as.logical(NA)))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="Matrix", control=list(super=TRUE))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="Matrix", control=list(super=FALSE))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="Matrix", control=list(super=as.logical(NA)))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="spam", control=list(spamPivot="MMD"))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="spam", control=list(spamPivot="RCM"))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="spam_update", control=list(spamPivot="MMD"))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
print(system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 lw, method="spam_update", control=list(spamPivot="RCM"))))
print(all.equal(ocoef, coefficients(COL.errW.eig)))
# }
# NOT RUN {
if (require(coda, quietly=TRUE)) {
set.seed(1)
COL.err.Bayes <- spBreg_err(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw)
print(summary(COL.err.Bayes))
print(raftery.diag(COL.err.Bayes, r=0.01))
set.seed(1)
COL.err.Bayes <- spBreg_err(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw,
 control=list(prior=list(lambdaMH=TRUE)))
print(summary(COL.err.Bayes))
print(raftery.diag(COL.err.Bayes, r=0.01))
set.seed(1)
COL.err.Bayes <- spBreg_err(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw,
 Durbin=TRUE)
print(summary(COL.err.Bayes))
print(summary(impacts(COL.err.Bayes)))
print(raftery.diag(COL.err.Bayes, r=0.01))
set.seed(1)
COL.err.Bayes <- spBreg_err(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw,
 Durbin=TRUE, control=list(prior=list(lambdaMH=TRUE)))
print(summary(COL.err.Bayes))
print(summary(impacts(COL.err.Bayes)))
print(raftery.diag(COL.err.Bayes, r=0.01))
set.seed(1)
COL.err.Bayes <- spBreg_err(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw,
 Durbin=~INC)
print(summary(COL.err.Bayes))
print(summary(impacts(COL.err.Bayes)))
print(raftery.diag(COL.err.Bayes, r=0.01))
set.seed(1)
COL.err.Bayes <- spBreg_err(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw,
 Durbin=~INC, control=list(prior=list(lambdaMH=TRUE)))
print(summary(COL.err.Bayes))
print(summary(impacts(COL.err.Bayes)))
print(raftery.diag(COL.err.Bayes, r=0.01))
}
# }

Run the code above in your browser using DataLab