Uses bootstrap methods to compute approximate confidence intervals for error variances in a heteroskedastic linear regression model based on an auxiliary linear variance model (ALVM) or auxiliary nonlinear variance model (ANLVM).
avm.ci(
object,
bootobject = NULL,
bootavmobject = NULL,
jackobject = NULL,
bootCImethod = c("pct", "bca", "stdnorm"),
bootsampmethod = c("pairs", "wild"),
Bextra = 500L,
Brequired = 1000L,
conf.level = 0.95,
expand = TRUE,
retune = FALSE,
resfunc = c("identity", "hccme"),
qtype = 6,
rm_on_constraint = TRUE,
rm_nonconverged = TRUE,
jackknife_point = FALSE,
...
)
An object of class "avm.ci", containing the following:
climits, an \(n\times 2\) matrix with lower confidence
limits in the first column and upper confidence limits in the second
var.est, a vector of length \(n\) of point estimates
\(\hat{\omega}\) of the error variances. This is the same vector
passed within object, unless jackknife_point is TRUE.
conf.level, corresponding to the eponymous argument
bootCImethod, corresponding to the eponymous argument
bootsampmethod, corresponding to the eponymous argument or
otherwise extracted from bootobject
object
An object of class "alvm.fit" or of class "anlvm.fit", containing
information on a fitted ALVM or ANLVM
bootobject
An object of class "bootlm", containing information
on a set of \(B\) bootstrapped versions of a linear regression model,
obtained by a nonparametric bootstrap method suitable for heteroskedastic
linear models. If set to NULL (the default), it is generated by
calling bootlm.
bootavmobject
An object of class "bootavm", containing
information on an ALVM or ANLVM fit to \(B\) bootstrapped linear
regression models. If set to NULL (the default), it is generated
by calling the non-exported function bootavm.
jackobject
An object of class "jackavm", containing
information on ALVMs or ANLVMs fit to jackknife versions of a linear
regression model. If set to NULL (the default), it is generated
by calling the non-exported function jackavm.
bootCImethod
A character specifying the method to use when computing
the approximate bootstrap confidence interval. The default, "pct",
corresponds to the percentile interval. "bca" corresponds to the
Bias-Corrected and accelerated (BCa) modification of the percentile
interval. "stdnorm" corresponds to a naive standard normal
interval with bootstrap standard error estimates.
bootsampmethod
A character specifying the method to use for
generating nonparametric bootstrap linear regression models. Corresponds
to the sampmethod argument of bootlm and defaults to "pairs".
Warning: in simulations, bootstrap intervals
computed using the wild bootstrap have shown very poor coverage
probabilities. Ignored unless bootobject is NULL.
Bextra
An integer indicating the maximum number of additional
bootstrap models that should be fitted in an attempt to obtain
Brequired appropriate sets of bootstrap variance estimates, as
explained under Brequired. Defaults to 500L.
Ignored if rm_on_constraint is set to FALSE (for an ALVM)
or if rm_nonconverged is set to FALSE (for an ANLVM).
Brequired
An integer indicating the number of bootstrap regression
models that should be used to compute the bootstrap confidence intervals.
The default behaviour is to base the interval estimates only on bootstrap
ALVM variance estimates that are not on the constraint boundary or on
bootstrap ANLVMs where the estimation algorithm converged. Consequently,
if this is not the case for all of the first Brequired bootstrap
models, additional bootstrap models are used (up to a maximum of
Bextra). Defaults to 1000L.
conf.level
A double representing the confidence level \(1-\alpha\);
must be between 0 and 1. Defaults to 0.95.
expand
A logical specifying whether to implement the expansion
technique described in Hesterberg (2015).
Defaults to TRUE.
retune
A logical specifying whether to re-tune hyperparameters and
re-select features each time an ALVM or (in the case of feature
selection) ANLVM is fit to a bootstrapped regression model. If
FALSE (the default), the hyperparameter value and selected
features from the ALVM fit to the original model are reused in every
bootstrap model. Setting to TRUE is more theoretically sound but
increases computation time substantially.
resfunc
Either a character naming a function to call to apply a
transformation to the Ordinary Least Squares residuals, or a function
to apply for the same purpose. This argument is ignored if
bootsampmethod is "pairs". The only two character values
accepted are "identity", in which case no transformation is
applied to the residuals, and "hccme", in which case the
transformation corresponds to a heteroskedasticity-consistent
covariance matrix estimator calculated from hccme. If
resfunc is a function, it is assumed that its first argument
is the numeric vector of residuals.
qtype
A numeric corresponding to the type argument of quantile.
Defaults to 6.
rm_on_constraint
A logical specifying whether to exclude
bootstrapped ALVMs from the interval estimation method where the ALVM
parameter estimate falls on the constraint boundary. Defaults to
TRUE.
rm_nonconverged
A logical specifying whether to exclude bootstrapped
ANLVMs from the interval estimation method where the optimisation
algorithm used in quasi-likelihood estimation of the ANLVM parameter did
not converge. Defaults to TRUE.
jackknife_point
A logical specifying whether to replace the point
estimates of the error variances \(\omega\) with jackknife estimates
based only on the leave-one-out auxiliary models where the parameter
estimates do not lie on the constraint boundary (in the ALVM case) or
where the quasi-likelihood estimation algorithm converged (in the ANLVM
case). Defaults to FALSE.
...
Other arguments to pass to non-exported helper functions
\(B\) resampled versions of the original linear regression model
(which can be accessed using object$ols) are generated using a
nonparametric bootstrap method that is suitable for heteroskedastic
linear regression models, namely either the pairs bootstrap or the wild
bootstrap (bootstrapping residuals is not suitable). Depending on
the class of object, either an ALVM or an ANLVM is fit to each of
the bootstrapped regression models. The distribution of the \(B\)
bootstrap estimates of each error variance \(\omega_i\),
\(i=1,2,\ldots,n\), is used to construct an approximate confidence
interval for \(\omega_i\). This is done using one of three methods.
The first is the percentile interval, which simply takes the empirical
\(\alpha/2\) and \(1-\alpha/2\) quantiles of the \(B\) bootstrap
estimates of \(\omega_i\). The second is the Bias-Corrected and accelerated
(BCa) method as described in Efron and Tibshirani (1993),
which is intended to improve on the percentile interval (although
simulations have not found it to yield better coverage probabilities).
The third is the naive standard normal interval, which takes
\(\hat{\omega}_i \pm z_{1-\alpha/2} \widehat{\mathrm{SE}}\), where
\(\widehat{\mathrm{SE}}\) is the standard deviation of the \(B\)
bootstrap estimates of \(\omega_i\). By default, the expansion
technique described in Hesterberg (2015) is
also applied; evidence from simulations suggests that this does
improve coverage probabilities.
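The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: var_est is a hypothetical stand-in that regresses squared OLS residuals on the predictors instead of fitting an ALVM or ANLVM, and the adjusted level alpha_exp follows the expanded percentile adjustment given by Hesterberg (2015), which is assumed here to be the form of expansion meant.

```r
# Sketch only: a crude linear variance model stands in for the ALVM/ANLVM.
set.seed(1)
dat <- mtcars[, c("mpg", "wt", "qsec")]
n <- nrow(dat)
B <- 200L       # number of bootstrap models (small, for illustration)
alpha <- 0.05   # 1 - conf.level

# Hypothetical helper (not part of skedastic): regress squared OLS
# residuals on the predictors, predict the variances at the original
# design points, and truncate at a small positive value.
var_est <- function(fit_data, new_data) {
  ols <- lm(mpg ~ wt + qsec, data = fit_data)
  fit_data$e2 <- unname(resid(ols))^2
  aux <- lm(e2 ~ wt + qsec, data = fit_data)
  pmax(unname(predict(aux, newdata = new_data)), 1e-8)
}

# Point estimates of the n error variances from the original data
omega_hat <- var_est(dat, dat)

# Pairs bootstrap: resample whole rows (x_i, y_i) with replacement,
# refit, and estimate the variances of the original observations.
boot_est <- t(replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  var_est(dat[idx, ], dat)
}))  # B x n matrix, one row per bootstrap model

# Expanded level (assumed form): alpha'/2 = pnorm(-sqrt(n/(n-1)) * t quantile)
alpha_exp <- 2 * pnorm(-sqrt(n / (n - 1)) * qt(1 - alpha / 2, df = n - 1))

# Percentile interval: empirical quantiles of each column, quantile type 6
probs <- c(alpha_exp / 2, 1 - alpha_exp / 2)
pct <- t(apply(boot_est, 2, quantile, probs = probs, type = 6))

# Naive standard normal interval: point estimate +/- z * bootstrap SE
se <- apply(boot_est, 2, sd)
z <- qnorm(1 - alpha / 2)
stdnorm <- cbind(lower = omega_hat - z * se, upper = omega_hat + z * se)
```

Here pct and stdnorm are both \(n\times 2\) matrices of lower and upper limits, mirroring the layout of climits. Nothing constrains the lower limit of the standard normal interval to be positive, which is one reason it is described as naive.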
Technically, the hyperparameters of the ALVM, such as \(\lambda\) (for a
penalised polynomial or thin-plate spline model) or \(n_c\) (for a
clustering model) should be re-tuned every time the ALVM is fitted to
another bootstrapped regression model. However, due to the computational
cost, this is not done by avm.ci unless retune is set to TRUE.
When obtained from ALVMs, bootstrap estimates of \(\omega_i\) that fall on
the constraint boundary (i.e., are estimated to be near 0) are ignored
by default; there is an attempt to obtain Brequired bootstrap
estimates of each \(\omega_i\) that do not fall on the constraint
boundary. This fine-tuning can be turned off by setting the
rm_on_constraint argument to FALSE; the amount of effort
put into obtaining non-boundary estimates is controlled using the
Bextra argument. When ANLVMs are used, the default behaviour is to
try to obtain Brequired bootstrap estimates of \(\omega\) where
the Gauss-Newton algorithm applied for quasi-likelihood estimation
has converged, and ignore estimates obtained from non-convergent cases.
This behaviour can be toggled using the rm_nonconverged argument.
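The interplay between Brequired and Bextra can be sketched as follows. This is an illustrative sketch with simulated data, not the package's code: the flag on_boundary is hypothetical and simply marks which bootstrap fits are treated as unusable (on the constraint boundary for an ALVM; the same logic would apply to a non-convergence flag for an ANLVM).

```r
# Sketch only: simulated estimates and a hypothetical boundary flag.
set.seed(42)
Brequired <- 10L   # usable bootstrap fits wanted
Bextra <- 5L       # additional fits allowed as replacements
n <- 4L            # number of observations
Btotal <- Brequired + Bextra

# Simulated bootstrap variance estimates (one row per bootstrap model)
boot_est <- matrix(rexp(Btotal * n), nrow = Btotal, ncol = n)

# Hypothetical flag: TRUE where a bootstrap fit is on the constraint boundary
on_boundary <- runif(Btotal) < 0.2

rm_on_constraint <- TRUE
usable <- if (rm_on_constraint) {
  which(!on_boundary)      # skip boundary fits, drawing on the extras
} else {
  seq_len(Brequired)       # use the first Brequired fits regardless
}

# Keep at most Brequired usable fits, in the order they were generated
keep <- head(usable, Brequired)
boot_used <- boot_est[keep, , drop = FALSE]

if (length(keep) < Brequired) {
  message("Only ", length(keep), " usable bootstrap fits were obtained")
}
```

With rm_on_constraint set to FALSE the extra models are never consulted, which is consistent with Bextra being ignored in that case.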
alvm.fit, anlvm.fit, Efron and Tibshirani (1993)
library(skedastic)
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
myalvm <- alvm.fit(mtcars_lm, model = "cluster")
# Brequired would of course not be so small in practice
ci.alvm <- avm.ci(myalvm, Brequired = 5)