Auxiliary function (i.e. not intended for the average user) that enables block-based GETS-modelling with user-specified estimator, diagnostics and goodness-of-fit criterion.
blocksFun(y, x, untransformed.residuals=NULL, blocks=NULL,
no.of.blocks=NULL, max.block.size=30, ratio.threshold=0.8,
gets.of.union=TRUE, force.invertibility=FALSE,
user.estimator=list(name="ols"), t.pval=0.001, wald.pval=t.pval,
do.pet=FALSE, ar.LjungB=NULL, arch.LjungB=NULL, normality.JarqueB=NULL,
user.diagnostics=NULL, gof.function=list(name="infocrit"),
gof.method=c("min", "max"), keep=NULL, include.gum=FALSE,
include.1cut=FALSE, include.empty=FALSE, max.paths=NULL,
turbo=FALSE, parallel.options=NULL, tol=1e-07, LAPACK=FALSE,
max.regs=NULL, print.searchinfo=TRUE, alarm=FALSE)
A list with the results of the block-based GETS-modelling.
y: a numeric vector (with no missing values, i.e. no non-numeric 'holes')
x: a matrix, or a list of matrices
untransformed.residuals: NULL (default) or, when ols is used with method=6 in user.estimator, a numeric vector containing the untransformed residuals
blocks: NULL (default) or a list of lists with vectors of integers that indicate how blocks should be put together. If NULL, then the block composition is undertaken automatically by an internal algorithm that depends on no.of.blocks, max.block.size and ratio.threshold
no.of.blocks: NULL (default) or integer. If NULL, then the number of blocks is determined automatically by an internal algorithm
max.block.size: integer that controls the size of blocks
ratio.threshold: numeric between 0 and 1 that controls the minimum ratio of variables in each block to total observations
gets.of.union: logical. If TRUE (default), then GETS modelling is undertaken of the union of retained variables. Otherwise it is not
force.invertibility: logical. If TRUE, then the x-matrix is ensured to have full row-rank before it is passed on to getsFun
user.estimator: list, see getsFun for the details
t.pval: numeric value between 0 and 1. The significance level used for the two-sided coefficient significance t-tests
wald.pval: numeric value between 0 and 1. The significance level used for the Parsimonious Encompassing Tests (PETs)
do.pet: logical. If TRUE, then a Parsimonious Encompassing Test (PET) against the GUM is undertaken at each variable removal for the joint significance of all the deleted regressors along the current GETS path. If FALSE, then a PET is not undertaken at each removal
ar.LjungB: a two-element vector, or NULL. In the former case, the first element contains the AR-order, the second element the significance level. If NULL, then a test for autocorrelation in the residuals is not conducted
arch.LjungB: a two-element vector, or NULL. In the former case, the first element contains the ARCH-order, the second element the significance level. If NULL, then a test for ARCH in the residuals is not conducted
normality.JarqueB: NULL or a numeric value between 0 and 1. In the latter case, a test for non-normality in the residuals is conducted using a significance level equal to normality.JarqueB. If NULL, then no test for non-normality is conducted
user.diagnostics: NULL (default) or a list with two entries, name and pval. See getsFun for the details
gof.function: list. The first item should be named name and contain the name (a character) of the Goodness-of-Fit (GOF) function used. Additional items in the list gof.function are passed on as arguments to the GOF-function. See getsFun for the details
gof.method: character. Determines whether the best Goodness-of-Fit is a minimum (default) or maximum
keep: NULL (default), vector of integers or a list of vectors of integers. In the latter case, the number of vectors should be equal to the number of matrices in x
include.gum: logical. If TRUE, then the GUM (i.e. the starting model) is included among the terminal models
include.1cut: logical. If TRUE, then the 1-cut model is added to the list of terminal models
include.empty: logical. If TRUE, then the empty model is added to the list of terminal models
max.paths: NULL (default) or integer greater than 0. If NULL, then there is no limit to the number of paths. If integer (e.g. 1), then this integer constitutes the maximum number of paths searched (e.g. a single path)
turbo: logical. If TRUE, then (parts of) paths are not searched twice (or more) unnecessarily in each GETS modelling. Setting turbo to TRUE entails a small additional computational cost, but this cost may be outweighed substantially if estimation is slow, or if the number of variables to delete in each path is large
parallel.options: NULL or integer that indicates the number of cores/threads to use for parallel computing (implemented with makeCluster and parLapply)
tol: numeric value, the tolerance for detecting linear dependencies in the columns of the variance-covariance matrix when computing the Wald-statistic used in the Parsimonious Encompassing Tests (PETs), see the qr.solve function
LAPACK: currently not used
max.regs: integer. The maximum number of regressions along a deletion path. Do not alter unless you know what you are doing!
print.searchinfo: logical. If TRUE (default), then a message is printed whenever simplification along a new path is started
alarm: logical. If TRUE, then a sound or beep is emitted (in order to alert the user) when the model selection ends
Genaro Sucarrat, with contributions from Jonas Kurle, Felix Pretis and James Reade
blocksFun undertakes block-based GETS modelling by a repeated but structured call to getsFun. For the details of how to user-specify an estimator via user.estimator, diagnostics via user.diagnostics and a goodness-of-fit function via gof.function, see documentation of getsFun under "Details".
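As a hedged illustration of the user-specification described above, the sketch below wraps lm() in an estimator function and passes it by name via user.estimator, together with an information criterion chosen through gof.function. The function name lmFun, the simulated data and the choice method="aic" are assumptions made for this example. The returned entries (n, k, df, coefficients, vcov, logl) are what the t-tests and infocrit are assumed to need here; consult the getsFun documentation before relying on them. The blocksFun call is guarded so that the sketch also runs when the gets package is not installed.

```r
## illustrative user-specified estimator built on lm();
## 'lmFun' is a hypothetical name chosen for this sketch
lmFun <- function(y, x, ...) {
  result <- list()
  ## n, k and degrees of freedom:
  result$n <- length(y)
  result$k <- if (is.null(x)) 0 else NCOL(x)
  result$df <- result$n - result$k
  if (result$k > 0) {
    ## regression without intercept on the supplied regressors:
    tmp <- lm(y ~ x - 1)
    result$coefficients <- coef(tmp)
    result$vcov <- vcov(tmp)
    result$logl <- as.numeric(logLik(tmp))
  } else {
    ## empty model: Gaussian log-likelihood of y itself
    result$logl <- sum(dnorm(y, sd = sqrt(var(y)), log = TRUE))
  }
  result
}

## simulated data (assumption: same shape as in the examples of this page)
set.seed(123)
y <- rnorm(20)
x <- matrix(rnorm(length(y) * 40), length(y), 40)

## guarded call: only executed if the gets package is available
if (requireNamespace("gets", quietly = TRUE)) {
  library(gets)
  res <- blocksFun(y, x, user.estimator = list(name = "lmFun"),
                   gof.function = list(name = "infocrit", method = "aic"),
                   print.searchinfo = FALSE)
  print(names(res))
}
```

The estimator is looked up by name, so lmFun should be defined at the top level of the session before blocksFun is called.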
The algorithm of blocksFun is similar to that of isat, but more flexible. The main use of blocksFun is the creation of user-specified methods that employ block-based GETS modelling, e.g. indicator saturation techniques.
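The block-based idea can be sketched in base R. This is an illustration only and not the internal algorithm of blocksFun: backwardSelect() is a hypothetical single-path stand-in for getsFun that drops the least significant regressor until every remaining p-value is at most t.pval, and the data-generating process, block size and significance level are assumptions chosen for the example. Multi-path search, diagnostics and PETs are all ignored.

```r
## hypothetical stand-in for getsFun(): single-path backward elimination
backwardSelect <- function(y, x, idx, t.pval = 0.05) {
  while (length(idx) > 0) {
    fit <- lm(y ~ x[, idx, drop = FALSE] - 1)
    pvals <- summary(fit)$coefficients[, 4]
    if (max(pvals) <= t.pval) break   # all remaining regressors significant
    idx <- idx[-which.max(pvals)]     # drop the least significant regressor
  }
  idx
}

## simulated data: more regressors (40) than observations (20)
set.seed(123)
n <- 20
x <- matrix(rnorm(n * 40), n, 40)
y <- 2 * x[, 5] + rnorm(n)   # assumed DGP: only regressor 5 matters

## 1) split the 40 columns into blocks of at most 10 regressors:
max.block.size <- 10
cols <- seq_len(ncol(x))
blocks <- split(cols, ceiling(cols / max.block.size))

## 2) run the selection separately on each block:
retained <- lapply(blocks, function(idx) backwardSelect(y, x, idx))

## 3) take the union of the survivors and run the selection once more:
unionIdx <- sort(unique(unlist(retained)))
final <- backwardSelect(y, x, unionIdx)
final
```

Step 3 corresponds loosely to what gets.of.union=TRUE requests, while steps 1 and 2 mimic the role of the blocks, no.of.blocks and max.block.size arguments.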
F. Pretis, J. Reade and G. Sucarrat (2018): 'Automated General-to-Specific (GETS) Regression Modeling and Indicator Saturation for Outliers and Structural Breaks'. Journal of Statistical Software 86, Number 3, pp. 1-44
G. Sucarrat (2020): 'User-Specified General-to-Specific and Indicator Saturation Methods'. The R Journal 12 issue 2, pp. 388-401, https://journal.r-project.org/archive/2021/RJ-2021-024/
getsFun, ols, diagnostics, infocrit and isat
## more variables than observations:
y <- rnorm(20)
x <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, x)
## 'x' as list of matrices:
z <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, list(x,z))
## ensure regressor no. 3 in matrix no. 2 is not removed:
blocksFun(y, list(x,z), keep=list(integer(0), 3))
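A further sketch, in the spirit of the indicator saturation use mentioned in the details: the regressor matrix consists of one impulse dummy per observation, so there are as many candidate variables as observations. The outlier at observation 10, the column names and the seed are assumptions made for this illustration, and no particular selection outcome is claimed. The blocksFun call is guarded so that the sketch also runs when the gets package is not installed.

```r
## simulated series with one hypothetical outlier at observation 10
set.seed(123)
y <- rnorm(20)
y[10] <- y[10] + 6

## one impulse dummy per observation (names are illustrative):
x <- diag(length(y))
colnames(x) <- paste0("iis", seq_along(y))

## guarded call: only executed if the gets package is available
if (requireNamespace("gets", quietly = TRUE)) {
  library(gets)
  res <- blocksFun(y, x, print.searchinfo = FALSE)
  print(names(res))
}
```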