Here is a description of some common and
typical arguments found in many VGAM
family functions, e.g.,
zero
,
lsigma
,
isigma
,
gsigma
,
eq.mean
,
nsimEI
and
parallel
.
TypicalVGAMfamilyFunction(lsigma = "loglink",
isigma = NULL,
zero = NULL, gsigma = exp(-5:5),
eq.mean = FALSE,
parallel = TRUE,
imethod = 1,
vfl = FALSE, Form2 = NULL,
type.fitted = c("mean", "quantiles", "Qlink",
"pobs0", "pstr0", "onempstr0"),
percentiles = c(25, 50, 75),
probs.x = c(0.15, 0.85),
probs.y = c(0.25, 0.50, 0.75),
multiple.responses = FALSE, earg.link = FALSE,
ishrinkage = 0.95, nointercept = NULL,
whitespace = FALSE, bred = FALSE, lss = TRUE,
oim = FALSE, nsimEIM = 100, byrow.arg = FALSE,
link.list = list("(Default)" = "identitylink",
x2 = "loglink",
x3 = "logofflink",
x4 = "multilogitlink",
x5 = "multilogitlink"),
earg.list = list("(Default)" = list(),
x2 = list(),
x3 = list(offset = -1),
x4 = list(),
x5 = list()),
Thresh = NULL, nrfs = 1)
An object of class "vglmff"
(see
vglmff-class
). The object
is used by modelling functions such as
vglm
and vgam
.
Character.
Link function applied to a parameter and not
necessarily a mean. See Links
for a selection of choices. If there is only
one parameter then this argument is often
called link
.
Optional initial values can often be
inputted using an argument beginning with
"i"
. For example, "isigma"
and
"ilocation"
, or just "init"
if there is one parameter. A value of
NULL
means a value is computed
internally, i.e., a self-starting
VGAM family function. If a failure
to converge occurs make use of these types
of arguments.
An important argument, either an integer vector, or a vector of character strings.
If an integer, then it specifies which
linear/additive predictor is modelled
as intercept-only. That is,
the regression coefficients are set to
zero for all covariates except for the
intercept. If zero
is specified
then it may be a vector with values from
the set \(\{1,2,\ldots,M\}\). The value
zero = NULL
means model all
linear/additive predictors as functions of
the explanatory variables. Here, \(M\)
is the number of linear/additive predictors.
Technically, if zero
contains the
value \(j\) then the \(j\)th row of every
constraint matrix (except for the intercept)
consists of all 0 values.
Some VGAM family functions allow the
zero
argument to accept negative
values; if so then its absolute value is
recycled over each (usual) response. For
example, zero = -2
for the
two-parameter negative binomial distribution
would mean, for each response, the second
linear/additive predictor is modelled as
intercepts-only. That is, for all the \(k\)
parameters in negbinomial
(this VGAM family function can handle
a matrix of responses).
Suppose zero = zerovec
where
zerovec
is a vector of negative
values. If \(G\) is the usual \(M\)
value for a univariate response then the
actual values for argument zero
are all values in c(abs(zerovec), G +
abs(zerovec), 2*G + abs(zerovec), ... )
lying
in the integer range \(1\) to \(M\).
For example, setting zero = -c(2,
3)
for a matrix response of 4 columns with
zinegbinomial
(which usually
has \(G = M = 3\) for a univariate response)
would be equivalent to
zero = c(2, 3, 5, 6, 8, 9, 11, 12)
.
This example has \(M = 12\).
Note that if zerovec
contains
negative values then their absolute values
should be elements from the set 1:G
.
Note: zero
may have positive and
negative values, for example, setting
zero = c(-2, 3)
in the above example
would be equivalent to
zero = c(2, 3, 5, 8, 11)
.
The argument zero
also accepts
a character vector (for VGAM
1.0-1 onwards). Each value is fed into
grep
with fixed =
TRUE
, meaning that wildcards "*"
are not useful. See the example below---all
the variants work; those with LOCAT
issue a warning that that value is unmatched.
Importantly, the parameter names are
c("location1", "scale1", "location2",
"scale2")
because there are 2 responses. Yee
(2015) described zero
for only numerical
input. Allowing character input is particularly
important when the number of parameters cannot
be determined without having the actual data
first. For example, with time series data,
an ARMA(\(p\),\(q\)) process might have
parameters \(\theta_1,\ldots,\theta_p\) which
should be intercept-only by default. Then
specifying a numerical default value for
zero
would be too difficult (there
are the drift and scale parameters too).
However, it is possible with the character
representation: zero = "theta"
would
achieve this. In the future, most VGAM
family functions might be converted
to the character representation---the
advantage being that it is more readable.
When programming a VGAM family function
that allows character input, the variable
predictors.names
must be assigned
correctly.
If the constraints
argument is used
then the family function's zero
argument (if it exists) needs to be set to
NULL
. This avoids what could be a
probable contradiction. Sometimes setting
other arguments related to constraint matrices
to FALSE
is also a good idea, e.g.,
parallel = FALSE
,
exchangeable = FALSE
.
Grid-search initial values can be inputted
using an argument beginning with "g"
,
e.g., "gsigma"
, "gshape"
and
"gscale"
. If argument isigma
is inputted then that has precedence over
gsigma
, etc.
If the grid search is 2-dimensional then it
is advisable not to make the vectors too long
as a nested for
loop may be used.
Ditto for 3-dimensions etc. Sometimes a
".mux"
is added as a suffix, e.g.,
gshape.mux
; this means that the grid
is created relatively and not absolutely,
e.g., its values are multipled by some single
initial estimate of the parameter in order
to create the grid on an absolute scale.
Some family functions have an argument
called gprobs.y
. This is fed
into the probs
argument of
quantile
in order to obtain some values of central
tendency of the response, i.e., some spread
of values in the middle. when imethod = 1
to obtain an initial value for the mean
Some family functions have an argument called
iprobs.y
, and if so, then these values
can overwrite gprobs.y
.
Logical.
Constrain all the means to be equal?
This type of argument is simpler than
parallel
because only a single
TRUE
or FALSE
can be assigned
and
not a formula. Thus if TRUE
then it
will be enforced over all variables.
A logical, or a simple formula specifying
which terms have equal/unequal coefficients.
The formula must be simple, i.e., additive
with simple main effects terms. Interactions
and nesting etc. are not handled. To handle
complex formulas use the constraints
argument (of vglm
etc.);
however, there is a lot more setting up
involved and things will not be as convenient.
Here are some examples.
1. parallel = TRUE ~ x2 + x5
means
the parallelism assumption
is only applied to \(X_2\), \(X_5\) and
the intercept.
2. parallel = TRUE ~ -1
and parallel = TRUE ~ 0
mean the parallelism assumption
is applied to no variables at all.
Similarly,
parallel = FALSE ~ -1
and
parallel = FALSE ~ 0
mean the parallelism assumption
is applied to all the variables
including the intercept.
3. parallel = FALSE ~ x2 - 1
and parallel = FALSE ~ x2 + 0
applies the parallelism constraint to all terms
(including the intercept) except for \(X_2\).
4. parallel = FALSE ~ x2 * x3
probably will not work. Instead, expand it
out manually to get
parallel = FALSE ~ x2 + x3 + x2:x3
,
and that should work.
That's because the main formula processes or
expands the "*"
operator but parallel
does not.
5. To check whether parallel
has done
what was expected, type
coef(fit, matrix = TRUE)
or
constraints(fit)
for confirmation.
This argument is common in VGAM family
functions for categorical responses, e.g.,
cumulative
, acat
,
cratio
, sratio
.
For the proportional odds model
(cumulative
) having parallel
constraints applied to each explanatory
variable (except for the intercepts) means the
fitted probabilities do not become negative
or greater than 1. However this parallelism
or proportional-odds assumption ought to
be checked.
Some VGAM family functions use
simulation to obtain an approximate expected
information matrix (EIM). For those that
do, the nsimEIM
argument specifies
the number of random variates used per
observation; the mean of nsimEIM
random
variates is taken. Thus nsimEIM
controls the accuracy and a larger value
may be necessary if the EIMs are not
positive-definite. For intercept-only models
(y ~ 1)
the value of nsimEIM
can be smaller (since the common value used
is also then taken as the mean over the
observations), especially if the number of
observations is large.
Some VGAM family functions provide
two algorithms for estimating the EIM.
If applicable, set nsimEIM = NULL
to choose the other algorithm.
An integer with value 1
or 2
or 3
or ... which specifies the
initialization method for some parameters or
a specific parameter. If failure to converge
occurs try the next higher value, and continue
until success.
For example, imethod = 1
might
be the method of moments, and
imethod = 2
might be another method.
If no value of imethod
works then
it will be necessary to use arguments such
as isigma
. For many VGAM
family functions it is advisable to try this
argument with all possible values to safeguard
against problems such as converging to a local
solution. VGAM family functions with
this argument usually correspond to a model
or distribution that is relatively hard to
fit successfully, therefore care is needed
to ensure the global solution is obtained.
So using all possible values that this
argument supplies is a good idea.
VGAM family functions such
genpoisson2
recycle
imethod
to be of length 2 corresponding
to the 2 parameters. In the future, this
feature will be extended to other family
functions to confer more flexibility.
Formula.
Using applied to models with \(M=2\).
Specifies the terms for \(\eta_2\)
and the other terms belong to \(\eta_1\).
It is a way to partition the X
matrix into two sets of covariates,
where they are assigned to each \(\eta_j\)
separately.
This argument sets up constraint matrices
rbind(0, 1)
for terms in Form2
and rbind(1, 0)
for
setdiff(formula, Form2)
so to speak.
Note that
sometimes this argument is only accessed
if vfl = TRUE
.
Arguments such as Form1
and
Form3
are also possible in
VGAM family functions because
the \(\eta_j\) which is likely to be
modelled more simply is
chosen for convenience.
A single logical.
This stands for
variance--variance factored loglinear
(VFL)
model.
If TRUE
then usually
some other argument such
as Form2
or parallel
is used to partition the main
vglm
formula
into two sets of covariates.
For some families
such as negbinomial
this enables overdispersion to be modelled
conveniently via a loglinear model,
given the mean.
It is necessary to read the online help
regarding each VGAM family function
because each one may different from others.
To fit some VFL models it is necessary to
make a copy of existing covariates by
creating new variable names and then
adding it to the main formula.
A good question is:
why is vfl
necessary?
Wouldn't Form2
be sufficient?
Setting vfl = TRUE
enables
some internal checking such
as avoiding conflicts.
For example, it is often necessary to set
zero = NULL
and
parallel = FALSE
, otherwise
there would be contradictions.
Character.
Type of fitted value returned by
the fitted()
methods function.
The first choice is always the default.
The available choices depends on what kind
of family function it is. Using the first
few letters of the chosen choice is okay.
See fittedvlm
for more details.
The choice "Qlink"
refers to
quantile-links, which was introduced in
December 2018 in VGAMextra 0.0-2
for several 1-parameter distributions.
Here, either the loglink
or logitlink
or
identitylink
of the quantile
is the link function (and the choice is
dependent on the support of the distribution),
and link functions end in "Qlink"
.
A limited amount of support is provided
for such links, e.g., fitted(fit)
are the fitted quantiles, which is the same
as predict(fit, type = "response")
.
However, fitted(fit, percentiles = 77)
will not work.
Numeric vector, with values between 0 and
100 (although it is not recommended that
exactly 0 or 100 be inputted). Used only
if type.fitted = "quantiles"
or type.fitted = "percentiles"
,
then this argument specifies the values of
these quantiles. The argument name tries
to reinforce that the values lie between 0
and 100. See fittedvlm
for
more details.
Numeric, with values in (0, 1).
The probabilites that define quantiles with
respect to some vector, usually an x
or
y
of some sort. This is used to create
two subsets of data corresponding to `low'
and `high' values of x or y. Each value is
separately fed into the probs
argument
of quantile
.
If the data set size is small then it may be
necessary to increase/decrease slightly the
first/second values respectively.
Logical.
This stands for the ordering: location,
scale and shape. Should the ordering
of the parameters be in this order?
Almost all VGAM family functions
have this order by default, but in order
to match the arguments of existing R
functions, one might need to set
lss = FALSE
. For example, the arguments of
weibullR
are scale and shape,
whereas rweibull
are
shape and scale. As a temporary measure
(from VGAM 0.9-7 onwards but prior to
version 1.0-0), some family functions such
as sinmad
have an lss
argument without a default. For these,
setting lss = FALSE
will work.
Later, lss = TRUE
will be the default.
Be careful for the dpqr
-type functions,
e.g., rsinmad
.
Thresholds is another name for the
intercepts, e.g., for categorical models.
They may be constrained by functions such as
CM.equid
and
CM.qnorm
.
The string "CM."
and
the Thresh
argument is pasted and
then that function is called to obtain the
constraint matrix for the intercept term.
So
Thresh = "free"
,
Thresh = "equid"
,
Thresh = "qnorm"
,
Thresh = "symm0"
,
Thresh = "symm1"
etc.
are possibilities.
Families that use this include
multinomial
,
cratio
,
sratio
,
cumulative
,
acat
.
Logical.
Should white spaces (" "
) be used in the
labelling of the linear/additive predictors?
Setting TRUE
usually results in more
readability but it occupies more columns of
the output.
Logical.
Should the observed information matrices
(OIMs) be used for the working weights?
In general, setting oim = TRUE
means
the Newton-Raphson algorithm,
and oim = FALSE
means Fisher-scoring.
The latter uses the EIM, and is usually recommended.
If oim = TRUE
then nsimEIM
is ignored.
Numeric, a value in \([0, 1]\).
Experimental argument for allowing a mixture
of Newton-Raphson and Fisher scoring.
The working weights are taken as a linear
combination of the two. If nrfs = 0
then Newton-Raphson is used, i.e., the OIM
is wholly used. If nrfs = 1
then
Fisher scoring is used, i.e., the EIM is
wholly used. If convergence is successful
then one might expect Newton-Raphson to
be faster than Fisher scoring because the
former has an order-2 convergence rate
while the latter has a linear rate.
,
Logical.
Some VGAM family functions allow
a multivariate or vector response.
If so, then usually the response is a
matrix with columns corresponding to the
individual response variables. They are
all fitted simultaneously. Arguments such
as parallel
may then be useful
to allow for relationships between the
regressions of each response variable.
If multiple.responses = TRUE
then
sometimes the response is interpreted
differently, e.g., posbinomial
chooses the first column of a matrix response
as success and combines the other columns as
failure, but when
multiple.responses = TRUE
then each column of the response
matrix is the number of successes and the
weights
argument is of the same
dimension as the response and contains the
number of trials.
This argument should be generally ignored.
Logical.
Some VGAM family functions that handle
multiple responses have arguments that allow
input to be fed in which affect all the
responses, e.g., imu
for initalizing
a mu
parameter. In such cases it is
sometime more convenient to input one value
per response by setting byrow.arg = TRUE
;
then values are recycled in order to
form a matrix of the appropriate dimension.
This argument matches byrow
in
matrix
; in fact it is
fed into such using
matrix(..., byrow = byrow.arg)
.
This argument has no effect when there is
one response.
Shrinkage factor \(s\) used for obtaining
initial values.
Numeric, between 0 and 1.
In general, the formula used is something like
\(s \mu + (1-s) y\)
where \(\mu\) is a measure of
central tendency such as a weighted mean or
median, and \(y\) is the response vector.
For example, the initial values are slight
perturbations of the mean towards the actual
data. For many types of models this method
seems to work well and is often reasonably
robust to outliers in the response. Often
this argument is only used if the argument
imethod
is assigned a certain value.
An integer-valued vector specifying
which linear/additive predictors have no
intercepts. Any values must be from the
set {1,2,...,\(M\)}. A value of
NULL
means no such constraints.
Logical.
Some VGAM family functions will allow
bias-reduction based on the work by Kosmidis
and Firth. Sometimes half-stepping is a good
idea; set stepsize = 0.5
and monitor
convergence by setting trace = TRUE
.
Some VGAM family functions
(such as normal.vcm
)
implement models with
potentially lots of parameter link functions.
These two arguments allow many such links and extra arguments
to be inputted more easily.
One has something like
link.list = list
("(Default)" = "identitylink", x2 = "loglink", x3 = "logofflink")
and
earg.list = list
("(Default)" = list(), x2 = list(), x3 = "list(offset = -1)")
.
Then any unnamed terms will have the default
link with its corresponding extra argument.
Note: the multilogitlink
link
is also possible, and if so, at least two
instances of it are necessary. Then the last
term is the baseline/reference group.
The zero
argument is supplied for
convenience but conflicts can arise with other
arguments, e.g., the constraints
argument of vglm
and
vgam
. See Example 5 below
for an example. If not sure, use, e.g.,
constraints(fit)
and
coef(fit, matrix = TRUE)
to check the result of a fit fit
.
The arguments zero
and
nointercept
can be inputted with values
that fail. For example,
multinomial(zero = 2, nointercept = 1:3)
means the second linear/additive predictor is
identically zero, which will cause a failure.
Be careful about the use of other
potentially contradictory constraints, e.g.,
multinomial(zero = 2, parallel = TRUE ~ x3)
.
If in doubt, apply constraints()
to the fitted object to check.
VGAM family functions with the
nsimEIM
may have inaccurate working
weight matrices. If so, then the standard
errors of the regression coefficients
may be inaccurate. Thus output from
summary(fit)
, vcov(fit)
,
etc. may be misleading.
Changes relating to the lss
argument
have very important consequences and users
must beware. Good programming style is
to rely on the argument names and not on
the order.
T. W. Yee
Full details will be given in documentation
yet to be written, at a later date!
A general recommendation is to set
trace = TRUE
whenever any model fitting
is done, since monitoring convergence is
usually very informative.
Yee, T. W. (2015). Vector Generalized Linear and Additive Models: With an Implementation in R. New York, USA: Springer.
Kosmidis, I. and Firth, D. (2009). Bias reduction in exponential family nonlinear models. Biometrika, 96, 793--804.
Miranda-Soberanis, V. F. and Yee, T. W. (2019). New link functions for distribution--specific quantile regression based on vector generalized linear and additive models. Journal of Probability and Statistics, 5, 1--11.
Links
,
vglm
,
vgam
,
vglmff-class
,
UtilitiesVGAM
,
multilogitlink
,
multinomial
,
VGAMextra.
# Example 1
cumulative()
cumulative(link = "probitlink", reverse = TRUE, parallel = TRUE)
# Example 2
wdata <- data.frame(x2 = runif(nn <- 1000))
wdata <- transform(wdata,
y = rweibull(nn, shape = 2 + exp(1 + x2), scale = exp(-0.5)))
fit <- vglm(y ~ x2, weibullR(lshape = logofflink(offset = -2), zero = 2),
data = wdata)
coef(fit, mat = TRUE)
# Example 3; multivariate (multiple) response
if (FALSE) {
ndata <- data.frame(x = runif(nn <- 500))
ndata <- transform(ndata,
y1 = rnbinom(nn, exp(1), mu = exp(3+x)), # k is size
y2 = rnbinom(nn, exp(0), mu = exp(2-x)))
fit <- vglm(cbind(y1, y2) ~ x, negbinomial(zero = -2), ndata)
coef(fit, matrix = TRUE)
}
# Example 4
if (FALSE) {
# fit1 and fit2 are equivalent
fit1 <- vglm(ymatrix ~ x2 + x3 + x4 + x5,
cumulative(parallel = FALSE ~ 1 + x3 + x5), cdata)
fit2 <- vglm(ymatrix ~ x2 + x3 + x4 + x5,
cumulative(parallel = TRUE ~ x2 + x4), cdata)
}
# Example 5
udata <- data.frame(x2 = rnorm(nn <- 200))
udata <- transform(udata,
x1copy = 1, # Copy of the intercept
x3 = runif(nn),
y1 = rnorm(nn, 1 - 3*x2, sd = exp(1 + 0.2*x2)),
y2 = rnorm(nn, 1 - 3*x2, sd = exp(1)))
args(uninormal)
fit1 <- vglm(y1 ~ x2, uninormal, udata) # This is okay
fit2 <- vglm(y2 ~ x2, uninormal(zero = 2), udata) # This is okay
fit4 <- vglm(y2 ~ x2 + x1copy + x3,
uninormal(zero = NULL, vfl = TRUE,
Form2 = ~ x1copy + x3 - 1), udata)
coef(fit4, matrix = TRUE) # VFL model
# This creates potential conflict
clist <- list("(Intercept)" = diag(2), "x2" = diag(2))
fit3 <- vglm(y2 ~ x2, uninormal(zero = 2), data = udata,
constraints = clist) # Conflict!
coef(fit3, matrix = TRUE) # Shows that clist[["x2"]] was overwritten,
constraints(fit3) # i.e., 'zero' seems to override the 'constraints' arg
# Example 6 ('whitespace' argument)
pneumo <- transform(pneumo, let = log(exposure.time))
fit1 <- vglm(cbind(normal, mild, severe) ~ let,
sratio(whitespace = FALSE, parallel = TRUE), pneumo)
fit2 <- vglm(cbind(normal, mild, severe) ~ let,
sratio(whitespace = TRUE, parallel = TRUE), pneumo)
head(predict(fit1), 2) # No white spaces
head(predict(fit2), 2) # Uses white spaces
# Example 7 ('zero' argument with character input)
set.seed(123); n <- 1000
ldata <- data.frame(x2 = runif(n))
ldata <- transform(ldata, y1 = rlogis(n, loc = 5*x2, scale = exp(2)))
ldata <- transform(ldata, y2 = rlogis(n, loc = 5*x2, scale = exp(1*x2)))
ldata <- transform(ldata, w1 = runif(n))
ldata <- transform(ldata, w2 = runif(n))
fit7 <- vglm(cbind(y1, y2) ~ x2,
# logistic(zero = "location1"), # location1 is intercept-only
# logistic(zero = "location2"),
# logistic(zero = "location*"), # Not okay... all is unmatched
# logistic(zero = "scale1"),
# logistic(zero = "scale2"),
# logistic(zero = "scale"), # Both scale parameters are matched
logistic(zero = c("location", "scale2")), # All but scale1
# logistic(zero = c("LOCAT", "scale2")), # Only scale2 is matched
# logistic(zero = c("LOCAT")), # Nothing is matched
# trace = TRUE,
# weights = cbind(w1, w2),
weights = w1,
data = ldata)
coef(fit7, matrix = TRUE)
Run the code above in your browser using DataLab