DirichReg(formula, data, model = c("common", "alternative"),
subset, sub.comp, base, weights, control, verbosity = 0)
data: a data.frame containing independent and dependent variables
model: whether the "common" ($\alpha$s) or "alternative" ($\mu/\phi$) parametrization is employed (see below)

The fitted object contains the following elements:
[language] function call
[character] used parametrization
[character] components' names
[numeric] vector with the number of parameters per set of predictors
[numeric] number of components
[numeric] used components
[numeric list] sets of predictors
[numeric list] sets of predictors (only for the alternative parametrization)
[numeric] vector of single components
[numeric] base (only for the alternative parametrization)
[numeric] vector of frequency weights
[DirichletRegData] the original response
[data.frame] original data
[data.frame] used data
[Formula] expanded formula
[language] expression for generating the model frame
[numeric] number of parameters
[numeric] named vector of parameters
[character] names of the parameters
[list of matrices] list containing alpha's, mu's, phi's for the observations
[numeric] the log-likelihood
[matrix] covariance-matrix of parameter estimates
[matrix] (observed) Hessian
[numeric] vector of standard errors
[list] contains details about the optimization process provided by maxBFGS and maxNR
formula determines the used predictors. The responses must be prepared by DR_data and can optionally be stored in the object containing all covariates, which is then specified as the argument data. (Although DR_data in a formula works, it is only intended for testing purposes and may be removed at any time; use at your own risk.)
There are two different parametrizations (controlled by the argument model, see below): the "common" ($\alpha$s) and the "alternative" ($\mu/\phi$) parametrization.
For the "common" parametrization, if DV is the dependent variable, the simplest (intercept-only) model is DV ~ 1. We always have at least two dependent variables, so simple formulae such as the one given above will be expanded, for example to DV ~ 1 | 1 | 1 if DV has three components. Likewise, it is possible to specify a common set of predictors for all components, as in DV ~ p1 * p2, where p1 and p2 are predictors.
If the covariates of the components shall differ, one has to set up a complete formula for each component, using | as separators between the components. For example, DV ~ p1 | p1 + p2 | p1 * p2 will lead to a model where the first response in DV is modeled using p1, the second is predicted by p1 + p2, and the third by p1 * p2. Note that if you use the latter approach, the predictors have to be stated explicitly for all response variables.
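To make the per-component formula DV ~ p1 | p1 + p2 | p1 * p2 more concrete, the following base R sketch builds one design matrix per right-hand side. It only illustrates how the parts differ in their columns; it is not the code DirichReg uses internally, and the toy data frame d is an assumption made up for this example.

```r
# Hypothetical toy covariates; p1 and p2 mirror the names used in the text.
d <- data.frame(p1 = c(0.1, 0.4, 0.7, 1.0),
                p2 = c(2, 1, 4, 3))

# One right-hand side per component, as in DV ~ p1 | p1 + p2 | p1 * p2.
rhs <- list(~ p1, ~ p1 + p2, ~ p1 * p2)

# Base R's model.matrix() expands each part into its own design matrix.
X <- lapply(rhs, model.matrix, data = d)

# Number of columns (i.e., coefficients) per component.
n_par <- vapply(X, ncol, integer(1))
print(lapply(X, colnames))
print(n_par)
```

Each component therefore gets its own set of coefficients, which is why the predictors must be written out for every part once | is used.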
For the "alternative" parametrization, the simplest model is DV ~ 1, which is expanded to DV ~ 1 | 1. Note that you need to set model = "alternative" to use this parametrization!
The alternative parametrization consists of two parts: modeled expected values ($\mu$) and their precision ($\phi$). One component serves as the base (see the base argument in DR_data or DirichReg) and for the rest a set of predictors is used with a multinomial logit-link. For precisions, a different set of predictors can be set up using a log-link.
DV ~ p1 * p2 | p1 + p2 will set up a model where the expected values are predicted by p1 * p2 and the precisions are modeled using p1 + p2.
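The two links of the alternative parametrization can be sketched in base R. The linear predictors below are made-up numbers, not output of DirichReg, and the first component is assumed to be the base. The last step uses the standard relation between the Dirichlet parameters and the mean/precision form, alpha = mu * phi.

```r
# Hypothetical linear predictors for 2 observations and 3 components.
# The base component (here the first) has its predictor fixed at 0.
eta_mu <- cbind(base = c(0, 0),
                c2   = c(0.5, -1.0),
                c3   = c(1.2,  0.3))
eta_phi <- c(1.0, 2.0)   # linear predictor for the precision

# Multinomial logit link: expected values are positive and sum to 1 per row.
mu <- exp(eta_mu) / rowSums(exp(eta_mu))

# Log link: the precision is strictly positive.
phi <- exp(eta_phi)

# Dirichlet parameters implied by mu and phi.
alpha <- mu * phi

print(round(mu, 3))
print(rowSums(alpha))    # equals phi
```

Fixing the base predictor at 0 is what makes the remaining coefficients identifiable, as in multinomial logistic regression.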
The data argument accepts a data.frame that must include the dependent variable as a named element (see the examples for how to do this).
The base component is set by DR_data, but can easily be changed using the argument base, which takes integer values from 1 to the maximum number of components.
If a data set contains a large number of components, of which only a few are relevant, the latter can be selected with the argument sub.comp and the remaining components are aggregated. The positioning of variables will necessarily change: the aggregated variable takes the first column and the others are appended in their order of selection.
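The reordering described above can be illustrated with a small base R sketch. The helper subcompose() and the column name "other" are hypothetical and chosen only for this illustration; the package's own processing may differ in details.

```r
# Hypothetical composition with 5 components (each row sums to 1).
comp <- matrix(c(0.10, 0.20, 0.30, 0.25, 0.15,
                 0.05, 0.40, 0.20, 0.10, 0.25),
               nrow = 2, byrow = TRUE,
               dimnames = list(NULL, paste0("v", 1:5)))

# Hypothetical helper: keep the selected components and aggregate the rest.
subcompose <- function(x, keep) {
  rest <- rowSums(x[, -keep, drop = FALSE])
  # Aggregated column first, selected columns in their order of selection.
  cbind(other = rest, x[, keep, drop = FALSE])
}

sub <- subcompose(comp, keep = c(4, 2))
print(sub)
```

Because the aggregated column absorbs everything that was not selected, each row of the result still sums to 1.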
Using subset, the model can be fitted to only a part of the data; for more information about this functionality, see subset.
Note that, unlike in glm, weights are not treated as prior weights, but as frequency weights!
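What "frequency weights" means can be sketched in base R, assuming the usual definition in which each observation's log-likelihood contribution is multiplied by its weight. The data, the weights, and the helper ddirichlet_log() are made up for this illustration and are not part of the package.

```r
# Log-density of a Dirichlet distribution, one value per row of y.
ddirichlet_log <- function(y, alpha) {
  lgamma(sum(alpha)) - sum(lgamma(alpha)) +
    as.vector(log(y) %*% (alpha - 1))
}

# Hypothetical data: 3 compositions with frequency weights 2, 1, 3.
y <- rbind(c(0.2, 0.3, 0.5),
           c(0.6, 0.1, 0.3),
           c(0.3, 0.3, 0.4))
w <- c(2, 1, 3)
alpha <- c(1.5, 2.0, 2.5)

# Weighted log-likelihood: each row's contribution is multiplied by its weight.
ll_weighted <- sum(w * ddirichlet_log(y, alpha))

# The same value is obtained by repeating each row w[i] times.
y_rep <- y[rep(seq_len(nrow(y)), times = w), , drop = FALSE]
ll_replicated <- sum(ddirichlet_log(y_rep, alpha))

print(c(ll_weighted, ll_replicated))
```

Under this definition, a weight of 3 is equivalent to the observation appearing three times in the data.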
Using the control argument, the settings passed to the optimizers can be altered.
This argument takes a named list.
To supply user-defined starting values, use control = list(sv = c(...)) and supply a vector containing initial values for all parameters.
Optimizer-specific options include the number of iterations (iterlim = 1000) and convergence criteria for the BFGS- and NR-optimization (tol1 = 1e-5 and tol2 = 1e-10).
Verbosity takes integer values from 0 to 4:
0: no information is printed (default).
1: prints information about 3 stages (preparation, starting values, estimation).
2: prints little information about optimization (verbosity values greater than one are passed to print.default = verbosity - 1 of maxBFGS and maxNR).
3: prints more information about optimization.
4: prints all information about optimization.
library(DirichletReg)
ALake <- ArcticLake
ALake$Y <- DR_data(ALake[,1:3])
# fit a quadratic Dirichlet regression model ("common")
res1 <- DirichReg(Y ~ depth + I(depth^2), ALake)
# fit a Dirichlet regression with quadratic predictor for the mean and
# a linear predictor for precision ("alternative")
res2 <- DirichReg(Y ~ depth + I(depth^2) | depth, ALake, model="alternative")
# test both models
anova(res1, res2)
res1
summary(res2)