Fits ordinal cumulative probability models for continuous or ordinal
response variables, efficiently allowing for a large number of
intercepts by capitalizing on the information matrix being sparse.
Five different distribution functions are implemented, with the
default being the logistic (yielding the proportional odds
model). Penalized estimation and weights are also implemented, as in `[lrm.fit()]`.
The optimization method is Newton-Raphson with step-halving, or the Levenberg-Marquart method.
The latter has been shown to converge better when there are large offsets.
Execution time is is fast even for hundreds of thousands of intercepts. The limiting factor
is the number of intercepts times the number of columns of x
.
orm.fit(x=NULL, y, family=c("logistic","probit","loglog","cloglog","cauchit"),
offset, initial, opt_method=c('NR', 'LM'),
maxit=30L, eps=5e-4, gradtol=0.001, abstol=1e10,
minstepsize=0.01, tol=.Machine$double.eps, trace=FALSE,
penalty.matrix=NULL, weights=NULL, normwt=FALSE, scale=FALSE,
inclpen=TRUE, y.precision = 7, compstats=TRUE)
a list with the following components, not counting all the components produced by `orm.fit`:
calling expression
table of frequencies for y
in order of increasing y
vector of sorted unique values of y
vector with the following elements: number of observations used in the
fit, number of unique y
values, median y
from among the
observations used in the fit, maximum absolute value of first
derivative of log likelihood, model likelihood ratio chi-square, d.f.,
P-value, score chi-square and its P-value, Spearman's \(\rho\) rank
correlation between linear predictor and y
, the
Nagelkerke \(R^2\) index, the \(g\)-index, \(gr\) (the
\(g\)-index on the ratio scale), and \(pdm\) (the mean absolute
difference between 0.5 and the estimated probability that \(y\geq\)
the marginal median).
When penalty.matrix
is present, the \(\chi^2\),
d.f., and P-value are not corrected for the effective d.f.
set to TRUE
if convergence failed (and maxit>1
)
estimated parameters
see orm
-2 log likelihoods. When an offset variable is present, three deviances are computed: for intercept(s) only, for intercepts+offset, and for intercepts+offset+predictors. When there is no offset variable, the vector contains deviances for the intercept(s)-only model and the model with intercept(s) and predictors.
number of intercepts in model
the index of the middle (median) intercept used in
computing the linear predictor and var
the linear predictor using the first intercept
see above
see orm
design matrix with no column for an intercept
response vector, numeric, factor, or character. The ordering of levels
is assumed from factor(y)
.
a character value specifying the distribution family, corresponding to logistic (the
default), Gaussian, Cauchy, Gumbel maximum (\(exp(-exp(-x))\);
extreme value type I), and Gumbel minimum
(\(1-exp(-exp(x))\)) distributions. These are the cumulative
distribution functions assumed for \(Prob[Y \ge y | X]\). The
family
argument can be an unquoted or a quoted string,
e.g. family=loglog
or family="loglog"
. To use
a built-in family, the string must be one of the following
corresponding to the previous list: logistic, probit, loglog,
cloglog, cauchit
.
optional numeric vector containing an offset on the logit scale
vector of initial parameter estimates, beginning with the
intercepts. If initial
is not specified, the function computes
the overall score \(\chi^2\) test for the global null hypothesis of
no regression.
set to "LM"
to use Levenberg-Marquardt instead of the default Newton-Raphson
maximum no. iterations (default=30
).
difference in \(-2 log\) likelihood for declaring convergence.
Default is .0005
. This handles the case where the initial estimates are MLEs,
to prevent endless step-halving.
maximum absolute gradient before convergence can be declared. gradtol
is automatically scaled by n / 1000 since the gradient is proportional to the sample size.
maximum absolute change in parameter estimates from one iteration to the next before convergence can be declared; by default has no effect
used to specify when to abandon step-halving
Singularity criterion. Default is typically 2e-16
set to TRUE
to print -2 log likelihood, step-halving
fraction, change in -2 log likelihood, maximum absolute value of first
derivative, and max absolute change in parameter estimates at each iteration.
a self-contained ready-to-use penalty matrix - seelrm
a vector (same length as y
) of possibly fractional case weights
set to TRUE
to scale weights
so they sum to \(n\), the length of y
; useful for sample surveys as opposed to the default of frequency weighting
set to TRUE
to subtract column means and divide by
column standard deviations of x
before fitting, and to back-solve for the un-normalized covariance
matrix and regression coefficients. This can sometimes make the model
converge for very large
sample sizes where for example spline or polynomial component
variables create scaling problems leading to loss of precision when
accumulating sums of squares and crossproducts.
set to FALSE
to not include the penalty matrix in the Hessian when the Hessian is being computed on transformed x
, vs. adding the penalty after back-transforming. This should not matter.
When ‘y’ is numeric, values may need to be rounded
to avoid unpredictable behavior with unique()
with floating-point
numbers. Default is to 7 decimal places.
set to FALSE
to prevent the calculation of the vector of model statistics
Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com
#Fit an additive logistic model containing numeric predictors age,
#blood.pressure, and sex, assumed to be already properly coded and
#transformed
#
# fit <- orm.fit(cbind(age,blood.pressure,sex), death)
Run the code above in your browser using DataLab