Fits a binary or ordinal logistic model for a given design matrix and response vector with no missing values in either. Ordinary or penalized maximum likelihood estimation is used.
lrm.fit(x, y, offset=0, initial, est, maxit=12, eps=.025,
tol=1e-7, trace=FALSE, penalty.matrix=NULL, weights=NULL,
normwt=FALSE, scale=FALSE)
a list with the following components:
calling expression
table of frequencies for y in order of increasing y
vector with the following elements: number of observations used in the fit, maximum absolute value of first derivative of log likelihood, model likelihood ratio chi-square, d.f., P-value, \(c\) index (area under ROC curve), Somers' \(D_{xy}\), Goodman-Kruskal \(\gamma\), and Kendall's \(\tau_a\) rank correlations between predicted probabilities and observed response, the Nagelkerke \(R^2\) index, 4 indexes computed by R2Measures, the Brier probability score with respect to computing the probability that \(y >\) the mid level less one, the \(g\)-index, \(gr\) (the \(g\)-index on the odds ratio scale), and \(gp\) (the \(g\)-index on the probability scale using the same cutoff used for the Brier score). Probabilities are rounded to the nearest 0.002 in the computations of rank correlation indexes. When penalty.matrix is present, the \(\chi^2\), d.f., and P-value are not corrected for the effective d.f.
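For a binary response, a few of these indexes are easy to reproduce by hand. The sketch below uses base R only and is not the code lrm.fit runs: the helper name c_index and the toy vectors are made up for illustration, and the rounding of probabilities to the nearest 0.002 is not applied.

```r
# Hypothetical helper: c index (ROC area) via the Wilcoxon rank-sum identity.
c_index <- function(p, y) {
  r  <- rank(p)                 # midranks handle tied predictions
  n1 <- sum(y == 1)             # number of events
  n0 <- sum(y == 0)             # number of non-events
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

p <- c(0.10, 0.40, 0.35, 0.80)  # toy predicted probabilities
y <- c(0, 0, 1, 1)              # toy binary outcomes

c_stat <- c_index(p, y)         # 3 of the 4 event/non-event pairs are concordant
dxy    <- 2 * (c_stat - 0.5)    # Somers' Dxy for a binary response
brier  <- mean((p - y)^2)       # Brier probability score
```

Here c_stat is 0.75 and dxy is 0.5, which shows the relation \(D_{xy} = 2(c - 0.5)\) between the two rank measures.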
set to TRUE if convergence failed (and maxit>1)
estimated parameters
estimated variance-covariance matrix (inverse of information matrix). Note that in the case of penalized estimation, var is not the improved sandwich-type estimator (which lrm does compute).
vector of first derivatives of log-likelihood
-2 log likelihoods. When an offset variable is present, three deviances are computed: for intercept(s) only, for intercepts+offset, and for intercepts+offset+predictors. When there is no offset variable, the vector contains deviances for the intercept(s)-only model and the model with intercept(s) and predictors.
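The model likelihood ratio chi-square is the drop in -2 log likelihood between the intercept-only model and the full model. The sketch below illustrates that arithmetic with base R's glm as a stand-in for lrm.fit, on simulated data; for an ungrouped 0/1 response the glm deviance equals the -2 log likelihood. The variable names are illustrative only.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))     # simulated binary response

null_fit <- glm(y ~ 1, family = binomial)   # intercept only
full_fit <- glm(y ~ x, family = binomial)   # intercept + predictor

# Two-element vector of -2 log likelihoods, mirroring the no-offset case
dev <- c(deviance(null_fit), deviance(full_fit))

lr_chisq <- dev[1] - dev[2]                 # model likelihood ratio chi-square
p_value  <- pchisq(lr_chisq, df = 1, lower.tail = FALSE)
```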
vector of column numbers of x fitted (intercepts are not counted)
number of intercepts in model
see above
design matrix with no column for an intercept
response vector, numeric, categorical, or character
optional numeric vector containing an offset on the logit scale
vector of initial parameter estimates, beginning with the intercept
indexes of x to fit in the model (default is all columns of x). Specifying est=c(1,2,5) causes columns 1, 2, and 5 to have parameters estimated. The score vector u and covariance matrix var can be used to obtain score statistics for other columns.
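The score statistic for the omitted columns has the form \(u^{T} V u\), where \(u\) is the score vector and \(V\) the inverse information matrix, both evaluated with the omitted slopes held at zero. The sketch below works this out in base R on simulated data, using glm for the restricted fit. It is an assumption-laden illustration of the statistic, not a description of how lrm.fit lays out u and var internally.

```r
set.seed(2)
n <- 300
x <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- rbinom(n, 1, plogis(-0.3 + 0.8 * x[, "x1"]))

# Restricted fit: only column 1 is estimated (analogous to est = 1)
restricted <- glm(y ~ x[, 1], family = binomial)
beta0 <- c(coef(restricted), 0, 0)          # omitted slopes held at zero

X1   <- cbind(1, x)                         # full design with intercept
p    <- plogis(drop(X1 %*% beta0))
u    <- drop(crossprod(X1, y - p))          # score vector at restricted estimates
info <- crossprod(X1 * (p * (1 - p)), X1)   # information matrix X'WX
v    <- solve(info)                         # covariance matrix

score_chisq <- drop(t(u) %*% v %*% u)       # score chi-square, 2 d.f. (columns 2 and 3)
score_p     <- pchisq(score_chisq, df = 2, lower.tail = FALSE)
```

The first two elements of u are essentially zero because those parameters are at their restricted maximum likelihood estimates; only the omitted columns contribute to the statistic.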
maximum number of iterations (default=12). Specifying maxit=1 causes logist to compute statistics at initial estimates.
difference in \(-2 \log\) likelihood for declaring convergence. Default is .025. If the \(-2 \log\) likelihood gets worse by eps/10 while the maximum absolute first derivative of \(-2 \log\) likelihood is below 1e-9, convergence is still declared. This handles the case where the initial estimates are MLEs, to prevent endless step-halving.
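To make the role of eps and step-halving concrete, here is a minimal Newton-Raphson loop for a binary logistic model in base R. It is a sketch, not lrm.fit's source: the function name newton_logistic, the exact step-halving rule, and the halving limit of 1e-4 are all assumptions chosen for illustration.

```r
# Hypothetical helper: Newton-Raphson with step-halving for binary logistic regression.
newton_logistic <- function(X, y, eps = 0.025, maxit = 12) {
  X1   <- cbind(1, X)                       # add intercept column
  beta <- rep(0, ncol(X1))
  m2ll <- function(b) {                     # -2 log likelihood
    eta <- drop(X1 %*% b)
    -2 * sum(y * eta - log1p(exp(eta)))
  }
  old <- m2ll(beta)
  for (iter in seq_len(maxit)) {
    p    <- plogis(drop(X1 %*% beta))
    u    <- crossprod(X1, y - p)            # first derivatives of log likelihood
    info <- crossprod(X1 * (p * (1 - p)), X1)
    step <- drop(solve(info, u))
    frac <- 1
    new  <- m2ll(beta + step)
    # Halve the step while -2 log likelihood gets worse by more than eps / 10
    while (new > old + eps / 10 && frac > 1e-4) {
      frac <- frac / 2
      new  <- m2ll(beta + frac * step)
    }
    beta <- beta + frac * step
    converged <- abs(old - new) < eps       # change in -2 log likelihood below eps
    old <- new
    if (converged) {
      return(list(coefficients = beta, iterations = iter, fail = FALSE))
    }
  }
  list(coefficients = beta, iterations = maxit, fail = TRUE)
}

set.seed(5)
X <- matrix(rnorm(400), 200, 2)
y <- rbinom(200, 1, plogis(0.3 + X[, 1] - 0.5 * X[, 2]))
fit <- newton_logistic(X, y)                # default eps = 0.025, maxit = 12
```

A looser eps stops earlier with slightly less precise estimates; tightening it costs an extra iteration or two.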
Singularity criterion. Default is 1e-7
set to TRUE to print -2 log likelihood, step-halving fraction, change in -2 log likelihood, maximum absolute value of first derivative, and vector of first derivatives at each iteration.
a self-contained ready-to-use penalty matrix - see lrm
a vector (same length as y) of possibly fractional case weights
set to TRUE to scale weights so they sum to the length of y; useful for sample surveys as opposed to the default of frequency weighting
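The rescaling described here amounts to multiplying each weight by length(y) / sum(weights). A minimal sketch with made-up survey weights (the variable names are illustrative):

```r
w <- c(0.5, 2, 1.5, 4)                  # toy sampling weights
y <- c(0, 1, 1, 0)                      # toy response

w_norm <- w * length(y) / sum(w)        # rescale so the weights sum to length(y)
total  <- sum(w_norm)                   # equals 4, the number of observations
```

The relative weighting of observations is unchanged; only the apparent sample size is, which matters for standard errors and chi-square statistics.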
set to TRUE to subtract column means and divide by column standard deviations of x before fitting, and to back-solve for the un-normalized covariance matrix and regression coefficients. This can sometimes make the model converge for very large sample sizes where, for example, spline or polynomial component variables create scaling problems leading to loss of precision when accumulating sums of squares and crossproducts.
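The back-solving step can be sketched in base R: fit on standardized columns, then map the coefficients and covariance matrix back to the original scale with a linear transformation. This uses glm as a stand-in for lrm.fit and simulated data; the variable names and the matrix A are illustrative, not taken from the rms source.

```r
set.seed(3)
n <- 500
x <- cbind(age = rnorm(n, 60, 10), sbp = rnorm(n, 130, 20))
y <- rbinom(n, 1, plogis(-8 + 0.05 * x[, "age"] + 0.04 * x[, "sbp"]))

m <- colMeans(x)                           # column means
s <- apply(x, 2, sd)                       # column standard deviations
z <- sweep(sweep(x, 2, m), 2, s, "/")      # standardized design matrix

fit_z <- glm(y ~ z, family = binomial)
b_z   <- coef(fit_z)

# Coefficients on the original scale: b = A %*% b_z
A <- rbind(c(1, -m / s),                   # intercept absorbs the centering
           cbind(0, diag(1 / s)))          # slopes are divided by the SDs
b_back <- drop(A %*% b_z)
v_back <- A %*% vcov(fit_z) %*% t(A)       # covariance on the original scale

fit_direct <- glm(y ~ x, family = binomial)   # direct fit, for comparison
```

Up to convergence tolerance, b_back and v_back agree with the coefficients and covariance matrix of fit_direct.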
Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com
#Fit an additive logistic model containing numeric predictors age,
#blood.pressure, and sex, assumed to be already properly coded and
#transformed
#
# fit <- lrm.fit(cbind(age,blood.pressure,sex), death)
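The example above assumes that age, blood.pressure, sex, and death already exist. A self-contained variant with simulated data might look like the following; the simulated coefficients are arbitrary, and the call is guarded so the script still runs when the rms package is not installed.

```r
set.seed(4)
n <- 200
age            <- rnorm(n, 60, 10)
blood.pressure <- rnorm(n, 130, 20)
sex            <- rbinom(n, 1, 0.5)        # already coded 0/1
death <- rbinom(n, 1,
                plogis(-6 + 0.05 * age + 0.02 * blood.pressure + 0.5 * sex))

x <- cbind(age, blood.pressure, sex)       # design matrix, no intercept column

if (requireNamespace("rms", quietly = TRUE)) {
  fit <- rms::lrm.fit(x, death)
  print(fit$coefficients)                  # intercept followed by three slopes
  print(fit$stats)                         # summary statistics vector
}
```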