Implements Firth's bias-reduced penalized-likelihood logistic regression.
logistf(
  formula,
  data,
  pl = TRUE,
  alpha = 0.05,
  control,
  plcontrol,
  modcontrol,
  firth = TRUE,
  init,
  weights,
  na.action,
  offset,
  plconf = NULL,
  flic = FALSE,
  model = TRUE,
  ...
)
The object returned is of class logistf and has the following attributes:
the coefficients of the parameters in the fitted model.
the significance level (1 minus the confidence level) as specified in the input.
the column names of the design matrix
the variance-covariance-matrix of the parameters.
the number of degrees of freedom in the model.
a vector of the (penalized) log-likelihood of the restricted and the full models.
A vector of the number of iterations needed in the fitting process for the null and full model.
the number of observations.
the response vector, i.e. 1 for successes (events) and 0 for failures.
the formula object.
the call object.
the model terms (column names of design matrix).
a vector with the linear predictor of each observation.
a vector with the predicted probability of each observation.
a vector with the diagonal elements of the Hat Matrix.
the convergence status at last iteration: a vector of length 3 with elements: last change in log likelihood, max(abs(score vector)), max change in beta at last iteration.
the fitting method, depending on the input: 'Penalized ML' or 'Standard ML'.
the method used to calculate the confidence intervals (method.ci), i.e. 'profile likelihood' or 'Wald', depending on the arguments pl and plconf.
the lower confidence limits of the parameter.
the upper confidence limits of the parameter.
the p-values of the specific parameters.
only if pl==TRUE: the number of iterations needed for each confidence limit.
only if pl==TRUE: the complete history of beta estimates for each confidence limit.
only if pl==TRUE: the convergence status (deviation of log likelihood from target value, last maximum change in beta) for each confidence limit.
a copy of the control parameters.
a copy of the modcontrol parameters.
logical; TRUE if the intercept was altered such that the predicted probabilities become unbiased while keeping all other coefficients constant. Reflects the flic argument passed to logistf.
if requested (the default), the model frame used.
information returned by model.frame on the special handling of NAs
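
The confidence limits and p-values listed above are Wald-based when pl=FALSE. As a rough illustration of what a Wald interval is, the following sketch computes one by hand in base R. It uses glm() on made-up toy data as a stand-in for a fitted model; the variable names (x, y, wald, p_wald) are assumptions for this example and are not components of a logistf object.

```r
# Toy data, made up for illustration only
x <- c(-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5)
y <- c(0, 0, 1, 0, 0, 1, 0, 1, 1, 1)

# Ordinary ML fit used as a stand-in for any fitted logistic model
fit <- glm(y ~ x, family = binomial)

alpha <- 0.05                      # significance level
est   <- coef(fit)                 # point estimates
se    <- sqrt(diag(vcov(fit)))     # standard errors from the covariance matrix
z     <- qnorm(1 - alpha / 2)      # normal quantile for a two-sided interval

# Wald limits: estimate +/- quantile * standard error
wald <- cbind(lower = est - z * se, upper = est + z * se)

# Two-sided Wald p-values
p_wald <- 2 * pnorm(-abs(est / se))

wald
p_wald
```

Profile likelihood intervals (pl=TRUE) are instead found by searching for the parameter values at which the (penalized) log-likelihood drops by a fixed amount, so they need not be symmetric around the estimate.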
formula: A formula object, with the response on the left of the ~ operator and the model terms on the right. The response must be a vector of 0 and 1 or FALSE and TRUE for the outcome, where the higher value (1 or TRUE) is modeled. It is possible to include contrasts, interactions, nested effects, cubic or polynomial splines and all S features as well, e.g. Y ~ X1*X2 + ns(X3, df=4).
data: A data.frame where the variables named in the formula can be found, i.e. the variables containing the binary response and the covariates.
pl: Specifies if confidence intervals and tests should be based on the profile penalized log likelihood (pl=TRUE, the default) or on the Wald method (pl=FALSE).
alpha: The significance level \(\alpha\) (the confidence level is 1-\(\alpha\)). Default is 0.05.
control: Controls the iteration parameters. Default is control=logistf.control().
plcontrol: Controls the Newton-Raphson iteration for the estimation of the profile likelihood confidence intervals. Default is plcontrol=logistpl.control().
modcontrol: Controls additional parameters for fitting. Default is modcontrol=logistf.mod.control().
firth: Use of Firth's penalized maximum likelihood (firth=TRUE, the default) or the standard maximum likelihood method (firth=FALSE) for the logistic regression. Note that by specifying pl=TRUE and firth=FALSE (and probably a lower number of iterations) one obtains profile likelihood confidence intervals for maximum likelihood logistic regression parameters.
init: Specifies the initial values of the coefficients for the fitting algorithm.
weights: Specifies case weights. Each line of the input data set is multiplied by the corresponding element of weights.
na.action: A function which indicates what should happen when the data contain NAs.
offset: An a priori known component to be included in the linear predictor.
plconf: Specifies the variables (as a vector of their indices) for which profile likelihood confidence intervals should be computed. Default is to compute them for all variables.
flic: If TRUE, the intercept is altered such that the predicted probabilities become unbiased while keeping all other coefficients constant (see Puhr et al., 2017).
model: If TRUE, the corresponding components of the fit are returned.
...: Further arguments to be passed to logistf.
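
To give a feel for the intercept correction described under flic, here is a minimal base R sketch of one way to alter only the intercept while holding the other coefficients fixed: refit the intercept by ordinary maximum likelihood with the remaining linear predictor supplied as an offset. The toy data and the fixed slope value are made up, and glm() is used as a stand-in; this is not the package's own code.

```r
# Toy data, made up for illustration only
x <- seq(-2, 2, length.out = 20)
y <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1)

# Hypothetical slope, standing in for a penalized estimate that is kept constant
slope <- 1.2

# Re-estimate only the intercept by ordinary ML, passing slope * x as an offset
refit <- glm(y ~ 1, offset = slope * x, family = binomial,
             control = glm.control(epsilon = 1e-12))
intercept_corrected <- unname(coef(refit))

# Predicted probabilities with the corrected intercept and the unchanged slope
p_hat <- plogis(intercept_corrected + slope * x)

# The average predicted probability now matches the observed event rate
c(mean_predicted = mean(p_hat), observed_rate = mean(y))
```

The match between the two printed values follows from the ML score equation for the intercept, which sets the sum of the residuals y - p_hat to zero.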
Georg Heinze and Meinhard Ploner
logistf is the main function of the package. It fits a logistic regression model applying Firth's correction to the likelihood. The following generic methods are available for logistf's output object: print, summary, coef, vcov, confint, anova, extractAIC, add1, drop1, profile, terms, nobs, predict. Furthermore, the forward and backward functions perform convenient variable selection. Note that anova, extractAIC, add1, drop1, forward and backward are based on penalized likelihood ratios.
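
As a rough illustration of what Firth's correction does, the sketch below implements the modified score iteration in base R for a small, quasi-separated 2x2 table, where ordinary maximum likelihood would give an infinite slope. The function firth_sketch and all variable names are hypothetical; the code omits the safeguards a production fitter would use (such as step-halving or a maximum step size) and is not the package's implementation.

```r
# Minimal Firth-type fit: Newton-style steps on the modified score
#   U*(beta) = X' (y - p + h * (0.5 - p)),
# where h holds the diagonal elements of the hat matrix.
firth_sketch <- function(X, y, maxit = 50, tol = 1e-8) {
  beta <- rep(0, ncol(X))
  for (it in seq_len(maxit)) {
    p <- as.vector(plogis(X %*% beta))      # fitted probabilities
    w <- p * (1 - p)                        # logistic variance weights
    info_inv <- solve(crossprod(X, w * X))  # inverse Fisher information (X'WX)^-1
    h <- w * rowSums((X %*% info_inv) * X)  # hat matrix diagonal
    score <- crossprod(X, y - p + h * (0.5 - p))
    step <- as.vector(info_inv %*% score)
    beta <- beta + step
    if (max(abs(step)) < tol) break
  }
  list(coefficients = beta, iterations = it)
}

# Toy 2x2 table, made up for illustration: no events at all when x = 0
x <- rep(c(0, 1), each = 5)
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 0, 0)
X <- cbind(1, x)

fit <- firth_sketch(X, y)

# For a 2x2 table, the penalized estimate coincides with adding 0.5 to each cell
closed_form <- c(qlogis(0.5 / 6), qlogis(3.5 / 6) - qlogis(0.5 / 6))

rbind(iterative = fit$coefficients, closed_form = closed_form)
```

Both rows of the printed matrix should agree and stay finite, which is the practical benefit of the penalty in separated data.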
Firth D (1993). Bias reduction of maximum likelihood estimates. Biometrika 80: 27-38.
Heinze G, Schemper M (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21: 2409-2419.
Heinze G, Ploner M (2003). Fixing the nonconvergence bug in logistic regression with SPLUS and SAS. Computer Methods and Programs in Biomedicine 71: 181-187.
Heinze G, Ploner M (2004). Technical Report 2/2004: A SAS-macro, S-PLUS library and R package to perform logistic regression without convergence problems. Section of Clinical Biometrics, Department of Medical Computer Sciences, Medical University of Vienna, Vienna, Austria. http://www.meduniwien.ac.at/user/georg.heinze/techreps/tr2_2004.pdf
Heinze G (2006). A comparative investigation of methods for logistic regression with separated or nearly separated data. Statistics in Medicine 25: 4216-4226.
Puhr R, Heinze G, Nold M, Lusa L, Geroldinger A (2017). Firth's logistic regression with rare events: accurate effect estimates and predictions? Statistics in Medicine 36: 2302-2317.
Venzon DJ, Moolgavkar AH (1988). A method for computing profile-likelihood based confidence intervals. Applied Statistics 37: 87-94.
add1.logistf(), anova.logistf()
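library(logistf)

# Firth logistic regression on the sex2 data
data(sex2)
fit <- logistf(case ~ age + oc + vic + vicl + vis + dia, data = sex2)
summary(fit)
nobs(fit)
drop1(fit)
# Profile of the penalized likelihood for the coefficient of dia
plot(profile(fit, variable = "dia"))
extractAIC(fit)

# Compare with a reduced model that omits dia
fit1 <- update(fit, case ~ age + oc + vic + vicl + vis)
extractAIC(fit1)
anova(fit, fit1)

# Same model fitted to the aggregated data, using COUNT as case weights
data(sexagg)
fit2 <- logistf(case ~ age + oc + vic + vicl + vis + dia, data = sexagg, weights = COUNT)
summary(fit2)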
# Simulated SNP example
set.seed(72341)
snpdata <- rbind(
  matrix(rbinom(2000, 2, runif(2000) * 0.3), 100, 20),
  matrix(rbinom(2000, 2, runif(2000) * 0.5), 100, 20)
)
colnames(snpdata) <- paste("SNP", 1:20, "_", sep = "")
snpdata <- as.data.frame(snpdata)
snpdata$case <- c(rep(0, 100), rep(1, 100))

# Start from the intercept-only model, then test adding each SNP
fitsnp <- logistf(data = snpdata, formula = case ~ 1, pl = FALSE)
add1(fitsnp, scope = paste("SNP", 1:20, "_", sep = ""), data = snpdata)
# Forward selection over the same scope
fitf <- forward(fitsnp, scope = paste("SNP", 1:20, "_", sep = ""), data = snpdata)
fitf