Separation occurs in binomial response models when a combination of the predictor variables perfectly predicts a level of the response. In such a case the estimates of the coefficients for these variables diverge to ±infinity, and the numerical algorithms typically fail. To anticipate such a problem, the fitting functions in spaMM try to check for separation by default. The check can be slow, and is skipped if the "problem size" exceeds a threshold defined by spaMM.options(separation_max=<.>), in which case a message tells users by how much they should increase separation_max to force the check (its exact meaning and default value are subject to change without notice, but the default value aims to correspond to a separation check time of the order of 1s on the author's computer).
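The divergence described above can be illustrated with base R alone. The following is a minimal sketch, not part of spaMM: the toy vectors x and y are made up for illustration, and stats::glm stands in for any binomial fitting routine.

```r
# Toy data (hypothetical): the response is 1 exactly when x > 5,
# so x perfectly separates the two response levels.
x <- 1:10
y <- as.integer(x > 5)

# glm() typically warns on such data (e.g., fitted probabilities numerically
# 0 or 1); the warnings are suppressed here only to keep the output short.
fit <- suppressWarnings(glm(y ~ x, family = binomial()))

# The iterations stop at arbitrarily large coefficient values rather than
# at a finite maximum-likelihood estimate.
coef(fit)
```

The reported coefficients depend on when the iterations stop, which is why a separation check before fitting is more informative than inspecting the estimates afterwards.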
is_separated is a convenient interface to procedures from the ROI package, allowing them to be called explicitly by the user to check bootstrap samples (see Example in anova).
is_separated.formula is a variant (not yet a formal S3 method) that performs the same check, but using arguments similar to those of fitme(., family=binomial()).
is_separated(x, y, verbose = TRUE, solver=spaMM.getOption("sep_solver"))
is_separated.formula(formula, ..., separation_max=spaMM.getOption("separation_max"),
solver=spaMM.getOption("sep_solver"))
Returns a boolean; TRUE means there is (quasi-)separation. Screen output may give further information, such as pointing to model terms that cause separation.
x: Design matrix for fixed effects.
y: Numeric response vector.
formula: A model formula.
...: data and possibly other arguments of a fitme call. family is ignored if present.
separation_max: Numeric; non-default values allow for easier local control of this spaMM option.
solver: Character; name of the linear programming solver used to assess separation, passed to ROI_solve's solver argument. One can select another solver if the corresponding ROI plugin is installed.
verbose: Whether to print some messages (e.g., pointing to model terms that cause separation) or not.
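As a hedged illustration of the matrix interface, the sketch below builds a design matrix with model.matrix and passes it to is_separated together with the response. The toy data frame is hypothetical, and the code assumes that spaMM and the ROI.plugin.glpk package (taken here to be the plugin behind solver="glpk") are installed; if they are not, the result is left as NA.

```r
# Toy data (hypothetical): success is 1 exactly when x > 5.
d <- data.frame(x = 1:10)
d$success <- as.integer(d$x > 5)

# Design matrix for the fixed effects: an intercept column and x.
X <- model.matrix(~ x, data = d)

sep <- NA  # stays NA if the required packages are not available
if (requireNamespace("spaMM", quietly = TRUE) &&
    requireNamespace("ROI.plugin.glpk", quietly = TRUE)) {
  # verbose = FALSE suppresses the messages about offending model terms.
  sep <- spaMM::is_separated(X, d$success, verbose = FALSE)
}
sep
```

Building the matrix explicitly is mainly useful when the same design is checked repeatedly, for example against resampled responses; for a one-off check, is_separated.formula avoids the manual step.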
The method accessible by solver="glpk" implements algorithms described by Konis, K. 2007. Linear Programming Algorithms for Detecting Separated Data in Binary Logistic Regression Models. DPhil Thesis, Univ. Oxford. https://ora.ox.ac.uk/objects/uuid:8f9ee0d0-d78e-4101-9ab4-f9cbceed2a2a.
See also the 'safeBinaryRegression' and 'detectseparation' packages.
set.seed(123)
d <- data.frame(success = rbinom(10, size = 1, prob = 0.9), x = 1:10)
is_separated.formula(formula= success~x, data=d) # FALSE
is_separated.formula(formula= success~I(success^2), data=d) # TRUE