logitreg: Logistic regression models for assessing analogues/non-analogues

Description

Fits a logistic regression model to each level of groups, modelling the probability that two samples are analogues conditional upon the dissimilarity between them.

Usage

logitreg(object, groups, k = 1, ...)
# S3 method for default
logitreg(object, groups, k = 1,
         biasReduced = FALSE, ...)
# S3 method for analog
logitreg(object, groups, k = 1, ...)
# S3 method for logitreg
summary(object, p = 0.9, ...)

Value

logitreg returns an object of class "logitreg"; a list whose components are objects returned by glm. See glm for further details on the returned objects.

The components of this list take their names from the levels of groups.

For summary.logitreg, an object of class "summary.logitreg"; a data frame with summary statistics of the model fits. The components of this data frame are:

In, Out: The number of analogue and non-analogue dissimilarities analysed in each group.
Est.(Dij), Std.Err: Coefficient for dissimilarity from the logit model, and its standard error.
Z-value, p-value: Wald statistic and associated p-value for each logit model.
Dij(p=?), Std.Err(Dij): The dissimilarity at which the posterior probability of two samples being analogues is equal to \(p\), and its standard error. These are computed using dose.p.
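
The Dij(p=?) statistic can be illustrated outside the package. The sketch below is not the internal code of summary.logitreg; it fits a plain binomial glm to simulated data (the variables Dij and analogue are made up for the illustration) and then calls dose.p from the MASS package with p = 0.9, the default used by summary. The point estimate is also reproduced by hand from the fitted coefficients.

```r
library(MASS)  # provides dose.p()

## Simulated illustration only; Dij and analogue are hypothetical data
set.seed(42)
Dij <- runif(200, min = 0, max = 2)
## pairs are more likely to be analogues at low dissimilarity
analogue <- rbinom(200, size = 1, prob = plogis(3 - 4 * Dij))
mod <- glm(analogue ~ Dij, family = binomial)

## dissimilarity at which P(analogue) = 0.9, with its standard error
d90 <- dose.p(mod, cf = 1:2, p = 0.9)
d90

## the same point estimate by hand: (logit(p) - intercept) / slope
b <- coef(mod)
manual <- unname((qlogis(0.9) - b[1]) / b[2])
manual
```

The standard error reported by dose.p is stored in the "SE" attribute of the returned object.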

Arguments

object: for logitreg; a full dissimilarity matrix, or an object of class "analog" that retains the training set dissimilarities. For summary.logitreg, an object of class "logitreg", the result of a call to logitreg.
groups: factor (or object that can be coerced to one) containing the group membership for each sample in object.
k: numeric; the k closest analogues to use in the model fitting.
biasReduced: logical; should Firth's method for bias reduced logistic regression be used to fit the models? If TRUE, model fits are performed via brglm. The default, FALSE, indicates that models will be fitted via the standard glm function.
p: probability at which to predict the dissimilarity (the "dose" in the terminology of dose.p).
...: arguments passed to other methods. These arguments are passed on to glm or brglm. See their respective help pages for details. Note that logitreg sets the formula, data, and family arguments internally, and hence these cannot be specified by the user.

Author

Gavin L. Simpson

Details

Fits a logistic regression model to each level of groups, modelling the probability that two samples are analogues (i.e. in the same group) conditional upon the dissimilarity between them.
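
As a sketch of the model fitted for a single group, under notation assumed here rather than taken from the package: let \(A_{ij} = 1\) when samples \(i\) and \(j\) are analogues and \(0\) otherwise, and let \(D_{ij}\) be their dissimilarity. A logit model with dissimilarity as the only predictor then has the form

\[
\operatorname{logit}\{\Pr(A_{ij} = 1 \mid D_{ij})\} = \beta_0 + \beta_1 D_{ij},
\qquad
\Pr(A_{ij} = 1 \mid D_{ij}) = \frac{1}{1 + \exp\{-(\beta_0 + \beta_1 D_{ij})\}} .
\]

Here \(\beta_1\) corresponds to the coefficient reported as Est.(Dij) by summary, and is expected to be negative because analogues tend to have low dissimilarity.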

This function can be seen as a way of directly modelling the probability that two sites are analogues, conditional upon dissimilarity. The same quantity can be obtained less directly using roc and bayesF.

Often, the number of true analogues in the training set is small, both in absolute terms and as a proportion of comparisons. Logistic regression is known to suffer from a small-sample bias. Firth's method of bias reduction is a general solution to this problem and is implemented in logitreg through the brglm package of Ioannis Kosmidis.
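
The difference between the two fitting routes can be sketched on simulated data with few analogues. This is an illustration under assumed data (the variables Dij and analogue are invented), not the code logitreg runs internally. The bias-reduced fit is attempted only if the brglm package is installed.

```r
## Simulated illustration only; Dij and analogue are hypothetical data
set.seed(1)
n <- 60
Dij <- runif(n, min = 0, max = 2)
## a small sample in which true analogues are relatively rare
analogue <- rbinom(n, size = 1, prob = plogis(0.5 - 3 * Dij))
dat <- data.frame(analogue, Dij)

## standard maximum likelihood fit, as used when biasReduced = FALSE
ml <- glm(analogue ~ Dij, data = dat, family = binomial)
coef(ml)

## bias-reduced fit via brglm, run only if the package is available
if (requireNamespace("brglm", quietly = TRUE)) {
  br <- brglm::brglm(analogue ~ Dij, data = dat, family = binomial)
  print(coef(br))
}
```

Comparing the two sets of coefficients gives a feel for how much the bias-reduced estimates differ from the maximum likelihood estimates in a small sample.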

References

Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika 80, 27-38.

Examples

## load the example data
data(swapdiat, swappH, rlgh)

## merge training and test set on columns
dat <- join(swapdiat, rlgh, verbose = TRUE)

## extract the merged data sets and convert to proportions
swapdiat <- dat[[1]] / 100
rlgh <- dat[[2]] / 100

## fit an analogue matching (AM) model using the squared chord distance
## measure - need to keep the training set dissimilarities
swap.ana <- analog(swapdiat, rlgh, method = "SQchord",
                   keep.train = TRUE)

## generate a grouping for the SWAP lakes by hierarchical clustering
## of the training set dissimilarities from the AM results
METHOD <- if (getRversion() < "3.1.0") {"ward"} else {"ward.D"}
clust <- hclust(as.dist(swap.ana$train), method = METHOD)
grps <- cutree(clust, 6)

## fit the logit models to the analog object
swap.lrm <- logitreg(swap.ana, grps)
swap.lrm

## summary statistics
summary(swap.lrm)

## plot the fitted logit curves
plot(swap.lrm, conf.type = "polygon")

## extract fitted posterior probabilities for training samples
## for the individual groups
fit <- fitted(swap.lrm)
head(fit)

## compute posterior probabilities of analogue-ness for the rlgh
## samples. Here we take the dissimilarities between fossil and
## training samples from the `swap.ana` object rather than
## recompute them
pred <- predict(swap.lrm, newdata = swap.ana$analogs)
head(pred)

## Bias reduction
## fit the logit models to the analog object
swap.brlrm <- logitreg(swap.ana, grps, biasReduced = TRUE)
summary(swap.brlrm)