loclda: Localized Linear Discriminant Analysis (LocLDA)

Description

A localized version of Linear Discriminant Analysis.

Usage

loclda(x, ...)
# S3 method for formula
loclda(formula, data, ..., subset, na.action)
# S3 method for default
loclda(x, grouping, weight.func = function(x) 1/exp(x), 
    k = nrow(x), weighted.apriori = TRUE, ...)
# S3 method for data.frame
loclda(x, ...)
# S3 method for matrix
loclda(x, grouping, ..., subset, na.action)

Value

A list of class loclda containing the following components:

call: The (matched) function call.
learn: Matrix containing the values of the explanatory variables for all training observations.
grouping: Factor specifying the class of each training observation.
weight.func: Value of the argument weight.func.
k: Value of the argument k.
weighted.apriori: Value of the argument weighted.apriori.

Arguments

formula: Formula of the form ‘groups ~ x1 + x2 + ...’.
data: Data frame from which variables specified in formula are to be taken.
x: Matrix or data frame containing the explanatory variables (required if formula is not given).
grouping: Factor specifying the class of each observation (required if formula is not given).
weight.func: Function used to compute local weights. Must be finite over the interval [0,1]. See Details below.
k: Number of nearest neighbours used to construct localized classification rules. See Details below.
weighted.apriori: Logical. If TRUE, class prior probabilities are computed using local weights (see Details below). If FALSE, equal priors are used for all classes that actually occur in the training data.
subset: An index vector specifying the cases to be used in the training sample.
na.action: A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit which leads to rejection of cases with missing values on any required variable.
...: Further arguments to be passed to loclda.default.
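
As a small illustration of the weight.func argument, the sketch below evaluates the default weighting function from the Usage section on the interval [0,1], which is the range of the scaled distances it receives. The second function, w_tricube, is a hypothetical alternative chosen only for illustration; it is not provided by the package.

```r
# Default weighting function, as given in the Usage section
w_default <- function(x) 1 / exp(x)

# Hypothetical alternative (an assumption, not part of the package):
# a tricube-style kernel that gives the k-th neighbour weight 0
w_tricube <- function(x) (1 - x^3)^3

# Scaled distances always lie in [0, 1]
u <- seq(0, 1, by = 0.25)

w_default(u)   # decreasing weights, all strictly positive
w_tricube(u)   # decreasing weights, reaching 0 at u = 1

# Both functions satisfy the requirement of being finite on [0, 1]
all(is.finite(w_default(u)), is.finite(w_tricube(u)))
```

Any function passed as weight.func should return one weight per scaled distance and stay finite over [0,1], as required above.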

Author

Marc Zentgraf (marc-zentgraf@gmx.de) and Karsten Luebke (karsten.luebke@fom.de)

Details

This function applies the concept of localization described by Tutz and Binder (2005) to Linear Discriminant Analysis. The function loclda generates an object of class loclda (see Value above). Because localization requires an individual decision rule for each test observation, the rule is actually constructed by predict.loclda. For convenience, the rule-building procedure is nevertheless described here.

To classify a test observation $x_s$, only the k nearest neighbours of $x_s$ within the training data are used. Each of these k training observations $x_i, i = 1,\dots,k$, is assigned a weight $w_i$ according to $$w_i = K\left(\frac{||x_i-x_s||}{d_k}\right), \quad i=1,\dots,k$$ where $K$ is the weighting function given by weight.func, $||x_i-x_s||$ is the Euclidean distance between $x_i$ and $x_s$, and $d_k$ is the Euclidean distance from $x_s$ to its $k$-th nearest neighbour.

Using these weights, the weighted empirical mean $\hat{\mu}_g$ and the weighted empirical covariance matrix are computed for each class $A_g, g=1,\dots,G$. The estimated pooled (weighted) covariance matrix $\hat{\Sigma}$ is then calculated from the individual weighted empirical class covariance matrices.

If weighted.apriori is TRUE (the default), prior class probabilities are estimated according to $$prior_g := \frac{\sum_{i=1}^k \left(w_i \cdot I (x_i \in A_g)\right)}{\sum_{i=1}^k w_i}$$ where $I$ is the indicator function. If FALSE, equal priors for all classes are used.

In analogy to Linear Discriminant Analysis, the decision rule for $x_s$ is $$\hat{A} := \arg\max_{g \in \{1,\dots,G\}} (posterior_g)$$ where $$posterior_g := prior_g \cdot \exp{\left( -\frac{1}{2} (x_s-\hat{\mu}_g)^{T}\hat{\Sigma}^{-1}(x_s-\hat{\mu}_g)\right)}$$

If $posterior_g < 10^{-150}$ for all $g \in \{1,\dots,G\}$, $posterior_g$ is set to $\frac{1}{G}$ for all $g \in \{1,\dots,G\}$ and the test observation $x_s$ is simply assigned to the class whose weighted mean has the lowest Euclidean distance to $x_s$.
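
The rule described above can be sketched in base R for a single test observation. This is a minimal illustration under stated assumptions, not the code used by predict.loclda: the helper local_lda_rule is hypothetical, the pooled covariance is normalized by the sum of the weights (the text does not specify the normalization), and ties or zero distances are not handled.

```r
# Hypothetical helper (not part of the package): builds the localized
# rule for ONE test observation x_s, following the Details section.
local_lda_rule <- function(x_s, learn, grouping, k = nrow(learn),
                           weight.func = function(x) 1 / exp(x),
                           weighted.apriori = TRUE) {
  learn <- as.matrix(learn)
  # Euclidean distances from x_s to every training observation
  d <- sqrt(rowSums(sweep(learn, 2, x_s)^2))
  nn <- order(d)[seq_len(k)]            # the k nearest neighbours
  w <- weight.func(d[nn] / d[nn[k]])    # w_i = K(||x_i - x_s|| / d_k)
  X <- learn[nn, , drop = FALSE]
  g <- droplevels(factor(grouping)[nn]) # classes present among neighbours
  classes <- levels(g)
  G <- length(classes)
  mu <- matrix(0, G, ncol(X), dimnames = list(classes, colnames(X)))
  Sigma <- matrix(0, ncol(X), ncol(X))
  for (cl in classes) {
    idx <- g == cl
    # Weighted empirical class mean
    mu[cl, ] <- colSums(X[idx, , drop = FALSE] * w[idx]) / sum(w[idx])
    # Accumulate the weighted within-class scatter
    R <- sweep(X[idx, , drop = FALSE], 2, mu[cl, ])
    Sigma <- Sigma + crossprod(R * sqrt(w[idx]))
  }
  # Pooled weighted covariance; this normalization is an assumption
  Sigma <- Sigma / sum(w)
  prior <- if (weighted.apriori) {
    as.vector(tapply(w, g, sum)) / sum(w)
  } else {
    rep(1 / G, G)
  }
  # Squared Mahalanobis distance of x_s to each weighted class mean
  md <- mahalanobis(mu, center = x_s, cov = Sigma)
  posterior <- setNames(prior * exp(-0.5 * md), classes)
  if (all(posterior < 1e-150)) {
    # Fallback: equal posteriors, nearest weighted mean decides
    posterior[] <- 1 / G
    cls <- classes[which.min(rowSums(sweep(mu, 2, x_s)^2))]
  } else {
    cls <- classes[which.max(posterior)]
  }
  list(class = cls, posterior = posterior)
}

# Usage on the built-in iris data with a setosa-like test point
x_s <- c(5.0, 3.4, 1.5, 0.2)
res <- local_lda_rule(x_s, iris[, 1:4], iris$Species, k = 50)
res$class
```

The posteriors returned here are the unnormalized values defined above. With a small k the neighbourhood may contain a single class, in which case that class is returned directly.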

References

Tutz, G. and Binder, H. (2005): Localized classification. Statistics and Computing 15, 155-166.

Examples

benchB3("lda")$l1co.error
benchB3("loclda")$l1co.error
