compare, compare.kda.diag.cv, compare.kda.cv, compare.pda.cv: Comparisons for kernel and parametric discriminant analysis

Description

Comparisons for kernel and parametric discriminant analysis.

Usage

compare(x.group, est.group, by.group=FALSE)
compare.kda.cv(x, x.group, bw="plugin", prior.prob=NULL, Hstart,
    by.group=FALSE, trace=FALSE,...)
compare.kda.diag.cv(x, x.group, bw="plugin", prior.prob=NULL,
    by.group=FALSE, trace=FALSE, ...)
compare.pda.cv(x, x.group, type="quad", prior.prob=NULL,
    by.group=FALSE)

Arguments

Value

The functions compare the true group labels x.group with the estimated ones. Each returns a list with fields
cross: cross-classification table, with the rows indicating the true group and the columns the estimated group
error: misclassification rate (MR)
In the case where we have test data that is independent of the training data, we supply the estimated group labels est.group and

$$\textrm{MR} = \frac{\textrm{number of points wrongly classified}}{\textrm{total number of points}}$$

In the case where we don't have independent test data, e.g. we are classifying the training data set itself, the cross-validated estimate of MR is more appropriate. See Silverman (1986). Cross-validated estimates are implemented for kernel discriminant analysis as compare.kda.cv (full bandwidth selectors) and compare.kda.diag.cv (diagonal bandwidth selectors), and for parametric discriminant analysis as compare.pda.cv.

If by.group=FALSE then only the total MR is given. If it is set to TRUE, then the MRs for each class are also given (estimated number in group divided by true number).
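
The cross-classification table and the MR formula above can be illustrated in base R without the package. This is a minimal sketch on made-up labels, not the package's implementation: the data are hypothetical, and the per-group rate shown (wrongly classified in a group divided by the true size of that group) is one common definition that may differ from what by.group=TRUE reports.

```r
# Hypothetical true and estimated labels (illustrative data only)
x.group   <- factor(c("a", "a", "a", "b", "b", "c"), levels = c("a", "b", "c"))
est.group <- factor(c("a", "b", "a", "b", "b", "a"), levels = c("a", "b", "c"))

# Cross-classification table: rows = true group, columns = estimated group
cross <- table(true = x.group, estimated = est.group)

# Overall MR = number of points wrongly classified / total number of points
mr <- 1 - sum(diag(cross)) / sum(cross)

# Assumed per-group definition: wrongly classified in group / true group size
mr.by.group <- 1 - diag(cross) / rowSums(cross)

print(cross)
print(mr)
print(mr.by.group)
```

Here four of the six points lie on the diagonal of the table, so the overall MR is 2/6.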

Details

If you have prior probabilities then set prior.prob to these. Otherwise the default prior.prob=NULL is used, i.e. the sample proportions serve as estimates of the prior probabilities.
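
As a rough illustration of what "sample proportions" means here, they can be computed from the group labels in base R. This sketch only shows the quantity itself; how the package computes it internally is not stated on this page.

```r
# Group labels from the iris data (three species, 50 observations each)
x.group <- iris[, 5]

# Sample proportion of each group, usable as an estimate of the prior probabilities
prior.est <- as.vector(table(x.group)) / length(x.group)

print(prior.est)
```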

If trace=TRUE, a message is printed in the command line indicating that it's processing the i-th data item. This can be helpful since the cross-validated estimates may take a long time to execute completely.

The linear and quadratic discriminant analysers are based on lda and qda from the MASS package.
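
For comparison, a leave-one-out cross-validated MR for quadratic discriminant analysis can be obtained directly from MASS, since qda (like lda) accepts CV=TRUE. This is a sketch of the underlying idea and is not claimed to reproduce compare.pda.cv exactly; the variable names are chosen for illustration.

```r
library(MASS)

# Restricted iris data: first two measurements and the species labels
ir    <- iris[, c(1, 2)]
ir.gr <- iris[, 5]

# CV = TRUE returns leave-one-out predicted classes instead of a fitted model
fit.cv <- qda(ir, ir.gr, CV = TRUE)

# Cross-classification table: rows = true group, columns = estimated group
cross.cv <- table(true = ir.gr, estimated = fit.cv$class)

# Cross-validated misclassification rate
mr.cv <- 1 - sum(diag(cross.cv)) / sum(cross.cv)

print(cross.cv)
print(mr.cv)
```

Replacing qda with lda gives the linear analogue.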

References

Silverman, B. W. (1986) Density Estimation for Statistics and Data Analysis. Chapman & Hall. London.

Simonoff, J. S. (1996) Smoothing Methods in Statistics. Springer-Verlag. New York.

Venables, W.N. & Ripley, B.D. (1997) Modern Applied Statistics with S-PLUS. Springer-Verlag. New York.

Examples


### bivariate example - restricted iris dataset  

library(ks)    # provides compare.kda.cv and compare.pda.cv
library(MASS)
data(iris)
ir <- iris[,c(1,2)]
ir.gr <- iris[,5]

compare.kda.cv(ir, ir.gr, bw="plugin", pilot="samse")
compare.pda.cv(ir, ir.gr, type="quad")
