When J.H. Friedman introduced Regularized Discriminant Analysis
(rda) in 1989, he used artificially generated data
to test the procedure and to compare its performance to
Linear and Quadratic Discriminant Analysis
(see also lda and qda).
Six different settings were considered to demonstrate potential strengths
and weaknesses of the new method:

1. equal spherical covariance matrices,
2. unequal spherical covariance matrices,
3. equal, highly ellipsoidal covariance matrices with mean
   differences in the low-variance subspace,
4. equal, highly ellipsoidal covariance matrices with mean
   differences in the high-variance subspace,
5. unequal, highly ellipsoidal covariance matrices with zero mean
   differences, and
6. unequal, highly ellipsoidal covariance matrices with nonzero mean
   differences.
For each of the six settings, data were generated with 6, 10, 20, and 40
variables.
Classification performance was then measured by repeatedly creating
training datasets of 40 observations and estimating the
misclassification rates on test sets of 100 observations.
The number of classes is always 3. Class labels are assigned randomly
to the observations with equal probabilities, so the class proportions
differ from dataset to dataset. To ensure that the class covariance
matrices can be estimated at all, each dataset always contains at least
two observations from each class.
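The scheme above can be sketched in code. The following is a minimal illustration of the first setting (equal spherical covariance matrices) and of one Monte Carlo repetition of the error estimate; the class means, their spacing, and the nearest-class-mean classifier used here are assumptions for illustration, not Friedman's exact choices.

```python
import numpy as np

def make_friedman_dataset(n, p=6, n_classes=3, rng=None):
    """One dataset from the equal-spherical-covariance setting.

    The class-mean layout is an illustrative assumption, not
    Friedman's exact configuration.
    """
    rng = np.random.default_rng(rng)
    # Redraw labels until every class has at least two observations,
    # so class covariance matrices can be estimated at all.
    while True:
        y = rng.integers(0, n_classes, size=n)  # equal class probabilities
        if np.bincount(y, minlength=n_classes).min() >= 2:
            break
    means = np.zeros((n_classes, p))
    means[:, 0] = 3.0 * np.arange(n_classes)    # assumed mean spacing
    X = means[y] + rng.standard_normal((n, p))  # identity covariance
    return X, y

# One repetition: train on 40 observations, estimate the
# misclassification rate on an independent test set of 100
# observations, here with a nearest-class-mean rule as a
# simple stand-in classifier.
rng = np.random.default_rng(1)
X_train, y_train = make_friedman_dataset(40, rng=rng)
X_test, y_test = make_friedman_dataset(100, rng=rng)
fitted = np.stack([X_train[y_train == k].mean(axis=0) for k in range(3)])
dists = ((X_test[:, None, :] - fitted[None, :, :]) ** 2).sum(axis=2)
error_rate = np.mean(dists.argmin(axis=1) != y_test)
```

In the actual study this repetition would be run many times per setting and per dimensionality (6, 10, 20, 40 variables), averaging the test-set error rates.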