The Kennard--Stone algorithm selects samples that are uniformly
distributed over the predictor space (Kennard and Stone, 1969).
It starts by selecting the pair of points that are the farthest apart.
They are assigned to the calibration set and removed from the list of points.
Then, the procedure assigns the remaining points to the calibration set
by computing the distance between each unassigned point
$i_0$ and each selected point $i$,
and finding the point for which:
\[ d_{selected} = \max_{i_0}\left(\min_{i}\left(d_{i,i_0}\right)\right) \]
This essentially selects the point $i_0$ that is farthest from its
closest neighbor $i$ in the calibration set.
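The max-min rule above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the package's implementation: the function name `kennard_stone` and the toy data are hypothetical, distances are Euclidean, and ties are broken by taking the first candidate.

```python
from itertools import combinations
from math import dist


def kennard_stone(points, k):
    """Return the indices of k points chosen by a basic max-min selection."""
    n = len(points)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of points")

    # Start with the pair of points that are farthest apart.
    first, second = max(
        combinations(range(n), 2),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]

    # For each unassigned point i0, track the distance to its closest selected point i.
    nearest = {
        i0: min(dist(points[i0], points[i]) for i in selected) for i0 in remaining
    }

    while len(selected) < k:
        # Pick the unassigned point whose closest selected neighbor is farthest away.
        chosen = max(remaining, key=lambda i0: nearest[i0])
        selected.append(chosen)
        remaining.remove(chosen)
        del nearest[chosen]
        # Only the newly selected point can shorten the nearest distances.
        for i0 in remaining:
            nearest[i0] = min(nearest[i0], dist(points[i0], points[chosen]))
    return selected


# Hypothetical one-dimensional data: the two extremes are picked first, then the middle.
toy_points = [(0.0,), (10.0,), (5.0,), (1.0,), (9.0,)]
print(kennard_stone(toy_points, 3))  # [0, 1, 2]
```

Keeping a running nearest-neighbor distance for each unassigned point avoids recomputing all distances to the calibration set at every iteration.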
The algorithm uses the Euclidean distance to select the points. However,
the Mahalanobis distance can also be used. This can be achieved by performing
a PCA on the input data and computing the Euclidean distance on the truncated
score matrix according to the following definition of the Mahalanobis $H$
distance:
\[ H_{ij}^2 = \sum_{a=1}^{A} \frac{(\hat t_{ia} - \hat t_{ja})^2}{\hat \lambda_a} \]
where $\hat t_{ia}$ is the $a^{th}$ principal component
score of point $i$, $\hat t_{ja}$ is the
corresponding value for point $j$,
$\hat \lambda_a$ is the eigenvalue of principal
component $a$ and $A$ is the number of principal components
included in the computation.
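As a small numerical check of this definition, the sketch below computes $H_{ij}^2$ directly from two score vectors and shows that it equals the squared Euclidean distance between the scores once each component is divided by the square root of its eigenvalue. The function names and the numbers are hypothetical, and the scores and eigenvalues are assumed to have already been obtained from a PCA.

```python
from math import dist, isclose, sqrt


def mahalanobis_sq(t_i, t_j, eigenvalues):
    """Squared Mahalanobis distance H_ij^2 computed from PCA scores."""
    return sum((a - b) ** 2 / lam for a, b, lam in zip(t_i, t_j, eigenvalues))


def scale_scores(t, eigenvalues):
    """Divide each score by the square root of its eigenvalue."""
    return [a / sqrt(lam) for a, lam in zip(t, eigenvalues)]


# Hypothetical scores on A = 2 components and their eigenvalues.
t_i, t_j = [2.0, 1.0], [0.0, 0.0]
eigenvalues = [4.0, 1.0]

h_sq = mahalanobis_sq(t_i, t_j, eigenvalues)  # 4/4 + 1/1 = 2.0
euclid = dist(scale_scores(t_i, eigenvalues), scale_scores(t_j, eigenvalues))
print(h_sq, isclose(h_sq, euclid ** 2))
```

Under these assumptions, a Euclidean selection routine can be reused unchanged by passing it the scaled scores.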
When the group argument is used, the sampling is conducted in such a
way that at each iteration, when a single sample is selected, this sample,
along with all the samples that belong to its group, is assigned to the
final calibration set. As a result, each iteration adds one sample (when
that sample is the only one in its group) or more to the calibration set.
This also implies that the argument k passed to the function will not
necessarily reflect the exact number of samples selected. For example, if
k = 2 and the first sample identified belongs to a group of 5 samples and
the second one belongs to a group of 10 samples, then the total number of
samples retrieved by the function will be 15.
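The effect of grouping on the number of selected samples can be illustrated with the following sketch. It is a simplified reading of the description above, not the package's code: the function name `kennard_stone_grouped`, the toy data, and the choice to count the two starting samples as the first two selections are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def kennard_stone_grouped(points, groups, k):
    """Max-min selection in which each picked sample brings its whole group."""
    n = len(points)
    members = {}
    for idx, label in enumerate(groups):
        members.setdefault(label, []).append(idx)

    selected = []

    def assign(idx):
        # Assign the picked sample together with every sample in its group.
        for m in members[groups[idx]]:
            if m not in selected:
                selected.append(m)

    # Assumption: the two farthest-apart samples count as the first two picks.
    first, second = max(
        combinations(range(n), 2),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )
    assign(first)
    assign(second)
    picks = 2

    while picks < k and len(selected) < n:
        remaining = [i for i in range(n) if i not in selected]
        # Pick the unassigned sample that is farthest from its closest selected sample.
        chosen = max(
            remaining,
            key=lambda i0: min(dist(points[i0], points[i]) for i in selected),
        )
        assign(chosen)
        picks += 1
    return selected


# Hypothetical data: a group of 5 near 0, a group of 2 near 50, a group of 10 near 100.
xs = [0, 1, 2, 3, 4, 50, 51] + list(range(100, 110))
toy_points = [(float(x),) for x in xs]
toy_groups = ["a"] * 5 + ["c"] * 2 + ["b"] * 10
print(len(kennard_stone_grouped(toy_points, toy_groups, 2)))  # 15
```

With k = 2, the two starting samples fall in the groups of 5 and 10 samples, so 15 samples are returned, matching the example in the text.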