Several distance functions are implemented in the UBL package. This variety gives users more flexibility in choosing the distance used, and it provides distance functions that can handle nominal features, numeric features, or both. The options available for the distance functions are as follows:
- data with only numeric features: "Manhattan", "Euclidean", "Canberra", "Chebyshev", "p-norm";
- data with only nominal features: "Overlap";
- data with both nominal and numeric features: "HEOM", "HVDM".
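
The grouping above can be expressed as a small lookup. The following is a minimal sketch in Python rather than R, and it is not part of UBL: the names DISTANCES_BY_DATA, data_kind, and check_distance are hypothetical helpers introduced only for illustration. It mirrors the list above and does not reproduce any validation that the package itself may perform.

```python
# Distance names exactly as listed above, grouped by the kind of features
# they are listed for. This table is illustrative, not UBL's internal code.
DISTANCES_BY_DATA = {
    "numeric": {"Manhattan", "Euclidean", "Canberra", "Chebyshev", "p-norm"},
    "nominal": {"Overlap"},
    "mixed": {"HEOM", "HVDM"},
}


def data_kind(is_nominal):
    """Classify a data set from per-feature flags (True = nominal feature)."""
    flags = list(is_nominal)
    if not flags:
        raise ValueError("at least one feature is required")
    if all(flags):
        return "nominal"
    if not any(flags):
        return "numeric"
    return "mixed"


def check_distance(dist, is_nominal):
    """Return True when `dist` is listed for the given kind of data."""
    return dist in DISTANCES_BY_DATA[data_kind(is_nominal)]
```

For example, under these assumptions check_distance("HEOM", [True, False]) is true because the data mix nominal and numeric features, while check_distance("Euclidean", [True, False]) is false.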
When "p-norm" is selected for the dist parameter, the value of parameter p must also be defined. The value of p sets which p-norm is used. For instance, if p is set to 1, the "1-norm" (or Manhattan distance) is used, and if p is set to 2, the "2-norm" (or Euclidean distance) is applied.
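
To make the role of p concrete, here is a minimal sketch of the standard p-norm (Minkowski) distance between two numeric vectors. It is written in Python for illustration and is not UBL's implementation; the function name p_norm_distance and the sample vectors are assumptions made for this example.

```python
def p_norm_distance(x, y, p):
    """Standard p-norm distance: (sum_i |x_i - y_i|^p)^(1/p), for p >= 1."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# Two illustrative points whose coordinate differences are 3, 4 and 0.
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]

manhattan = p_norm_distance(x, y, 1)  # sum of absolute differences
euclidean = p_norm_distance(x, y, 2)  # square root of summed squares
```

With these vectors, p = 1 gives 3 + 4 + 0 = 7 (the Manhattan distance) and p = 2 gives the square root of 9 + 16 + 0, which is 5 (the Euclidean distance).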
For more details regarding the distance functions implemented in the UBL package, please see the package vignettes.