sep_dist: Distance based on separation of clusters

Description

The separation of a cluster is defined as the minimum distance from a point in that cluster to a point in another cluster. The clusters can be provided as a third column (clustering = TRUE); if not, hierarchical clustering is used to obtain nclust clusters. The separations are calculated for dataset X, and the same is done for dataset PX. The Euclidean distance between the separations for X and those for PX is then calculated.
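The calculation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the names `cluster_separation` and `sep_dist_sketch` are hypothetical, only the "separation" measure is covered, and sorting the separations before comparing them (so that the result does not depend on how clusters are labelled) is an assumption of this sketch.

```r
# Hypothetical helper: for each cluster, the minimum distance from any of
# its points to any point outside the cluster.
cluster_separation <- function(xy, labels) {
  d <- as.matrix(dist(xy))  # pairwise Euclidean distances
  sapply(sort(unique(labels)), function(k) {
    inside <- labels == k
    min(d[inside, !inside])
  })
}

# Hypothetical sketch of the distance between X and PX.
sep_dist_sketch <- function(X, PX, clustering = FALSE, nclust = 3) {
  labels_for <- function(df) {
    if (clustering) {
      df[[3]]                                     # labels from third column
    } else {
      cutree(hclust(dist(df[, 1:2])), k = nclust) # hierarchical clustering
    }
  }
  # Assumption: sort so the comparison ignores cluster labelling.
  sX  <- sort(cluster_separation(X[, 1:2],  labels_for(X)))
  sPX <- sort(cluster_separation(PX[, 1:2], labels_for(PX)))
  sqrt(sum((sX - sPX)^2))
}

# Toy data: three well-separated pairs of points.
X  <- data.frame(x = c(0, 0, 5, 5, 10, 10), y = c(0, 1, 0, 1, 0, 1),
                 cl = c(1, 1, 2, 2, 3, 3))
PX <- data.frame(x = c(0, 0, 3, 3, 10, 10), y = c(0, 1, 0, 1, 0, 1),
                 cl = c(1, 1, 2, 2, 3, 3))

res_labels <- sep_dist_sketch(X, PX, clustering = TRUE)  # sqrt(12), about 3.46
res_hclust <- sep_dist_sketch(X, PX, nclust = 3)         # same clusters here
```

In the toy data the separations are (5, 5, 5) for X and (3, 3, 7) for PX, so the distance is sqrt(4 + 4 + 4). Hierarchical clustering recovers the same three pairs, which is why both calls agree.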

Usage

sep_dist(X, PX, clustering = FALSE, nclust = 3, type = "separation")

Value

distance between X and PX

Arguments

X: a data.frame with two or three columns, the first two columns providing the dataset
PX: a data.frame with two or three columns, the first two columns providing the dataset
clustering: logical; if TRUE, the third column is used as the clustering variable. Defaults to FALSE
nclust: the number of clusters to be obtained by hierarchical clustering, by default nclust = 3
type: character string specifying which measure to use for the distance, see ?cluster.stats in the fpc package for details

Examples

# Use the third column (number of cylinders) as the cluster labels.
if (require('fpc')) {
  with(mtcars, sep_dist(data.frame(wt, mpg, as.numeric(as.factor(cyl))),
                        data.frame(sample(wt), mpg, as.numeric(as.factor(cyl))),
                        clustering = TRUE))
}

# clustering defaults to FALSE, so the clusters are obtained by
# hierarchical clustering with nclust = 3.
if (require('fpc')) {
  with(mtcars, sep_dist(data.frame(wt, mpg, as.numeric(as.factor(cyl))),
                        data.frame(sample(wt), mpg, as.numeric(as.factor(cyl))),
                        nclust = 3))
}