
mstknnclust (version 0.3.2)

mst.knn: Performs the MST-kNN clustering algorithm

Description

Performs the MST-kNN clustering algorithm, which generates a clustering solution and determines the number of clusters automatically. It uses two proximity graphs, the Minimal Spanning Tree (MST) and the k-Nearest Neighbor (kNN) graph, which are recursively intersected.

The MST is built with Prim's algorithm. The kNN graph is built from the distance.matrix passed as input.
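
The idea of intersecting the two proximity graphs can be sketched in base R. This is a minimal illustration of a single intersection step on toy data, not the package's implementation and not the full recursive procedure. The helper names (prim_mst, knn_graph, components_of) are hypothetical, and the kNN graph is assumed to link two objects when either one is among the other's k nearest neighbors.

```r
# Illustrative sketch only: one MST/kNN intersection step in base R.
# prim_mst, knn_graph and components_of are hypothetical helpers,
# not functions exported by mstknnclust.

# Prim's algorithm: returns the MST as a symmetric logical adjacency matrix.
prim_mst <- function(d) {
  n <- nrow(d)
  adj <- matrix(FALSE, n, n)
  in_tree <- c(TRUE, rep(FALSE, n - 1))    # grow the tree from vertex 1
  best_dist <- unname(d[1, ])              # cheapest link from the tree to each vertex
  best_from <- rep(1L, n)                  # tree endpoint of that link
  for (step in seq_len(n - 1)) {
    cand <- which(!in_tree)
    v <- cand[which.min(best_dist[cand])]  # closest vertex still outside the tree
    u <- best_from[v]
    adj[u, v] <- adj[v, u] <- TRUE
    in_tree[v] <- TRUE
    closer <- !in_tree & d[v, ] < best_dist
    best_dist[closer] <- d[v, closer]
    best_from[closer] <- v
  }
  adj
}

# kNN graph (assumed definition): i and j are linked if either one is
# among the k nearest neighbors of the other.
knn_graph <- function(d, k) {
  n <- nrow(d)
  adj <- matrix(FALSE, n, n)
  for (i in seq_len(n)) {
    others <- setdiff(seq_len(n), i)
    nn <- others[order(d[i, others])][seq_len(min(k, n - 1))]
    adj[i, nn] <- TRUE
  }
  adj | t(adj)
}

# Connected components of an adjacency matrix via breadth-first search.
components_of <- function(adj) {
  label <- integer(nrow(adj))
  current <- 0L
  for (s in seq_len(nrow(adj))) {
    if (label[s] != 0L) next
    current <- current + 1L
    label[s] <- current
    queue <- s
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[v, ] & label == 0L)
      label[nb] <- current
      queue <- c(queue, nb)
    }
  }
  label
}

# Toy data: two well-separated groups of points on a line.
d <- as.matrix(dist(c(0, 1, 2, 10, 11, 12)))
mst <- prim_mst(d)
knn <- knn_graph(d, k = 2)
both <- mst & knn                  # keep only the edges present in both graphs
membership <- components_of(both)  # each connected component is a candidate cluster
```

In this toy case the long MST edge that bridges the two groups is absent from the kNN graph, so the intersection splits into two components. The package repeats this kind of step recursively on each component.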

Usage

mst.knn(distance.matrix, suggested.k)

Value

A list with the following elements:

cnumber

A numeric value representing the number of clusters of the solution.

cluster

A named vector of integers in 1:cnumber giving the cluster to which each object is assigned.

partition

A partition matrix, ordered by cluster, listing each object and the cluster to which it is assigned.

csize

A vector with the cardinality of each cluster in the solution.

network

An object of class "igraph" as a network representing the clustering solution.
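
How these elements relate to one another can be sketched with a made-up cluster vector. The values and the data.frame layout below are assumptions for illustration only; the exact structure returned by mst.knn may differ.

```r
# Hypothetical cluster assignment for six objects named a to f.
cluster <- c(a = 1L, b = 2L, c = 1L, d = 3L, e = 2L, f = 1L)

# Number of clusters in the solution.
cnumber <- length(unique(cluster))

# Cardinality of each cluster, in cluster order.
csize <- as.vector(table(cluster))

# Objects listed with their cluster, ordered by cluster
# (an assumed layout for the partition element).
partition <- data.frame(object = names(cluster), cluster = unname(cluster))
partition <- partition[order(partition$cluster), ]
```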

Arguments

distance.matrix

A numeric matrix or data.frame with equal numbers of rows and columns, containing the distances between the objects to group.

suggested.k

Optional. A numeric value giving the suggested number of nearest neighbors (k) to consider when generating the kNN graph. Note that, because of how the algorithm operates, the value of k actually used may differ during execution.
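
A minimal sketch of preparing the input and passing both arguments. The helper is_valid_distance_input is hypothetical and only mirrors the requirements stated above; the call to mst.knn is guarded so that the snippet also runs when the package is not installed.

```r
# Hypothetical helper: checks that the input is numeric and square,
# as required for distance.matrix.
is_valid_distance_input <- function(d) {
  m <- as.matrix(d)
  is.numeric(m) && nrow(m) == ncol(m)
}

set.seed(1)
x <- matrix(runif(60), nrow = 20, ncol = 3)
d <- as.matrix(dist(x, method = "euclidean"))

stopifnot(is_valid_distance_input(d))

# Run the clustering only if mstknnclust is available.
if (requireNamespace("mstknnclust", quietly = TRUE)) {
  # suggested.k is only a suggestion; the algorithm may use another value.
  res <- mstknnclust::mst.knn(d, suggested.k = 3)
  print(res$cnumber)
}
```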

Author

Mario Inostroza-Ponta, Jorge Parraga-Alava, Pablo Moscato

Details

For more details on how MST-kNN works, refer to the quick guide.

References

Inostroza-Ponta, M. (2008). An Integrated and Scalable Approach Based on Combinatorial Optimization Techniques for the Analysis of Microarray Data. Ph.D. thesis, School of Electrical Engineering and Computer Science. University of Newcastle.

Examples


set.seed(1987)

##load package
library("mstknnclust")

##Generates a data matrix of dimension 100 x 15

n <- 100; m <- 15

x <- matrix(runif(n*m, min = -5, max = 10), nrow=n, ncol=m)

##Computes a distance matrix of x.

library("stats")
d <- base::as.matrix(stats::dist(x, method="euclidean"))

##Performs MST-kNN clustering using euclidean distance.

results <- mst.knn(d)

## Visualizes the clustering solution

library("igraph")
plot(results$network, vertex.size=8,
     vertex.color=igraph::components(results$network)$membership,
     layout=igraph::layout_with_fr(results$network, niter=10000),
     main=paste("MST-kNN \n Clustering solution \n Number of clusters=",results$cnumber,sep="" ))
