clv (version 0.3-2.1)

clv.Scatt: Average scattering for clusters - Internal Measure

Description

Computes the average scattering for clusters, an internal cluster validation measure.

Usage

clv.Scatt(data, clust, dist="euclidean")

Arguments

data
numeric matrix or data.frame where columns correspond to variables and rows to observations
clust
integer vector giving the cluster id to which each object is assigned. If the vector is not of integer type, it is coerced with a warning.
dist
chosen metric: "euclidean" (default value), "manhattan", "correlation"

Value

A list with three values is returned.
Scatt
- average scattering for clusters value,
stdev
- standard deviation value,

Details

Let the scatter of a set X, denoted sigma(X), be defined as the vector of variances computed separately for each dimension (variable). Average scattering for clusters is defined as:

Scatt = (1/|C|) * sum{forall i in 1:|C|} ||sigma(Ci)||/||sigma(X)||

where:

|C|
- number of clusters,
i
- cluster id,
Ci
- cluster with id 'i',
X
- set with all objects.

Standard deviation is defined as:

stdev = (1/|C|) * sqrt( sum{forall i in 1:|C|} ||sigma(Ci)|| )
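
The two formulas above can be traced by hand with a few lines of base R. The following is only a minimal sketch, not the clv implementation: the names scatt_sketch, pop_var and l2 are hypothetical helpers, the norm ||.|| is assumed to be the Euclidean norm, and the variance is assumed to be the population variance (divisor n), so the numbers may differ from what clv.Scatt returns. The clustering uses hclust from the stats package only to keep the sketch free of extra packages.

```r
# Sketch of the Scatt and stdev formulas (hypothetical helper, base R only).
scatt_sketch <- function(data, clust) {
  x <- as.matrix(data)

  # sigma(.): vector of per-dimension variances.
  # Assumption: population variance (divide by n), which is 0 for singletons.
  pop_var <- function(m) apply(m, 2, function(v) mean((v - mean(v))^2))

  # ||.||: assumed to be the Euclidean norm.
  l2 <- function(v) sqrt(sum(v^2))

  ids <- sort(unique(clust))                      # cluster ids, |C| = length(ids)
  norms <- sapply(ids, function(i) {
    l2(pop_var(x[clust == i, , drop = FALSE]))    # ||sigma(Ci)||
  })

  list(
    Scatt = mean(norms) / l2(pop_var(x)),         # (1/|C|) * sum ||sigma(Ci)|| / ||sigma(X)||
    stdev = sqrt(sum(norms)) / length(ids)        # (1/|C|) * sqrt(sum ||sigma(Ci)||)
  )
}

# Example on iris, clustered with average-linkage hclust (an illustrative choice).
data(iris)
iris.data <- iris[, 1:4]
v.pred <- as.integer(cutree(hclust(dist(iris.data), method = "average"), 5))
res <- scatt_sketch(iris.data, v.pred)
print(res)
```

As a quick check of the arithmetic under these assumptions: for the one-dimensional data 0, 2, 10, 12 split into clusters {0, 2} and {10, 12}, each cluster has variance 1 and the whole set has variance 26, so the sketch gives Scatt = 1/26 and stdev = sqrt(2)/2.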

References

M. Halkidi, Y. Batistakis, M. Vazirgiannis, On Clustering Validation Techniques, http://citeseer.ist.psu.edu/513619.html

See Also

clv.SD and clv.SDbw

Examples

# load and prepare data
library(clv)
data(iris)
iris.data <- iris[,1:4]

# cluster data
agnes.mod <- agnes(iris.data) # create cluster tree (agnes() is provided by the cluster package)
v.pred <- as.integer(cutree(agnes.mod, 5)) # "cut" the tree into 5 clusters

# compute Scatt index
scatt <- clv.Scatt(iris.data, v.pred)