nomclust (version 2.1.6)

eskin: Eskin (ES) Measure

Description

Calculates a proximity (dissimilarity) matrix based on the Eskin (ES) similarity measure.

Usage

eskin(data)

Arguments

data

A data.frame or a matrix with cases in rows and variables in columns.

Value

The function returns an n x n dissimilarity matrix, where n is the number of objects (rows) in the dataset passed in the argument data.

Details

The Eskin similarity measure was proposed by Eskin et al. (2002) and examined by Boriah et al. (2008). It is constructed to assign higher weights to mismatches on variables with more categories.
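
The weighting can be illustrated with a minimal base-R sketch. It is not the package's source code. It assumes the per-variable form given by Boriah et al. (2008): a match scores 1, and a mismatch on a variable with n_k categories scores n_k^2 / (n_k^2 + 2). Two further assumptions are made here: n_k is taken as the number of categories observed in the data, and the averaged similarity S is turned into a dissimilarity as 1/S - 1. The name eskin_sketch and the toy data are hypothetical.

```r
# Illustrative sketch only; eskin_sketch is a hypothetical helper, not part of nomclust.
eskin_sketch <- function(data) {
  data <- as.data.frame(data, stringsAsFactors = FALSE)
  n <- nrow(data)
  m <- ncol(data)

  # Number of categories observed in each variable (assumption: observed, not theoretical).
  n_cat <- vapply(data, function(col) length(unique(col)), integer(1))

  # Similarity assigned to a mismatch on each variable: n_k^2 / (n_k^2 + 2).
  mismatch_sim <- n_cat^2 / (n_cat^2 + 2)

  # Accumulate per-variable similarities for every pair of rows.
  sim <- matrix(0, n, n)
  for (k in seq_len(m)) {
    x <- as.character(data[[k]])
    agree <- outer(x, x, "==")
    sim <- sim + ifelse(agree, 1, mismatch_sim[k])
  }
  sim <- sim / m

  # Assumed similarity-to-dissimilarity transform: D = 1/S - 1.
  1 / sim - 1
}

# Toy data: variable a has 2 categories, variable b has 3.
toy <- data.frame(a = c("x", "x", "y"), b = c("p", "q", "r"))
d <- eskin_sketch(toy)
# Rows 1 and 2 differ only on b: S = (1 + 9/11) / 2 = 10/11, so D is 0.1 (up to rounding).
# Rows 1 and 3 differ on both: S = (2/3 + 9/11) / 2 = 49/66, so D is 17/49.
```

A mismatch on a variable with more categories receives a similarity closer to 1 (9/11 for b versus 2/3 for a). A disagreement on a many-valued variable therefore adds less to the dissimilarity than one on a binary variable, which is the opposite of what simple matching would do.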

References

Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation. In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, pp. 243-254.

Eskin E., Arnold A., Prerau M., Portnoy L., Stolfo S. (2002). A geometric framework for unsupervised anomaly detection. In: D. Barbara and S. Jajodia (Eds.): Applications of Data Mining in Computer Security, pp. 78-100. Norwell: Kluwer Academic Publishers.

See Also

good1, good2, good3, good4, iof, lin, lin1, morlini, of, sm, ve, vm.

Examples

# NOT RUN {
# load the package and the sample data
library(nomclust)
data(data20)

# dissimilarity matrix calculation
prox.eskin <- eskin(data20)
# }
