nomclust (version 2.1.6)

sm: Simple Matching Coefficient (SM)

Description

Computes a proximity (dissimilarity) matrix based on the simple matching (SM) similarity measure.

Usage

sm(data)

Arguments

data

A data.frame or a matrix with cases in rows and variables in columns.

Value

The function returns a dissimilarity matrix of size n x n, where n is the number of objects (rows) in the dataset passed as the data argument.

Details

The simple matching coefficient (Sokal and Michener, 1958) is the simplest way of measuring similarity between two objects described by categorical variables. It does not impose any weights: for a given variable, it assigns the value 1 if the two objects match and the value 0 otherwise.
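
The description above can be written out as a formula. The notation is an assumption made for illustration and is not taken from this page: m is the number of variables, and x_ic is the category of object i on variable c. The last line assumes that the dissimilarity is obtained as one minus the similarity, which is the usual convention for this measure but is not stated explicitly here.

```latex
% Per-variable similarity: 1 for a match, 0 otherwise
S_c(x_{ic}, x_{jc}) =
\begin{cases}
  1 & \text{if } x_{ic} = x_{jc}, \\
  0 & \text{otherwise,}
\end{cases}

% Unweighted average over the m variables
S(\mathbf{x}_i, \mathbf{x}_j) = \frac{1}{m} \sum_{c=1}^{m} S_c(x_{ic}, x_{jc}),

% Assumed conversion to a dissimilarity
D(\mathbf{x}_i, \mathbf{x}_j) = 1 - S(\mathbf{x}_i, \mathbf{x}_j).
```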

References

Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation. In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, p. 243-254.

Sokal R., Michener C. (1958). A statistical method for evaluating systematic relationships. In: Science bulletin, 38(22), The University of Kansas.

See Also

eskin, good1, good2, good3, good4, iof, lin, lin1, morlini, of, ve, vm.

Examples

# sample data
data(data20)

# dissimilarity matrix calculation
prox.sm <- sm(data20)
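
To make the calculation concrete, the following base-R sketch computes a simple matching dissimilarity by hand on a tiny made-up dataset. The helper sm_sketch and the data frame toy are hypothetical and are not part of nomclust. The sketch assumes the dissimilarity is one minus the share of matching variables, so its output may not agree with sm() in every detail.

```r
# Hypothetical helper, not part of nomclust: simple matching dissimilarity
# computed as 1 minus the proportion of variables on which two cases match.
sm_sketch <- function(data) {
  x <- as.matrix(data)                 # cases in rows, variables in columns
  n <- nrow(x)
  d <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # mean() of a logical vector gives the share of matching variables
      d[i, j] <- 1 - mean(x[i, ] == x[j, ])
    }
  }
  d
}

# Made-up toy data: three cases described by two categorical variables
toy <- data.frame(
  a = c("x", "x", "y"),
  b = c("u", "v", "v"),
  stringsAsFactors = TRUE
)

d <- sm_sketch(toy)
print(d)
```

Cases 1 and 2 agree on one of two variables, so their dissimilarity in this sketch is 0.5. Cases 1 and 3 agree on none, giving 1. The double loop is chosen for readability rather than speed.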
