arules (version 1.0-12)

dissimilarity: Dissimilarity Computation

Description

Provides the generic function dissimilarity and the S4 methods to compute and return distances for binary data in a matrix, transactions or associations.

Usage

dissimilarity(x, y = NULL, method = NULL, args = NULL, ...)
## S4 method for signature 'itemMatrix':
dissimilarity(x, y = NULL, method = NULL, args = NULL,
	which = "transactions")
## S4 method for signature 'associations':
dissimilarity(x, y = NULL, method = NULL, args = NULL,
	which = "transactions")
## S4 method for signature 'matrix':
dissimilarity(x, y = NULL, method = NULL, args = NULL)

Arguments

x
the set of elements (e.g., matrix, itemMatrix, transactions, itemsets, rules).
y
NULL or a second set to calculate cross dissimilarities.
method
the distance measure to be used, given as a character string (defaults to "jaccard"). The examples below use "jaccard" and "affinity"; for associations, additional measures are available, such as "gupta".
args
a list of additional arguments for the methods.
which
a character string indicating if the dissimilarity should be calculated between transactions (default) or items (use "items").
...
further arguments.
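
To make the default measure concrete, the following base R sketch computes Jaccard dissimilarities for a small binary matrix by hand and compares them with stats::dist(method = "binary"), which uses the same definition. It does not call arules; the toy matrix m and the helper jaccard_dissim are made up for illustration.

```r
# Toy binary incidence matrix: rows are transactions, columns are items.
# (Hypothetical data, not taken from the arules package.)
m <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("t", 1:3), c("a", "b", "c", "d")))

# Jaccard dissimilarity: 1 - |A and B| / |A or B|
# (illustrative helper, not part of any package)
jaccard_dissim <- function(a, b) {
  both   <- sum(a == 1 & b == 1)
  either <- sum(a == 1 | b == 1)
  1 - both / either
}

# Pairs in the order a dist object stores them: (t1,t2), (t1,t3), (t2,t3)
d_manual <- c(jaccard_dissim(m["t1", ], m["t2", ]),
              jaccard_dissim(m["t1", ], m["t3", ]),
              jaccard_dissim(m["t2", ], m["t3", ]))

# Base R's "binary" distance between rows uses the same definition.
d_base <- dist(m, method = "binary")

# Transposing gives the column-wise (item-wise) analogue.
d_items <- dist(t(m), method = "binary")
```

Here d_base is computed between rows, which corresponds to which = "transactions"; d_items is the base R analogue of which = "items".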

Value

  • returns an object of class dist.
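
Because the result is a dist object, the usual base R tools for that class apply. The sketch below builds a dist object with stats::dist on made-up binary data (standing in for the output of dissimilarity) and then converts, clusters and cuts it; the matrix m and the choice k = 2 are assumptions for illustration only.

```r
# Hypothetical binary data standing in for a set of transactions.
m <- matrix(c(1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1,
              0, 1, 1, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("t", 1:4), letters[1:4]))

d <- dist(m, method = "binary")  # object of class "dist"

dm <- as.matrix(d)               # full symmetric matrix with zero diagonal
hc <- hclust(d)                  # hierarchical clustering (complete linkage by default)
groups <- cutree(hc, k = 2)      # named vector of cluster labels

print(round(dm, 2))
print(groups)
```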

References

Sneath, P. H. A. (1957) Some thoughts on bacterial classification. Journal of General Microbiology 17, pages 184--200.

Sokal, R. R. and Michener, C. D. (1958) A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin 38, pages 1409--1438.

Dice, L. R. (1945) Measures of the amount of ecologic association between species. Ecology 26, pages 297--302.

Aggarwal, C. C., Procopiuc, C. and Yu, P. S. (2002) Finding localized associations in market basket data. IEEE Trans. on Knowledge and Data Engineering 14(1), pages 51--62.

Toivonen, H., Klemettinen, M., Ronkainen, P., Hatonen, K. and Mannila H. (1995) Pruning and grouping discovered association rules. In Proceedings of KDD'95.

Gupta, G., Strehl, A., and Ghosh, J. (1999) Distance based clustering of association rules. In Intelligent Engineering Systems Through Artificial Neural Networks (Proceedings of ANNIE 1999), pages 759-764. ASME Press.

See Also

affinity, dist-class, itemMatrix-class, associations-class.

Examples

## cluster items in Groceries with support > 5%
library(arules)
data("Groceries")

s <- Groceries[, itemFrequency(Groceries) > 0.05]
d_jaccard <- dissimilarity(s, which = "items")
## "ward.D" is the current name of the method formerly called "ward"
plot(hclust(d_jaccard, method = "ward.D"))



## cluster transactions for a sample of Adult
data("Adult")
s <- sample(Adult, 200) 

##  calculate Jaccard distances and do hclust
d_jaccard <- dissimilarity(s)
plot(hclust(d_jaccard))

## calculate affinity-based distances and do hclust
d_affinity <- dissimilarity(s, method = "affinity")
plot(hclust(d_affinity))


## cluster rules
rules <- apriori(Adult, parameter=list(support=0.3))
rules <- subset(rules, subset = lift > 2)

## use affinity
## we need to supply the item affinities from the dataset (sample)
d_affinity <- dissimilarity(rules, method = "affinity", 
  args = list(affinity = affinity(s)))
plot(hclust(d_affinity))

## use gupta
d_gupta <- dissimilarity(rules, method = "gupta", args=list(trans=Adult))
plot(hclust(d_gupta))
