
arules (version 1.7-6)

predict: Model Predictions

Description

Provides the method predict() for itemMatrix (e.g., transactions). It predicts the cluster membership of new data by assigning each new object the label of its nearest neighbor among the medoids or labeled examples that represent the clusters.
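The nearest-neighbor idea can be illustrated with a small hand-rolled sketch. This is not the package's implementation: it computes Jaccard dissimilarities directly from the binary incidence matrices and picks the closest example for each new transaction. The variable names (examples, new_tr, nearest) and the sample sizes are assumptions made for illustration, and ties are broken by which.min(), which may differ from what predict() does.

```r
library(arules)
data("Adult")

set.seed(42)
examples <- sample(Adult, 100)  # stand-in for medoids or labeled examples
new_tr   <- sample(Adult, 10)   # new transactions to assign

## binary incidence matrices (rows = transactions, columns = items)
X <- as(examples, "matrix") * 1
N <- as(new_tr, "matrix") * 1

## Jaccard dissimilarity = 1 - |A and B| / |A or B|
inter <- N %*% t(X)
union <- outer(rowSums(N), rowSums(X), "+") - inter
jaccard <- 1 - inter / union

## index of the nearest example for each new transaction
nearest <- apply(jaccard, 1, which.min)
nearest
```

With labels = NULL the index of the nearest object serves as the label; with a label vector, the label of that nearest example would be returned instead.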

Usage

predict(object, ...)

# S4 method for itemMatrix
predict(object, newdata, labels = NULL, blocksize = 200, ...)

Value

An integer vector of the same length as newdata containing the predicted labels for each element.

Arguments

object

clustered examples as an itemMatrix, with their cluster labels given in labels, or cluster medoids as an itemMatrix (use labels = NULL).

...

further arguments passed on to dissimilarity(). E.g., method.

newdata

an itemMatrix containing the objects to predict labels for.

labels

an integer vector containing the labels for the examples in object. The cluster labels need to be contiguous integers starting with 1.

blocksize

a numeric scalar indicating how much memory predict() can use when object and/or newdata are large (approx. in MB). The value is only a crude approximation that assumes a 32-bit machine (on 64-bit architectures the actual memory use is roughly double the blocksize) and the default Jaccard method for the dissimilarity calculation. In general, reducing blocksize decreases memory usage but increases run time.
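As a rough illustration of the last two arguments, the sketch below lowers blocksize and forwards a method argument to dissimilarity() through "...". The choice of 50 and of the "dice" measure is arbitrary and only for demonstration. Note that the dissimilarity passed to predict() should normally match the one used for clustering; here the clustering itself is also done with "dice" for that reason.

```r
library(arules)
data("Adult")

set.seed(1234)
small <- sample(Adult, 500)
large <- sample(Adult, 5000)

## cluster the small sample using the Dice dissimilarity
d_dice <- dissimilarity(small, method = "dice")
l <- cutree(hclust(d_dice), k = 4)

## predict with a smaller memory budget and the same dissimilarity measure;
## method is passed on to dissimilarity() via ...
labels_dice <- predict(small, large, l, blocksize = 50, method = "dice")
table(labels_dice)
```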

Author

Michael Hahsler

See Also

Other proximity classes and functions: affinity(), dissimilarity(), proximity-classes

Examples

library("arules")
data("Adult")

## sample
small <- sample(Adult, 500)
large <- sample(Adult, 5000)

## cluster a small sample and extract the cluster label vector
d_jaccard <- dissimilarity(small)
hc <- hclust(d_jaccard)
l <- cutree(hc, k = 4)

## predict labels for a larger sample
labels <- predict(small, large, l)

## plot the profile of the first cluster
itemFrequencyPlot(large[labels == 1, itemFrequency(large) > 0.1])
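The example above uses labeled examples. The medoid variant (labels = NULL) might look like the following sketch. It assumes the cluster package and its pam() function to obtain medoids, which is one possible choice and not something this page prescribes; the names pm, medoids, and labels_medoid are illustrative.

```r
library(arules)
library(cluster)
data("Adult")

set.seed(1234)
small <- sample(Adult, 500)
large <- sample(Adult, 5000)

## partition the small sample around 4 medoids (assumption: k = 4)
d_jaccard <- dissimilarity(small)
pm <- pam(d_jaccard, k = 4)

## the medoids are themselves transactions from the small sample
medoids <- small[pm$id.med]

## with labels = NULL, each new transaction gets the index of its
## nearest medoid as its label
labels_medoid <- predict(medoids, large)
table(labels_medoid)
```

Using medoids keeps the reference set small (one transaction per cluster), so far fewer dissimilarities need to be computed than with all labeled examples.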
