stream (version 2.0-1)

DSC: Data Stream Clustering Base Class

Description

Abstract base class for Data Stream Clustering (DSC). Concrete implementations are functions whose names start with DSC_ (in RStudio, use auto-completion with Tab to select one).

Usage

DSC(...)

get_centers(x, type = c("auto", "micro", "macro"), ...)

get_weights(x, type = c("auto", "micro", "macro"), scale = NULL, ...)

get_copy(x)

nclusters(x, type = c("auto", "micro", "macro"), ...)

get_microclusters(x, ...)

get_microweights(x, ...)

get_macroclusters(x, ...)

get_macroweights(x, ...)

Arguments

...

further parameters.

x

a DSC object.

type

Whether to use the micro- or macro-clusters in x. "auto" uses the class of x to decide.

scale

a range (from, to) to which the weights are scaled. By default (NULL), the raw weights are returned.
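As a minimal sketch of the scale argument, the following compares raw and rescaled weights. It reuses the D-Stream settings from the Examples section; the target range c(1, 5) is an arbitrary choice for illustration.

```r
library(stream)

set.seed(1000)
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)
dstream <- DSC_DStream(gridsize = 0.1, gaptime = 100)
update(dstream, stream, n = 500)

# raw weights (scale = NULL is the default)
raw <- get_weights(dstream)

# weights rescaled to the range (1, 5); the range is an illustrative assumption
scaled <- get_weights(dstream, scale = c(1, 5))
range(scaled)
```

Scaled weights can be convenient when the weights are used for display purposes, such as point sizes in a plot.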

Functions

  • get_centers(): Gets the cluster centers (micro- or macro-clusters) from a DSC object.

  • get_weights(): Gets the weights of the clusters in the DSC object (returns 1s if not implemented by the clusterer).

  • get_copy(): Creates a deep copy of a DSC object that contains reference classes (e.g., Java data structures for MOA).

  • nclusters(): Returns the number of clusters in the DSC object (micro- or macro-clusters, as selected by type).

  • get_microclusters(): Used as internal interface.

  • get_microweights(): Used as internal interface.

  • get_macroclusters(): Used as internal interface.

  • get_macroweights(): Used as internal interface.

Author

Michael Hahsler

Details

The DSC class cannot be instantiated (calling DSC() produces only a message listing the available implementations), but it serves as a base class from which other DSC classes inherit.

Data stream clustering typically has an

  • online clustering component (see DSC_Micro), and an

  • offline reclustering component (see DSC_Macro).
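The two components can be sketched as follows: an online micro-clusterer is updated from the stream, and its micro-clusters are then reclustered offline. The choice of DSC_DStream as the online component, DSC_Kmeans as the offline component, and k = 3 are assumptions made for illustration; other DSC_Micro and DSC_Macro implementations could be substituted.

```r
library(stream)

set.seed(1000)
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)

# online component: micro-clusters are maintained incrementally
micro <- DSC_DStream(gridsize = 0.1, gaptime = 100)
update(micro, stream, n = 500)

# offline component: recluster the micro-clusters into macro-clusters
macro <- DSC_Kmeans(k = 3)
recluster(macro, micro)

# number of micro-clusters vs. number of macro-clusters
nclusters(micro)
nclusters(macro)
```

DSC_TwoStage(), listed under See Also, combines both components in a single object.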

Class DSC provides several generic functions that can operate on all DSC subclasses. See Usage and Functions sections for methods. Additional, separately documented methods are:

  • update() adds new data points from a stream to a clustering.

  • predict() predicts the cluster assignment for new data points.

  • plot() plots cluster centers (see plot.DSC()).

get_centers() and get_weights() are typically overridden by subclasses of DSC.

Since DSC objects often contain external pointers, regular save and load operations will fail. Use saveDSC() and readDSC() instead, which serialize the objects appropriately first.
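A minimal sketch of a save and restore round trip is shown below. The use of DSC_DBSTREAM with r = 0.1 and a temporary file are assumptions made for illustration.

```r
library(stream)

set.seed(1000)
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)
dbstream <- DSC_DBSTREAM(r = 0.1)
update(dbstream, stream, n = 500)

# write the clustering to a temporary file (file name is illustrative)
file <- tempfile(fileext = ".rds")
saveDSC(dbstream, file = file)

# read it back and compare the number of clusters
restored <- readDSC(file)
nclusters(dbstream)
nclusters(restored)

unlink(file)
```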

See Also

Other DST: DSAggregate(), DSClassifier(), DSOutlier(), DSRegressor(), DST_SlidingWindow(), DST_WriteStream(), DST(), evaluate, predict(), stream_pipeline, update()

Other DSC: DSC_Macro(), DSC_Micro(), DSC_R(), DSC_SlidingWindow(), DSC_Static(), DSC_TwoStage(), animate_cluster(), evaluate.DSC, get_assignment(), plot.DSC(), predict(), prune_clusters(), read_saveDSC, recluster()

Examples

library(stream)

# DSC() cannot be instantiated; it only prints a message listing the available implementations
DSC()

set.seed(1000)
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)
dstream <- DSC_DStream(gridsize = .1, gaptime = 100)
update(dstream, stream, 500)
dstream

# get micro-cluster centers
get_centers(dstream)

# get the micro-cluster weights
get_weights(dstream)

# get the number of clusters
nclusters(dstream)

# get the whole model as a data.frame
get_model(dstream)

# D-Stream also has macro-clusters
get_weights(dstream, type = "macro")
get_centers(dstream, type = "macro")

# plot the clustering result
plot(dstream, stream)
plot(dstream, stream, type = "both")

# predict macro clusters for new points (see predict())
points <- get_points(stream, n = 5)
points

predict(dstream, points, type = "macro")
