dbscan (version 1.1-5)

kNNdist: Calculate and plot the k-Nearest Neighbor Distance

Description

Fast calculation of the k-nearest neighbor distances for a matrix of points. kNNdistplot plots these distances in sorted order, which helps in choosing a suitable eps neighborhood radius for DBSCAN: look for the knee in the curve and read eps off the distance at that point.
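The idea behind the plot can be sketched with base R alone. The following is a minimal brute-force illustration, not the package's implementation (dbscan relies on the search strategies documented under kNN); the helper name knn_dist_brute is hypothetical and introduced here only for illustration.

```r
# Brute-force k-th nearest-neighbor distance using base R only.
# knn_dist_brute is a hypothetical helper, not part of dbscan.
knn_dist_brute <- function(x, k) {
  d <- as.matrix(dist(x))  # full pairwise Euclidean distance matrix
  # After sorting a row, position 1 is the point itself (distance 0),
  # so the k-th nearest neighbor sits at position k + 1.
  apply(d, 1, function(row) sort(row)[k + 1])
}

x <- as.matrix(iris[, 1:4])
d4 <- sort(knn_dist_brute(x, k = 4))

# Sorted 4-NN distances: the knee of this curve marks a candidate eps.
plot(d4, type = "l",
     xlab = "Points sorted by distance", ylab = "4-NN distance")
abline(h = 0.5, lty = 2)  # 0.5 is the eps value used in the Examples
```

The brute-force version needs the full n x n distance matrix, so it only suits small data sets; it is meant to show what the curve represents, not to replace kNNdist.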

Usage

kNNdist(x, k, all = FALSE, ...)
kNNdistplot(x, k = 4, ...)

Arguments

x

the data set as a matrix or a dist object.

k

number of nearest neighbors used (use the minPts value intended for DBSCAN).

all

logical; should a matrix with the distances to all k nearest neighbors be returned?

...

further arguments are passed on to kNN.

Value

kNNdist returns a numeric vector containing, for each point, the distance to its k-th nearest neighbor. If all = TRUE, a matrix with k columns is returned instead, containing the distances to the 1st, 2nd, ..., k-th nearest neighbors.
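The two return shapes can be compared directly. This short sketch assumes the dbscan package is installed and, following the description above, that the default vector corresponds to the k-th column of the all = TRUE matrix.

```r
library(dbscan)  # assumes the dbscan package is installed

x <- as.matrix(iris[, 1:4])

d_vec <- kNNdist(x, k = 4)              # one distance per observation
d_mat <- kNNdist(x, k = 4, all = TRUE)  # one row per observation, 4 columns

dim(d_mat)

# Compare the vector with the 4th column, ignoring names
all.equal(unname(d_vec), unname(d_mat[, 4]))
```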

Details

See kNN for a discussion of the kd-tree related parameters.

See Also

kNN.

Examples

# NOT RUN {
library(dbscan)
data(iris)
iris <- as.matrix(iris[,1:4])

## Find the 4-NN distance for each observation (see ?kNN
## for different search strategies)
kNNdist(iris, k=4)

## Get a matrix with distances to the 1st, 2nd, ..., 4th NN.
kNNdist(iris, k=4, all = TRUE)

## Produce a k-NN distance plot to determine a suitable eps for
## DBSCAN (the knee is around a distance of .5)
kNNdistplot(iris, k=4)

cl <- dbscan(iris, eps = .5, minPts = 4)
pairs(iris, col = cl$cluster+1L)
## Note: black points are noise (cluster 0 is drawn with color 1)
# }