KernelKnn (version 1.1.5)

knn.index.dist: indices and distances of k-nearest-neighbors

Description

This function returns the indices and distances of the k nearest neighbors of each observation

Usage

knn.index.dist(
  data,
  TEST_data = NULL,
  k = 5,
  method = "euclidean",
  transf_categ_cols = F,
  threads = 1,
  p = k
)

Value

a list of length 2. The first sublist holds the indices and the second the distances of the k nearest neighbors of each observation. If TEST_data is NULL, the number of rows of each sublist equals the number of rows in the train data ('data'). If TEST_data is not NULL, the number of rows of each sublist equals the number of rows in TEST_data.

Arguments

data

a data.frame or matrix

TEST_data

a data.frame or matrix (it can also be NULL)

k

an integer specifying the number of nearest neighbors

method

a string specifying the distance method. Valid methods are 'euclidean', 'manhattan', 'chebyshev', 'canberra', 'braycurtis', 'pearson_correlation', 'simple_matching_coefficient', 'minkowski' (by default the order 'p' of the minkowski distance equals k), 'hamming', 'mahalanobis', 'jaccard_coefficient', 'Rao_coefficient'

transf_categ_cols

a boolean (TRUE, FALSE) specifying if the categorical columns should be converted to numeric or to dummy variables

threads

the number of cores to use in parallel (OpenMP will be employed)

p

a numeric value specifying the order of the 'minkowski' distance. It is used only if 'method' is set to 'minkowski'. This parameter defaults to 'k'
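
As a rough illustration of what the order 'p' controls, the following base-R sketch computes a Minkowski distance between two vectors. The helper name `minkowski_dist` is hypothetical and is not part of KernelKnn; it only shows the standard formula, not the package's compiled implementation. Because 'p' defaults to 'k', a call with method = 'minkowski' and the default k = 5 would use order 5 unless 'p' is set explicitly.

```r
# Minkowski distance of order p between two numeric vectors.
# `minkowski_dist` is a hypothetical helper, not a KernelKnn function.
minkowski_dist <- function(x, y, p) {
  stopifnot(length(x) == length(y), p >= 1)
  sum(abs(x - y)^p)^(1 / p)
}

a <- c(0, 0)
b <- c(3, 4)

minkowski_dist(a, b, p = 1)  # 7, identical to the Manhattan distance
minkowski_dist(a, b, p = 2)  # 5, identical to the Euclidean distance
```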

Author

Lampros Mouselimis

Details

This function returns the indices and distances of the k-nearest-neighbors for each observation. If TEST_data is NULL, the indices and distances are returned for the train data; if TEST_data is not NULL, they are returned for the TEST_data.
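
The shape of the result can be made concrete with a small brute-force sketch in base R. This is not the package's implementation (which is compiled and can use OpenMP); `knn_sketch` is a hypothetical helper that supports only the Euclidean distance. It also assumes that, when no test data is supplied, an observation is not counted as its own neighbor; whether knn.index.dist behaves the same way is not stated above.

```r
# Brute-force k-nearest-neighbors with Euclidean distance, base R only.
# `knn_sketch` is a hypothetical helper, not the KernelKnn implementation.
knn_sketch <- function(data, TEST_data = NULL, k = 5) {
  train <- as.matrix(data)
  query <- if (is.null(TEST_data)) train else as.matrix(TEST_data)
  train_t <- t(train)  # columns are train observations

  per_row <- lapply(seq_len(nrow(query)), function(i) {
    # Distance from query row i to every train row
    d <- unname(sqrt(colSums((train_t - query[i, ])^2)))
    # Assumption: without test data, a row is not its own neighbor
    if (is.null(TEST_data)) d[i] <- Inf
    nn <- order(d)[seq_len(k)]
    list(idx = nn, dist = d[nn])
  })

  # First element: indices, second element: distances (one row per query row)
  list(
    do.call(rbind, lapply(per_row, `[[`, "idx")),
    do.call(rbind, lapply(per_row, `[[`, "dist"))
  )
}

train <- matrix(c(0, 0,
                  1, 0,
                  0, 2,
                  5, 5), ncol = 2, byrow = TRUE)
test <- matrix(c(0.9, 0.1), ncol = 2)

res_train <- knn_sketch(train, k = 2)
res_test  <- knn_sketch(train, TEST_data = test, k = 2)

dim(res_train[[1]])  # one row per train observation
dim(res_test[[1]])   # one row per TEST observation
```

In both calls the indices refer to rows of the train data; only the number of rows in the result changes with TEST_data.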

Examples

library(KernelKnn)

data(Boston)

X = Boston[, -ncol(Boston)]

out = knn.index.dist(X, TEST_data = NULL, k = 4, method = 'euclidean', threads = 1)
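
A variant with a separate test set may also be useful. The sketch below assumes that KernelKnn is installed and that the Boston data shipped with it has more than 400 rows; the 400-row split point is arbitrary and chosen only for illustration. The result is accessed by position, since the names of the two sublists are not documented above.

```r
# Guarded so the snippet is a no-op when KernelKnn is not installed
if (requireNamespace("KernelKnn", quietly = TRUE)) {
  data(Boston, package = "KernelKnn")

  X <- Boston[, -ncol(Boston)]

  # Arbitrary split for illustration: first 400 rows as train data
  X_train <- X[1:400, ]
  X_test  <- X[401:nrow(X), ]

  out_test <- KernelKnn::knn.index.dist(X_train, TEST_data = X_test, k = 4,
                                        method = 'euclidean', threads = 1)

  length(out_test)        # 2: indices first, distances second
  nrow(out_test[[1]])     # equals nrow(X_test)
  nrow(out_test[[2]])     # equals nrow(X_test)
}
```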