This function returns the indices and distances of the k nearest neighbors of each observation
knn.index.dist(
  data,
  TEST_data = NULL,
  k = 5,
  method = "euclidean",
  transf_categ_cols = F,
  threads = 1,
  p = k
)
A list of length 2. The first element holds the indices and the second the distances of the k nearest neighbors of each observation. If TEST_data is NULL, each element has as many rows as the train data. If TEST_data is not NULL, each element has as many rows as the TEST data.
data: a data.frame or matrix
TEST_data: a data.frame or matrix (it can also be NULL)
k: an integer specifying the number of nearest neighbors
method: a string specifying the distance method. Valid methods are 'euclidean', 'manhattan', 'chebyshev', 'canberra', 'braycurtis', 'pearson_correlation', 'simple_matching_coefficient', 'minkowski' (by default the order 'p' of the minkowski parameter equals k), 'hamming', 'mahalanobis', 'jaccard_coefficient', 'Rao_coefficient'
transf_categ_cols: a boolean (TRUE, FALSE) specifying whether the categorical columns should be converted to numeric or to dummy variables
threads: the number of cores to be used in parallel (openmp will be employed)
p: a numeric value specifying the 'minkowski' order. It is used only if 'method' is set to 'minkowski' and it defaults to 'k'
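To make the role of the 'minkowski' order concrete, here is a minimal base R sketch of the standard Minkowski distance between two numeric vectors. The helper name minkowski_dist is hypothetical and is not part of the package; the package's internal implementation may differ.

```r
# Hypothetical helper (not a package function): Minkowski distance of order p.
minkowski_dist <- function(x, y, p) {
  sum(abs(x - y)^p)^(1 / p)
}

x <- c(0, 0)
y <- c(3, 4)

d1 <- minkowski_dist(x, y, p = 1)  # order 1 is the Manhattan distance
d2 <- minkowski_dist(x, y, p = 2)  # order 2 is the Euclidean distance
```

Since 'p' defaults to 'k', calling the function with method = 'minkowski' and k = 5 without setting 'p' would use order 5, so it is probably worth passing 'p' explicitly.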
Lampros Mouselimis
This function returns the indices and distances of the k-nearest-neighbors for each observation. If TEST_data is NULL, the neighbors of each train observation are searched within the train data and the indices-distances for the train data are returned. If TEST_data is not NULL, the indices-distances for the TEST_data are returned.
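The shape of the returned indices and distances can be illustrated with a brute-force search in base R. This is only a sketch under stated assumptions: knn_sketch is a hypothetical helper, it supports the euclidean distance only, and it assumes that an observation is not counted as its own neighbor when no test data is given. It is not the package's implementation.

```r
# Hypothetical brute-force helper (not a package function).
knn_sketch <- function(train, test = NULL, k = 5) {
  train <- as.matrix(train)
  query <- if (is.null(test)) train else as.matrix(test)
  idx <- matrix(0L, nrow(query), k)
  dst <- matrix(0, nrow(query), k)
  for (i in seq_len(nrow(query))) {
    # Euclidean distance from query row i to every train row
    d <- sqrt(colSums((t(train) - query[i, ])^2))
    # Assumption: when searching the train data itself, skip the row's own zero distance
    if (is.null(test)) d[i] <- Inf
    nearest <- order(d)[seq_len(k)]
    idx[i, ] <- nearest
    dst[i, ] <- d[nearest]
  }
  list(idx = idx, dist = dst)
}

train <- matrix(c(0, 0,
                  1, 0,
                  0, 2,
                  5, 5), ncol = 2, byrow = TRUE)

res_train <- knn_sketch(train, k = 2)                               # one row per train observation
res_test <- knn_sketch(train, test = matrix(c(0, 1), nrow = 1), k = 2)  # one row per test observation
```

In both cases each result matrix has k columns, and row i describes the neighbors of observation i, all of which are rows of the train data.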
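library(KernelKnn)
data(Boston)
# keep all columns except the last one
X = Boston[, -ncol(Boston)]
# indices and distances of the 4 nearest neighbors of every row of X
out = knn.index.dist(X, TEST_data = NULL, k = 4, method = 'euclidean', threads = 1)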
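A usage sketch for the case where TEST_data is not NULL follows. It assumes that the function and the Boston data come from the KernelKnn package, and the 400-row split is an arbitrary choice made only for illustration. The two elements of the result are accessed by position, since the element names are not stated here.

```r
library(KernelKnn)
data(Boston)
X <- Boston[, -ncol(Boston)]

# Arbitrary illustrative split: first 400 rows as train data, the rest as TEST data
n_train <- 400
X_train <- X[1:n_train, ]
X_test <- X[(n_train + 1):nrow(X), ]

out_test <- knn.index.dist(X_train, TEST_data = X_test, k = 4,
                           method = 'euclidean', threads = 1)

idx_test <- out_test[[1]]   # indices of the nearest train rows for each TEST row
dist_test <- out_test[[2]]  # the corresponding distances
```

Both elements should have as many rows as X_test, as described for the returned value.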