KernelKnn

The KernelKnn package extends the simple k-nearest neighbors algorithm by incorporating numerous kernel functions and a variety of distance metrics. The package takes advantage of 'RcppArmadillo' to speed up the calculation of distances between observations. More details on the functionality of KernelKnn can be found in the blog post and in the package vignettes (scroll down for information on how to use the Docker image).
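To make the idea of kernel-weighted nearest neighbors concrete, here is a minimal base-R sketch for regression on a single query point. It is not the package's implementation: the function name `kernel_knn_predict`, the choice of a Gaussian kernel, the Euclidean metric and the toy data are all assumptions made for illustration.

```r
# Kernel-weighted k-nearest-neighbors regression for one query point.
# Illustrative sketch only; hypothetical helper, not part of KernelKnn.
kernel_knn_predict <- function(train_x, train_y, query, k = 3, h = 1.0) {
  # Euclidean distance from the query to every training row
  diffs <- sweep(train_x, 2, query)   # subtract query[j] from column j
  dists <- sqrt(rowSums(diffs^2))

  # indices of the k closest training observations
  nn <- order(dists)[seq_len(k)]

  # Gaussian kernel with bandwidth h: closer neighbors get larger weights
  w <- exp(-0.5 * (dists[nn] / h)^2)
  w <- w / sum(w)                     # normalize the weights to sum to 1

  # prediction is the weighted average of the neighbors' responses
  sum(w * train_y[nn])
}

# toy data (made up for this example)
train_x <- matrix(c(0, 0,
                    1, 0,
                    0, 1,
                    5, 5), ncol = 2, byrow = TRUE)
train_y <- c(1, 2, 3, 10)

pred <- kernel_knn_predict(train_x, train_y, query = c(0, 0), k = 3, h = 1.0)
```

With a very large bandwidth `h` the weights become nearly equal and the result approaches the plain k-nearest-neighbors average; a small `h` pulls the prediction towards the closest observation.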

To install the package from CRAN use,


install.packages("KernelKnn")

and to download the latest version from GitHub use the install_github function of the devtools package,


devtools::install_github('mlampros/KernelKnn')
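Once the package is installed, a first call might look like the sketch below, which fits a kernel k-nearest-neighbors regression on the Boston data shipped with the package. The argument names (`TEST_data`, `k`, `h`, `method`, `weights_function`, `regression`) are taken from the package documentation as I understand it for version 1.1.5; the 75/25 split, the seed and the choice of the 'gaussian' kernel are assumptions for illustration, so check `?KernelKnn` for the exact interface.

```r
library(KernelKnn)

# Boston housing data bundled with the package; the response is the last column
data(Boston, package = "KernelKnn")

X <- scale(Boston[, -ncol(Boston)])   # standardize the predictors
y <- Boston[, ncol(Boston)]

# random train / test split (75% / 25%), seed chosen arbitrarily
set.seed(1)
train_idx <- sample(seq_along(y), round(0.75 * length(y)))
test_idx  <- setdiff(seq_along(y), train_idx)

# kernel k-nearest-neighbors regression with a gaussian weights kernel
preds <- KernelKnn(X[train_idx, ], TEST_data = X[test_idx, ], y[train_idx],
                   k = 5, h = 1.0, method = "euclidean",
                   weights_function = "gaussian", regression = TRUE)
```

Here `preds` should hold one predicted value per row of the test set.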

Use the following link to report bugs/issues,

https://github.com/mlampros/KernelKnn/issues

UPDATE 29-11-2019

Docker images of the KernelKnn package are available to download from my Docker Hub account. The images come with RStudio and the R-development version (latest) installed. The whole process was tested on Ubuntu 18.04. To pull and run the image do the following,


docker pull mlampros/kernelknn:rstudiodev

docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=give_here_your_password --rm -p 8787:8787 mlampros/kernelknn:rstudiodev

The user can also bind a home directory / folder to the container, to use its files, by specifying the -v flag,


docker run -d --name rstudio_dev -e USER=rstudio -e PASSWORD=give_here_your_password --rm -p 8787:8787 -v /home/YOUR_DIR:/home/rstudio/YOUR_DIR mlampros/kernelknn:rstudiodev

In the latter case you might first have to grant write access to the YOUR_DIR directory (this is not always necessary) using,


chmod -R 777 /home/YOUR_DIR

The USER defaults to rstudio, but you have to set a PASSWORD of your choice (see https://rocker-project.org for more information).

Open your web browser and, depending on where the Docker image was built / run, enter,

1st. Option on your personal computer,

http://0.0.0.0:8787 

2nd. Option on a cloud instance,

http://Public DNS:8787

to access the RStudio console, where you enter your username and password.

Citation:

If you use the KernelKnn R package in your paper or research please cite https://CRAN.R-project.org/package=KernelKnn/citation.html:

@Manual{,
  title = {{KernelKnn}: Kernel k Nearest Neighbors},
  author = {Lampros Mouselimis},
  year = {2021},
  note = {R package version 1.1.5},
  url = {https://CRAN.R-project.org/package=KernelKnn},
}

Monthly Downloads: 18,460

Version: 1.1.5

License: MIT + file LICENSE

Maintainer: Lampros Mouselimis

Last Published: January 6th, 2023

Functions in KernelKnn (1.1.5)

FUN_kernels

performs kernel smoothing using a bandwidth. Besides using a kernel there is also the option to combine kernels

func_categorical_preds

OPTION to convert categorical features TO either numeric [ if levels more than 32 ] OR to dummy variables [ if levels less than 32 ]

func_shuffle

shuffle data

distMat.KernelKnn

kernel k-nearest-neighbors using a distance matrix

func_tbl

this function returns a table of probabilities for each label

switch.ops

Arithmetic operations on lists

distMat.knn.index.dist

indices and distances of k-nearest-neighbors using a distance matrix

ionosphere

Johns Hopkins University Ionosphere database (binary classification)

knn.index.dist

indices and distances of k-nearest-neighbors

Boston

Boston Housing Data (Regression)

FUNCTION_weights

this function is used as a kernel-function-identifier [ takes the distances and a weights-kernel (in form of a function) and returns weights ]

func_tbl_dist

this function returns the probabilities in case of classification

KernelKnn

kernel k-nearest-neighbors

KernelKnnCV

kernel-k-nearest-neighbors using cross-validation

class_folds

stratified folds (in classification) [ detailed information about class_folds in the FeatureSelection package ]

regr_folds

create folds (in regression) [ detailed information about class_folds in the FeatureSelection package ]

normalized

this function normalizes the data