SuperLearner (version 2.0-29)

SL.kernelKnn: SL wrapper for KernelKNN

Description

Wrapper for a configurable implementation of k-nearest neighbors. Supports both binomial and gaussian outcome distributions.

Usage

SL.kernelKnn(Y, X, newX, family, k = 10, method = "euclidean",
  weights_function = NULL, extrema = F, h = 1, ...)

Value

List with the predictions, plus the original training data and hyperparameters.

Arguments

Y

Outcome variable

X

Training dataframe

newX

Test dataframe

family

Outcome distribution, either gaussian() or binomial()

k

Number of nearest neighbors to use

method

Distance method. Can be 'euclidean' (default), 'manhattan', 'chebyshev', 'canberra', 'braycurtis', 'pearson_correlation', 'simple_matching_coefficient', 'minkowski' (by default the Minkowski order parameter 'p' equals k), 'hamming', 'mahalanobis', 'jaccard_coefficient', 'Rao_coefficient'.

weights_function

Weighting method for combining the nearest neighbors. Can be 'uniform' (default), 'triangular', 'epanechnikov', 'biweight', 'triweight', 'tricube', 'gaussian', 'cosine', 'logistic', 'gaussianSimple', 'silverman', 'inverse', 'exponential'.

extrema

If TRUE, the minimum and maximum values among the k nearest neighbors are removed (this can be thought of as outlier removal).

h

The bandwidth, applicable only if weights_function is not NULL. Defaults to 1.0.

...

Any additional parameters, not currently passed through.
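
To make the method, weights_function, and h arguments more concrete, the following base R sketch shows one generic way a kernel-weighted k-nearest-neighbor prediction can be computed for a single query point. It is an illustration only: knn_predict_sketch() and its kernel argument are hypothetical names, only three of the distance methods are covered, and the exact formulas used inside KernelKnn may differ.

```r
# Illustrative sketch, not the KernelKnn implementation.
# knn_predict_sketch() is a hypothetical helper defined only for this example.
knn_predict_sketch <- function(X, Y, x_new, k = 3, h = 1,
                               method = "euclidean",
                               kernel = function(u) rep(1, length(u))) {
  # Column-wise differences between every training row and the query point.
  diffs <- sweep(as.matrix(X), 2, x_new)
  # Distance from the query point to each training row.
  d <- switch(method,
    euclidean = sqrt(rowSums(diffs^2)),
    manhattan = rowSums(abs(diffs)),
    chebyshev = apply(abs(diffs), 1, max),
    stop("method not covered by this sketch")
  )
  # Indices of the k closest training rows, nearest first.
  idx <- order(d)[seq_len(k)]
  # Weight each neighbor by a kernel of its bandwidth-scaled distance.
  w <- kernel(d[idx] / h)
  # Weighted average of the neighbors' outcomes.
  sum(w * Y[idx]) / sum(w)
}

# Made-up training data: four points in two dimensions.
X_train <- matrix(c(0, 0,
                    1, 0,
                    0, 2,
                    3, 3), ncol = 2, byrow = TRUE)
Y_train <- c(1, 2, 3, 10)

# Uniform weights: a plain average of the three nearest outcomes.
uniform_pred <- knn_predict_sketch(X_train, Y_train, c(0, 0), k = 3)

# Gaussian kernel: closer neighbors count more, so the prediction
# moves toward the outcome of the nearest point.
gauss_kernel <- function(u) exp(-0.5 * u^2)
gauss_pred <- knn_predict_sketch(X_train, Y_train, c(0, 0), k = 3,
                                 h = 1, kernel = gauss_kernel)
```

In this sketch a larger h flattens the kernel, so the weighted prediction approaches the uniform average, while a smaller h concentrates the weight on the closest neighbors.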

Examples

library(SuperLearner)

# Load a test dataset.
data(PimaIndiansDiabetes2, package = "mlbench")

data = PimaIndiansDiabetes2

# Omit observations with missing data.
data = na.omit(data)

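# as.numeric() on a factor returns its integer codes, so Y_bin is coded 1/2.
Y_bin = as.numeric(data$diabetes)
# Use every remaining column as a predictor.
X = subset(data, select = -diabetes)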

set.seed(1)

sl = SuperLearner(Y_bin, X, family = binomial(),
                 SL.library = c("SL.mean", "SL.kernelKnn"))
sl
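
Because the hyperparameters are ordinary function arguments, a common SuperLearner idiom for trying other settings is to define small wrapper functions that forward everything and override one argument. The sketch below assumes the SuperLearner and KernelKnn packages are installed; the wrapper names SL.kernelKnn.k5 and SL.kernelKnn.manhattan and the simulated data are made up for illustration.

```r
library(SuperLearner)

# Hypothetical wrapper names: each forwards its arguments to SL.kernelKnn
# and fixes a single hyperparameter.
SL.kernelKnn.k5 <- function(...) SL.kernelKnn(..., k = 5)
SL.kernelKnn.manhattan <- function(...) SL.kernelKnn(..., method = "manhattan")

# Simulated continuous outcome (made-up data, for illustration only).
set.seed(1)
n <- 200
X_sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y_sim <- X_sim$x1 - X_sim$x2 + rnorm(n)

# Each wrapper enters the library as its own candidate learner.
sl_tuned <- SuperLearner(Y_sim, X_sim, family = gaussian(),
                         SL.library = c("SL.mean", "SL.kernelKnn.k5",
                                        "SL.kernelKnn.manhattan"))
sl_tuned
```

Printing the fit lists a cross-validated risk and an ensemble weight for each candidate, which gives a rough way to compare the settings.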
