The Kernel Maximum Mean Discrepancy kmmd performs a non-parametric distribution test.
# S4 method for matrix
kmmd(x, y, kernel = "rbfdot", kpar = "automatic", alpha = 0.05,
     asymptotic = FALSE, replace = TRUE, ntimes = 150, frac = 1, ...)

# S4 method for kernelMatrix
kmmd(x, y, Kxy, alpha = 0.05,
     asymptotic = FALSE, replace = TRUE, ntimes = 100, frac = 1, ...)

# S4 method for list
kmmd(x, y, kernel = "stringdot",
     kpar = list(type = "spectrum", length = 4), alpha = 0.05,
     asymptotic = FALSE, replace = TRUE, ntimes = 150, frac = 1, ...)
An S4 object of class kmmd recording whether the null hypothesis H0 is rejected, where H0 states that the samples \(x\) and \(y\) come from the same distribution.
The object contains the following slots:
H0
is H0 rejected (logical)
AsympH0
is H0 rejected according to the asymptotic bound (logical)
kernelf
the kernel function used.
mmdstats
the test statistics (vector of two)
Radbound
the Rademacher bound
Asymbound
the asymptotic bound
See kmmd-class for more details.
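The slots listed above can be read directly from the returned object. The following is a minimal sketch, assuming kernlab is installed and that the slot names are exactly those listed; `set.seed` and the toy data are illustrative choices, not part of the original page.

```r
library(kernlab)

set.seed(1)                        # illustrative only, for reproducibility
x <- matrix(runif(300), 100)       # 100 points in 3 dimensions
y <- matrix(runif(300) + 1, 100)   # same shape, shifted by 1

mmdo <- kmmd(x, y)

mmdo@H0         # logical: is H0 rejected?
mmdo@mmdstats   # the two test statistics
mmdo@Radbound   # the Rademacher bound
mmdo@kernelf    # the kernel function used
```

Direct slot access with `@` is used here because it only relies on the slot names; the kmmd-class page describes the accessor methods that may be preferable in package code.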
x
data values, in a matrix, list, or kernelMatrix
y
data values, in a matrix, list, or kernelMatrix
Kxy
kernelMatrix between \(x\) and \(y\) values (only for the kernelMatrix interface)
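For the kernelMatrix interface, the three kernel matrices have to be computed beforehand. A minimal sketch, assuming kernlab's `rbfdot` and `kernelMatrix` are used to build them; the value `sigma = 1` and the variable names `Kxx`, `Kyy` are arbitrary choices for illustration.

```r
library(kernlab)

set.seed(1)                        # illustrative only
x <- matrix(runif(300), 100)
y <- matrix(runif(300) + 1, 100)

rbf <- rbfdot(sigma = 1)           # sigma = 1 is an arbitrary example value

Kxx <- kernelMatrix(rbf, x)        # kernel matrix of x with itself
Kyy <- kernelMatrix(rbf, y)        # kernel matrix of y with itself
Kxy <- kernelMatrix(rbf, x, y)     # cross kernel matrix between x and y

# x and y are now kernelMatrix objects, Kxy is the cross term
mmdk <- kmmd(Kxx, Kyy, Kxy)
mmdk
```

This route can be useful when the kernel matrices are already available or expensive to recompute.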
kernel
the kernel function used in the test.
This parameter can be set to any function of class kernel that computes a dot product between two vector arguments. kernlab provides the most popular kernel functions, which can be selected by setting the kernel parameter to one of the following strings:
rbfdot
Radial Basis kernel function "Gaussian"
polydot
Polynomial kernel function
vanilladot
Linear kernel function
tanhdot
Hyperbolic tangent kernel function
laplacedot
Laplacian kernel function
besseldot
Bessel kernel function
anovadot
ANOVA RBF kernel function
splinedot
Spline kernel
stringdot
String kernel
The kernel parameter can also be set to a user defined function of class kernel by passing the function name as an argument.
kpar
the list of hyper-parameters (kernel parameters). This is a list containing the parameters to be used with the kernel function. Valid parameters for existing kernels are:
sigma
inverse kernel width for the Radial Basis kernel function "rbfdot" and the Laplacian kernel "laplacedot".
degree, scale, offset
for the Polynomial kernel "polydot"
scale, offset
for the Hyperbolic tangent kernel function "tanhdot"
sigma, order, degree
for the Bessel kernel "besseldot".
sigma, degree
for the ANOVA kernel "anovadot".
length, lambda, normalized
for the "stringdot" kernel, where length is the length of the strings considered, lambda the decay factor and normalized a logical parameter determining if the kernel evaluations should be normalized.
Hyper-parameters for user defined kernels can be passed through the kpar parameter as well. For the Radial Basis (Gaussian) and Laplacian kernels, kpar can also be set to the string "automatic", which uses the heuristics in sigest to calculate a good sigma value from the data (default: "automatic").
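As an illustration of passing kernel parameters explicitly rather than relying on "automatic", the sketch below fixes sigma by hand. The value 0.5 is an arbitrary example, not a recommendation, and the variable names are hypothetical.

```r
library(kernlab)

set.seed(1)                        # illustrative only
x <- matrix(runif(300), 100)
y <- matrix(runif(300) + 1, 100)

# RBF kernel with an explicit inverse width instead of kpar = "automatic"
mmd_rbf <- kmmd(x, y, kernel = "rbfdot", kpar = list(sigma = 0.5))

# Laplacian kernel, which is also parameterised by sigma
mmd_lap <- kmmd(x, y, kernel = "laplacedot", kpar = list(sigma = 0.5))

mmd_rbf@kernelf                    # shows the kernel and the sigma in use
```

Fixing sigma makes results comparable across calls, whereas "automatic" re-estimates it from the data each time.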
alpha
the significance level of the test (default: 0.05)
asymptotic
calculate the bounds asymptotically (suitable for smaller datasets) (default: FALSE)
replace
sample with replacement when computing the asymptotic bounds (default: TRUE)
ntimes
number of times the sampling procedure is repeated (default: 150; 100 for the kernelMatrix method)
frac
fraction of points to sample (default: 1)
...
additional parameters.
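The test options above can be combined in a single call. A minimal sketch, assuming the matrix method; the particular values of `alpha` and `ntimes` are arbitrary and chosen only to show the arguments in use.

```r
library(kernlab)

set.seed(1)                        # illustrative only
x <- matrix(runif(300), 100)
y <- matrix(runif(300) + 1, 100)

# stricter level, asymptotic bound requested, 200 resampling rounds
mmda <- kmmd(x, y, alpha = 0.01, asymptotic = TRUE,
             replace = TRUE, ntimes = 200, frac = 1)

mmda@AsympH0     # decision according to the asymptotic bound
mmda@Asymbound   # the asymptotic bound itself
```

The asymptotic bound is computed by resampling, so its value varies from run to run unless the random seed is fixed.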
Alexandros Karatzoglou
alexandros.karatzoglou@ci.tuwien.ac.at
kmmd calculates the kernel maximum mean discrepancy for samples from two distributions and conducts a test, at level alpha, of whether the samples are from different distributions.
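As background, the biased empirical estimate of the maximum mean discrepancy given in Gretton et al. (2006), for samples \(x_1, \dots, x_m\) and \(y_1, \dots, y_n\) and a kernel \(k\), can be written as below. The symbols \(m\), \(n\) and \(k\) are notation introduced here for illustration; the exact statistics stored by kmmd may differ in detail from this form.

\[
\mathrm{MMD}_b(x, y) = \left[ \frac{1}{m^2} \sum_{i=1}^{m} \sum_{j=1}^{m} k(x_i, x_j)
 - \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j)
 + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} k(y_i, y_j) \right]^{1/2}
\]

The estimate is small when the two samples are similar under the kernel and grows as the within-sample similarities exceed the cross-sample similarity.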
Gretton, A., K. Borgwardt, M. Rasch, B. Schoelkopf and A. Smola.
A Kernel Method for the Two-Sample-Problem.
Neural Information Processing Systems 2006, Vancouver.
https://papers.neurips.cc/paper/3110-a-kernel-method-for-the-two-sample-problem.pdf
ksvm
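library(kernlab)

# create data: two samples of 100 points in 3 dimensions
x <- matrix(runif(300), 100)
y <- matrix(runif(300) + 1, 100)   # shifted by 1 relative to x

# run the test with the default RBF kernel and print the result
mmdo <- kmmd(x, y)
mmdo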