stringdot(length = 4, lambda = 1.1, type = "spectrum", normalized = TRUE)
spectrum
The kernel considers only matching substrings of exactly length $n$ (also known as the string kernel). Each such matching substring is given a constant weight. The length parameter of this kernel must satisfy $length > 1$.
boundrange
This kernel (also known as boundrange) considers only matching substrings of length less than or equal to a given number $N$. This type of string kernel requires a length parameter $length > 1$.
constant
The kernel considers all matching substrings and assigns a constant weight (e.g. 1) to each of them. This constant kernel does not require any additional parameter.
exponential
Exponential decay kernel, where the substring weight decays as the matching substring gets longer. The kernel requires a decay factor $\lambda > 1$.
string
Essentially identical to the spectrum kernel, but computed in a more conventional way.
fullstring
Essentially identical to the boundrange kernel, but computed in a more conventional way.
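To make the difference between the spectrum and boundrange types concrete, here is a minimal base R sketch. It follows the textbook definition (sum over shared substrings of the product of their counts) and gives every match a weight of 1. The helper names `substrings_n`, `spectrum_raw`, `spectrum_kernel` and `boundrange_raw` are hypothetical and are not part of kernlab, and it is an assumption that kernlab's own implementation produces the same numbers.

```r
# All contiguous substrings of length n in string s (empty if s is too short).
substrings_n <- function(s, n) {
  len <- nchar(s)
  if (len < n) return(character(0))
  starts <- seq_len(len - n + 1)
  substring(s, starts, starts + n - 1)
}

# Unnormalized spectrum kernel: for every substring of length n that occurs
# in both strings, multiply its two occurrence counts and sum the products.
spectrum_raw <- function(x, y, n) {
  cx <- table(substrings_n(x, n))
  cy <- table(substrings_n(y, n))
  shared <- intersect(names(cx), names(cy))
  sum(as.numeric(cx[shared]) * as.numeric(cy[shared]))
}

# Optional normalization: k(x, y) / sqrt(k(x, x) * k(y, y)).
# The result is NaN if either string is shorter than n.
spectrum_kernel <- function(x, y, n, normalized = TRUE) {
  k <- spectrum_raw(x, y, n)
  if (!normalized) return(k)
  k / sqrt(spectrum_raw(x, x, n) * spectrum_raw(y, y, n))
}

# Bounded range variant: add up the matches for every length 1..N,
# each with weight 1 (the weighting is an assumption of this sketch).
boundrange_raw <- function(x, y, N) {
  sum(vapply(seq_len(N), function(n) spectrum_raw(x, y, n), numeric(1)))
}

spectrum_raw("abab", "abba", 2)     # shared 2-grams "ab" and "ba": 2*1 + 1*1 = 3
spectrum_kernel("abab", "abba", 2)  # 3 / sqrt(5 * 3)
boundrange_raw("abab", "abba", 2)   # length 1 contributes 8, length 2 contributes 3
```

With normalization, a string compared with itself always scores 1, which makes values comparable across strings of different lengths.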
The normalized argument defaults to TRUE.
Returns an object of class stringkernel, which extends the function class. The resulting function implements the given kernel, calculating the inner (dot) product between two character vectors.
The kernel parameters can be accessed with the kpar function.
A string kernel can be passed as the kernel argument to almost all functions in kernlab (e.g., ksvm, kpca, etc.).

String kernels calculate similarities between two strings (e.g. texts or sequences) by matching the substrings they have in common. Different types of string kernel exist, and they are mainly distinguished by how the matching is performed: some string kernels count the exact matches of $n$ characters between the strings (spectrum kernel), others allow mismatches (mismatch kernel), and so on.
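As a rough illustration of how a string kernel is used in practice, the sketch below builds a spectrum kernel, evaluates it on one pair of strings, and computes the full kernel matrix with kernelMatrix. It assumes the kernlab package is installed; the `docs` list is hypothetical toy data, not taken from the package.

```r
library(kernlab)

# Hypothetical toy data: four short strings.
docs <- list("the quick brown fox", "the quick brown dog",
             "a lazy sleeping cat", "a lazy sleeping dog")

# Spectrum kernel on substrings of length 3, with normalized values.
sk <- stringdot(type = "spectrum", length = 3, normalized = TRUE)

# Inspect the kernel parameters stored in the kernel object.
kpar(sk)

# Evaluate the kernel on a single pair of strings.
sk(docs[[1]], docs[[2]])

# Kernel (Gram) matrix over all strings: one row and one column per string.
K <- kernelMatrix(sk, docs)
dim(K)
```

The same `sk` object could then be supplied as the kernel argument of functions such as ksvm or kpca.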
dots, kernelMatrix, kernelMult, kernelPol
library(kernlab)
# Create a string kernel on substrings of length 5 and print it
sk <- stringdot(type = "string", length = 5)
sk