kebabs (version 1.6.2)

seqKernelAsChar: Sequence Kernel

Description

Create the kernel matrix for a kernel object. Retrieve kernel parameters from the kernel object.

Usage

seqKernelAsChar(from)

getKernelMatrix(kernel, x, y, selx, sely)

## S4 method for signature 'SpectrumKernel': kernelParameters(object)

## S4 method for signature 'MismatchKernel': kernelParameters(object)

## S4 method for signature 'GappyPairKernel': kernelParameters(object)

## S4 method for signature 'MotifKernel': kernelParameters(object)

## S4 method for signature 'SymmetricPairKernel': kernelParameters(object)

## S4 method for signature 'SequenceKernel': isUserDefined(object)

Arguments

from
a sequence kernel object
kernel
one kernel object of class SequenceKernel or one kernlab string kernel (see stringdot)
x
one or multiple biological sequences in the form of a DNAStringSet, RNAStringSet, AAStringSet (or as BioVector)
y
one or multiple biological sequences in the form of a DNAStringSet, RNAStringSet, AAStringSet (or as BioVector); if this parameter is specified, a rectangular kernel matrix with the samples in x as rows and the samples in y as columns is generated; otherwise a square kernel matrix with the samples in x as both rows and columns is computed; default=NULL
selx
subset of indices into x; when this parameter is present the kernel matrix is generated for the specified subset of x only; default=NULL
sely
subset of indices into y; when this parameter is present the kernel matrix is generated for the specified subset of y only; default=NULL
object
a sequence kernel object

Value

  • getKernelMatrix: upon successful completion, the function returns a kernel matrix of class KernelMatrix which contains similarity values between pairs of the biological sequences.
  • kernelParameters: the kernel parameters as a list
  • isUserDefined: boolean indicating whether the kernel is user-defined or not

The function 'kernelParameters' retrieves the kernel parameters and returns them as a list. The function 'seqKernelAsChar' converts a sequence kernel object into a character string.

Generation of kernel matrix

The function getKernelMatrix creates a kernel matrix for the specified kernel and one or two given sets of sequences. It contains similarity values between pairs of samples. If one set of sequences is used, the square kernel matrix contains pairwise similarity values for this set. For two sets of sequences, the similarities are calculated between these sets, resulting in a rectangular kernel matrix. The kernel matrix is always created as a dense matrix of class KernelMatrix. Alternatively, the kernel matrix can also be generated via a direct function call with the kernel object (see examples below).

Generation of explicit representation

With the function getExRep an explicit representation for a specified kernel and a given set of sequences can be generated in sparse or dense form. Applying the linear kernel to the explicit representation with the function linearKernel also generates a dense kernel matrix.
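To make the relationship between an explicit representation and a kernel matrix concrete, the following base R sketch builds a dense k-mer count matrix for a few toy sequences and applies a linear kernel to it. It does not call kebabs: the helper kmer_counts and the toy sequences are hypothetical, and the sketch assumes overlapping k-mers, k=2, and no normalization.

```r
# Hypothetical helper (not part of kebabs): dense explicit representation
# with one row per sequence and one column per k-mer seen in any sequence.
kmer_counts <- function(seqs, k) {
  kmers <- lapply(seqs, function(s) {
    starts <- seq_len(max(0, nchar(s) - k + 1))
    substring(s, starts, starts + k - 1)   # all overlapping k-mers
  })
  features <- sort(unique(unlist(kmers)))
  er <- do.call(rbind, lapply(kmers, function(km) {
    as.numeric(table(factor(km, levels = features)))
  }))
  colnames(er) <- features
  er
}

# Toy input, assumed for illustration only
seqs <- c(S1 = "ACGAC", S2 = "GACAC", S3 = "TTTTT")
er <- kmer_counts(seqs, k = 2)

# Square kernel matrix: linear kernel on the explicit representation
km_square <- tcrossprod(er)

# Rectangular kernel matrix: rows 1:2 against rows 2:3
km_rect <- er[1:2, , drop = FALSE] %*% t(er[2:3, , drop = FALSE])

dim(km_square)
dim(km_rect)
```

In this sketch, subsetting the rows before the matrix product plays the role that selx and sely play in getKernelMatrix.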

Details

Sequence Kernel

A sequence kernel is used to determine similarity values between biological sequences based on patterns occurring in the sequences. The kernels in this package were written specifically for the biological domain. The corresponding term in the kernlab package is string kernel, a domain-independent implementation of the same functionality which is often used in other domains, for example in text classification. The sequence kernels in this package take DNA, RNA or amino acid sequences as input, which have a reduced character set compared to regular text.

In string kernels the actual position of a pattern in the sequence/text is irrelevant; only the number of occurrences of the pattern matters for the similarity consideration. The kernels provided in this package can be created in a position-independent or position-dependent way. Position-dependent kernels use the position of patterns on the pair of sequences to determine the contribution of a pattern match to the similarity value. For details see the help page for positionMetadata. A second method of specializing the similarity consideration in a kernel is to use annotation information which is placed along the sequences. For details see annotationMetadata.

The following kernels are available:
  • spectrum kernel
  • mismatch kernel
  • gappy pair kernel
  • motif kernel
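The position-independent case can be illustrated for a single pair of sequences. The base R sketch below, which does not use kebabs, counts overlapping k-mers and sums the products of the counts of the k-mers both sequences share. The names kmer_table and spectrum_value are hypothetical, and the cosine-style normalization in the last step is a common convention assumed here, not something stated on this page.

```r
# Hypothetical helper: table of overlapping k-mer counts for one sequence
kmer_table <- function(s, k) {
  starts <- seq_len(max(0, nchar(s) - k + 1))
  table(substring(s, starts, starts + k - 1))
}

# Unnormalized similarity: only occurrence counts matter, not positions
spectrum_value <- function(x, y, k) {
  tx <- kmer_table(x, k)
  ty <- kmer_table(y, k)
  shared <- intersect(names(tx), names(ty))
  sum(as.numeric(tx[shared]) * as.numeric(ty[shared]))
}

x <- "ACGAC"   # 2-mers: AC, CG, GA, AC
y <- "GACAC"   # 2-mers: GA, AC, CA, AC

# Shared 2-mers are AC (2 x 2) and GA (1 x 1)
kxy <- spectrum_value(x, y, k = 2)

# Assumed normalization: k(x, y) / sqrt(k(x, x) * k(y, y))
kxy_norm <- kxy / sqrt(spectrum_value(x, x, 2) * spectrum_value(y, y, 2))
```

Because only counts enter the sum, moving a shared pattern to a different position in either sequence leaves the value unchanged as long as the set of k-mers and their counts stay the same.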

References

http://www.bioinf.jku.at/software/kebabs

J. Palme, S. Hochreiter, and U. Bodenhofer (2015) KeBABS: an R package for kernel-based analysis of biological sequences. Bioinformatics, 31(15):2574-2576. DOI: 10.1093/bioinformatics/btv176.

See Also

as.KernelMatrix, KernelMatrix, spectrumKernel, mismatchKernel, gappyPairKernel, motifKernel

Examples

## load the package (also makes DNAStringSet from Biostrings available)
library(kebabs)

## instead of user provided sequences in XStringSet format
## for this example a set of DNA sequences is created
## RNA- or AA-sequences can be used as well with the spectrum kernel
dnaseqs <- DNAStringSet(c("AGACTTAAGGGACCTGGTCACCACGCTCGGTGAGGGGGACGGGGTGT",
                          "ATAAAGGTTGCAGACATCATGTCCTTTTTGTCCCTAATTATTTCAGC",
                          "CAGGAATCAGCACAGGCAGGGGCACGGCATCCCAAGACATCTGGGCC",
                          "GGACATATACCCACCGTTACGTGTCATACAGGATAGTTCCACTGCCC",
                          "ATAAAGGTTGCAGACATCATGTCCTTTTTGTCCCTAATTATTTCAGC"))
names(dnaseqs) <- paste0("S", seq_along(dnaseqs))

## create the kernel object with the spectrum kernel
spec <- spectrumKernel(k=3, normalized=FALSE)

## generate the kernel matrix
km <- getKernelMatrix(spec, dnaseqs)
dim(km)
km[1:5,1:5]

## alternative way to generate the kernel matrix
km <- spec(dnaseqs)
km[1:5,1:5]

## generate rectangular kernel matrix
km <- getKernelMatrix(spec, x=dnaseqs, selx=1:3, y=dnaseqs, sely=4:5)
dim(km)
km[1:3,1:2]

## generate a sparse explicit representation
er <- getExRep(dnaseqs, spec)
er[1:5, 1:8]

## generate kernel matrix from explicit representation
km <- linearKernel(er)
km[1:5,1:5]