microclass (version 1.2)

KmerCount: K-mer counting

Description

Counting overlapping words of length K in DNA/RNA sequences.

Usage

KmerCount(sequences, K = 1, col.names = FALSE)

Arguments

sequences

Vector of sequences (text).

K

Word length (integer).

col.names

Logical indicating whether the words should be added as column names.

Value

A matrix with one row for each sequence in sequences and one column for each possible word of length K.

Details

For each input sequence, the frequency of every word of length K is counted. Words are counted with overlap: the counting window advances one position at a time, so consecutive words share K-1 characters. The counting itself is done by a C++ function.

With col.names=TRUE the K-mers are added as column names, but this makes the computations slower.
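
To illustrate what counting with overlap means, here is a minimal base-R sketch. It is not the package's C++ implementation. The helper name count_kmers and its alphabet argument are hypothetical, introduced only for this illustration, and the order of the words may differ from the column order that KmerCount uses.

```r
# Hypothetical helper, not part of microclass: overlapping K-mer counts
# for a single sequence, assuming a plain A/C/G/T alphabet.
count_kmers <- function(sequence, K = 1, alphabet = c("A", "C", "G", "T")) {
  # Enumerate every possible word of length K over the alphabet.
  grid <- expand.grid(rep(list(alphabet), K), stringsAsFactors = FALSE)
  words <- do.call(paste0, rev(grid))

  # A sequence shorter than K contains no words at all.
  n <- nchar(sequence)
  if (n < K) {
    return(setNames(integer(length(words)), words))
  }

  # One window per start position; neighbouring windows overlap by K - 1.
  starts <- seq_len(n - K + 1)
  kmers <- substring(sequence, starts, starts + K - 1)

  # Tabulate against the full word list so absent words get a zero count.
  counts <- table(factor(kmers, levels = words))
  setNames(as.integer(counts), words)
}

counts <- count_kmers("ATGCCTGAACTGACCTGC", K = 2)
counts[["TG"]]  # 4: "TG" starts at positions 2, 6, 11 and 16
```

A sequence of length n contains n - K + 1 overlapping words, so the 18-character example above yields 17 counts spread over the 16 possible 2-mers.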

See Also

multinomTrain, multinomClassify.

Examples

library(microclass)

# Count overlapping 2-mers in a single DNA sequence
KmerCount("ATGCCTGAACTGACCTGC", K = 2)
