
biogram (version 1.6.3)

create_ngrams: Get all possible n-Grams

Description

Creates a vector of all possible n-grams (for a given n).

Usage

create_ngrams(n, u, possible_grams = NULL)

Arguments

n

integer, size of the n-gram.

u

integer, numeric or character vector of all possible unigrams.

possible_grams

number of possible n-grams, that is, the number of positions at which an n-gram can be located in a sequence. If NULL (the default), n-grams do not contain information about position.

Value

A character vector. Elements of each n-gram are separated by dots.

Details

See the Details section of count_ngrams for more information about the n-gram naming convention. Information about the distance between elements is not generated by this function and must be added by hand (see Examples).
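To make the enumeration concrete, here is a minimal base R sketch of what "all possible n-grams" means for a small alphabet. It is not biogram's implementation: `sketch_ngrams` is a hypothetical helper, the ordering of its output may differ from `create_ngrams`, and the `position_ngram_distance` layout is assumed from the naming convention referenced above.

```r
# Hypothetical helper, not part of biogram: enumerate every n-gram over alphabet u.
sketch_ngrams <- function(n, u) {
  # One column per n-gram element; each row is one combination of unigrams.
  grid <- expand.grid(rep(list(u), n), stringsAsFactors = FALSE)
  # Join the elements of each row with dots, e.g. "1.2".
  do.call(paste, c(grid, sep = "."))
}

# All bigrams over a 4-letter alphabet: 4^2 = 16 combinations.
bigrams <- sketch_ngrams(2, 1:4)
length(bigrams)

# Prefix every bigram with each of 8 possible positions
# (assumed "position_ngram" layout): 8 * 16 = 128 names.
positioned <- as.vector(outer(1:8, bigrams, paste, sep = "_"))
length(positioned)

# Distance information is appended by hand (assumed "_distance" suffix).
with_distance <- paste0(positioned, "_1")
```

The count grows as `length(u)^n` times the number of positions, so for the 20 standard amino acids even bigrams with positions produce long vectors.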

Examples

library(biogram)

# bigrams for the standard amino acids
create_ngrams(2, 1L:20)
# bigrams for the standard amino acids with positions; in a sequence of
# 10 amino acids a bigram can be located at only 9 positions
create_ngrams(2, 1L:20, 9)
# bigrams for DNA with positions; in a sequence of 10 nucleotides a bigram
# with distance 1 can be located at only 8 positions
# paste0 appends the distance information to the end of each n-gram
paste0(create_ngrams(2, 1L:4, 8), "_1")
