rDNAse (version 1.1-1)

parSeqSim: Parallelized DNA/RNA Sequence Similarity Calculation Based on Sequence Alignment

Description

Parallelized DNA/RNA sequence similarity calculation based on sequence alignment.

Usage

parSeqSim(dnalist, cores = 2, type = "local", submat = "BLOSUM62")

Arguments

dnalist
A list of length n containing n DNA/RNA sequences. Each component of the list is a character string storing one DNA/RNA sequence. Unknown sequences should be represented as ''.
cores
Integer. The number of CPU cores to use for parallel execution; the default is 2. Use the detectCores() function in the parallel package to see how many cores are available.
type
Type of alignment, either 'global' or 'local'; the default is 'local'. 'global' selects Needleman-Wunsch global alignment and 'local' selects Smith-Waterman local alignment.
submat
Substitution matrix; the default is 'BLOSUM62'. Must be one of 'BLOSUM45', 'BLOSUM50', 'BLOSUM62', 'BLOSUM80', 'BLOSUM100', 'PAM30', 'PAM40', 'PAM70', 'PAM120', 'PAM250'.
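
As a minimal sketch of how a value for cores might be chosen, the snippet below queries detectCores() from the parallel package and keeps one core free. The variable names and the "leave one core free" policy are illustrative assumptions, not something the package prescribes.

```r
library(parallel)

# detectCores() may return NA on some platforms, so fall back to 1.
available <- detectCores()
if (is.na(available)) available <- 1L

# Assumed policy: leave one core for the rest of the system,
# but never request fewer than one.
n_cores <- max(1L, available - 1L)
print(n_cores)
```

The resulting n_cores could then be passed as the cores argument.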

Value

An n x n similarity matrix.
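
To illustrate the shape of such a result, here is a base R sketch that fills a symmetric n x n matrix from pairwise scores. The scoring function toy_sim is hypothetical and is not an alignment score; treating the matrix as symmetric with a diagonal of 1 is also an assumption made for this illustration, not a documented property of parSeqSim.

```r
# Hypothetical toy score: fraction of matching positions,
# relative to the longer sequence. Empty sequences score 0.
toy_sim <- function(a, b) {
  if (nchar(a) == 0 || nchar(b) == 0) return(0)
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  k <- min(length(x), length(y))
  sum(x[seq_len(k)] == y[seq_len(k)]) / max(length(x), length(y))
}

seqs <- list("ACGT", "ACGA", "TTTT")
n <- length(seqs)

# Each column of 'pairs' holds the indices (i, j) of one unordered pair, i < j.
pairs <- combn(n, 2)
scores <- apply(pairs, 2, function(p) toy_sim(seqs[[p[1]]], seqs[[p[2]]]))

# Fill the upper triangle, mirror it, and set the diagonal (assumed to be 1).
simmat <- matrix(0, n, n)
simmat[t(pairs)] <- scores
simmat <- simmat + t(simmat)
diag(simmat) <- 1
print(simmat)
```

Because each unordered pair is scored only once, the pairwise scores are independent of each other, which is what makes this kind of computation straightforward to distribute across cores.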

Details

This function implements a parallelized version of DNA/RNA sequence similarity calculation based on sequence alignment.

See Also

See twoSeqSim for sequence alignment between two DNA/RNA sequences. See parGOSim for DNA/RNA similarity calculation based on Gene Ontology (GO) semantic similarity.

Examples

# Be careful when testing this since it involves parallelisation
# and might produce unpredictable results in some environments

require(Biostrings)
require(foreach)
require(doParallel)

# Read the example FASTA file once, then keep the first five sequences
seqs = readFASTA(system.file('dnaseq/hs.fasta', package = 'rDNAse'))
plist = unname(seqs[1:5])
psimmat = parSeqSim(plist, cores = 2, type = 'local', submat = 'BLOSUM62')
print(psimmat)
