Biostrings (version 2.40.2)

findPalindromes: Searching a sequence for palindromes

Description

The findPalindromes function can be used to find palindromic regions in a sequence.

palindromeArmLength, palindromeLeftArm, and palindromeRightArm are utility functions for operating on palindromic sequences.

Usage

findPalindromes(subject, min.armlength=4, max.looplength=1, min.looplength=0,
                max.mismatch=0)
palindromeArmLength(x, max.mismatch=0, ...)
palindromeLeftArm(x, max.mismatch=0, ...)
palindromeRightArm(x, max.mismatch=0, ...)

Arguments

subject
An XString object containing the subject string, or an XStringViews object.
min.armlength
An integer giving the minimum length of the arms of the palindromes to search for.
max.looplength
An integer giving the maximum length of "the loop" (i.e. the sequence separating the 2 arms) of the palindromes to search for. Note that by default (max.looplength=1), findPalindromes will search for strict palindromes only.
min.looplength
An integer giving the minimum length of "the loop" of the palindromes to search for.
max.mismatch
The maximum number of mismatching letters allowed between the 2 arms of the palindromes to search for.
x
An XString object containing a 2-arm palindrome, or an XStringViews object containing a set of 2-arm palindromes.
...
Additional arguments to be passed to or from methods.

Value

findPalindromes returns an XStringViews object containing all palindromes found in subject (one view per palindromic substring found).

palindromeArmLength returns the arm length (integer) of the 2-arm palindrome x. It will raise an error if x has no arms. Note that any sequence could be considered a 2-arm palindrome if we were OK with arms of length 0, but we are not: x must have arms of length greater than or equal to 1 in order to be considered a 2-arm palindrome. When applied to an XStringViews object x, palindromeArmLength behaves in a vectorized fashion by returning an integer vector of the same length as x.

palindromeLeftArm returns an object of the same class as the original object x, containing the left arm of x.

palindromeRightArm does the same as palindromeLeftArm but on the right arm of x.

Like palindromeArmLength, both palindromeLeftArm and palindromeRightArm will raise an error if x has no arms. Also, when applied to an XStringViews object x, both behave in a vectorized fashion by returning an XStringViews object of the same length as x.
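To make the notion of "arm length" concrete, here is a minimal base-R sketch for plain character strings. It is not the Biostrings implementation: arm_length() is a hypothetical helper introduced only for illustration, it assumes a non-nucleotide string with no mismatches allowed, and it assumes the two arms may not overlap.

```r
# Illustrative sketch only; arm_length() is a hypothetical helper,
# not part of Biostrings. It walks inward from both ends of a plain
# string and counts how many letters pair up exactly.
arm_length <- function(s) {
  chars <- strsplit(s, "", fixed = TRUE)[[1]]
  n <- length(chars)
  k <- 0L
  # Assumption: arms do not overlap, so each is at most n %/% 2 letters.
  while (k < n %/% 2 && chars[k + 1] == chars[n - k]) {
    k <- k + 1L
  }
  # Mirror the documented behaviour: arms of length 0 are an error.
  if (k == 0L) stop("'s' has no arms: it is not a 2-arm palindrome")
  k
}

arm_length("abcXYcba")               # 3
arm_length("Delia saw I was aileD")  # 10
```

Calling arm_length("ab") stops with an error, because the first and last letters differ and the arms would have length 0.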

Details

The findPalindromes function finds palindromic substrings in a subject string. The palindromes that can be searched for are either strict palindromes or 2-arm palindromes, i.e. palindromes where the 2 arms are separated by an arbitrary sequence called "the loop". Strict palindromes are a particular case of 2-arm palindromes.
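The arm/loop/arm layout can be pictured with a small base-R sketch. split_palindrome() is a hypothetical helper, not a Biostrings function, and it takes the arm length as given rather than computing it.

```r
# Illustrative sketch only; split_palindrome() is a hypothetical helper.
# Given a plain string and the length of its arms, return the left arm,
# the loop, and the right arm as separate strings.
split_palindrome <- function(s, armlength) {
  n <- nchar(s)
  stopifnot(armlength >= 1, 2 * armlength <= n)
  list(
    left  = substr(s, 1, armlength),
    loop  = substr(s, armlength + 1, n - armlength),  # "" when arms touch
    right = substr(s, n - armlength + 1, n)
  )
}

parts <- split_palindrome("abcXYcba", armlength = 3)
parts$left         # "abc"
parts$loop         # "XY"
parts$right        # "cba"
nchar(parts$loop)  # 2
```

Under this reading, the loop here has length 2, so a search would presumably need max.looplength of at least 2 (and min.armlength of at most 3) to report this substring.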

If the subject string is a nucleotide sequence (i.e. DNA or RNA), the 2 arms must contain sequences that are reverse complements of each other. Otherwise, the 2 arms must contain the same sequence, read in opposite directions.
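The difference between the two matching rules can be sketched in base R. The helpers rev_string(), rev_comp() and arms_match() are hypothetical names used only here. The sketch assumes plain character strings, an uppercase unambiguous DNA alphabet (A, C, G, T), and no mismatches; it does not reflect how Biostrings handles IUPAC ambiguity codes or RNA.

```r
# Illustrative sketch only; none of these helpers are part of Biostrings.
rev_string <- function(s) {
  paste(rev(strsplit(s, "", fixed = TRUE)[[1]]), collapse = "")
}

# Assumption: uppercase A, C, G, T only (no ambiguity codes, no U).
rev_comp <- function(s) rev_string(chartr("ACGT", "TGCA", s))

# For nucleotides the left arm must equal the reverse complement of the
# right arm; for other strings it must equal the plain reverse.
arms_match <- function(left, right, nucleotide) {
  expected <- if (nucleotide) rev_comp(right) else rev_string(right)
  identical(left, expected)
}

arms_match("AACCAT", "ATGGTT", nucleotide = TRUE)            # TRUE
arms_match("Delia saw ", " was aileD", nucleotide = FALSE)   # TRUE
arms_match("AACCAT", "TACCAA", nucleotide = TRUE)            # FALSE
```

The last call shows that a plain mirror image does not count as a palindrome under the nucleotide rule.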

See Also

maskMotif, matchPattern, matchLRPatterns, matchProbePair, XStringViews-class, DNAString-class

Examples

library(Biostrings)

x0 <- BString("abbbaabbcbbaccacabbbccbcaabbabacca")

pals0a <- findPalindromes(x0, min.armlength=3, max.looplength=5)
pals0a
palindromeArmLength(pals0a)
palindromeLeftArm(pals0a)
palindromeRightArm(pals0a)

pals0b <- findPalindromes(x0, min.armlength=9, max.looplength=5,
                          max.mismatch=3)
pals0b
palindromeArmLength(pals0b, max.mismatch=3)
palindromeLeftArm(pals0b, max.mismatch=3)
palindromeRightArm(pals0b, max.mismatch=3)

## Whitespace matters:
x1 <- BString("Delia saw I was aileD")
palindromeArmLength(x1)
palindromeLeftArm(x1)
palindromeRightArm(x1)

x2 <- BString("was it a car or a cat I saw")
palindromeArmLength(x2)
palindromeLeftArm(x2)
palindromeRightArm(x2)

## On a DNA or RNA sequence:
x3 <- DNAString("CCGAAAACCATGATGGTTGCCAG")
findPalindromes(x3)
findPalindromes(RNAString(x3))

## Note that palindromes can be nested:
x4 <- DNAString("ACGTTNAACGTCCAAAATTTTCCACGTTNAACGT")
findPalindromes(x4, max.looplength=19)

## A real use case:
library(BSgenome.Dmelanogaster.UCSC.dm3)
chrX <- Dmelanogaster$chrX
chrX_pals0 <- findPalindromes(chrX, min.armlength=40, max.looplength=80)
chrX_pals0
palindromeArmLength(chrX_pals0)  # 251 70 262

## Allowing up to 2 mismatches between the 2 arms:
chrX_pals2 <- findPalindromes(chrX, min.armlength=40, max.looplength=80,
                              max.mismatch=2)
chrX_pals2
palindromeArmLength(chrX_pals2, max.mismatch=2)  # 254 77 44 48 40 264