seqinr (version 4.2-36)

getTrans: Generic function to translate coding sequences into proteins

Description

This function translates nucleic acid sequences into the corresponding peptide sequences. It can translate in any of the three forward or three reverse frames. For the reverse direction, the reverse complement of the sequence is taken first. Translation can use the standard (universal) genetic code or a non-standard code. Ambiguous bases can also be handled.
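To make the idea of forward and reverse frames concrete, here is a minimal base-R sketch. It is not seqinr's implementation: `std_code` and `sketch_translate` are hypothetical names introduced only for illustration, the sketch assumes lowercase unambiguous bases, and it covers the standard genetic code only.

```r
# Standard genetic code, codons enumerated in TCAG order
# (third base varies fastest). Illustration only, not seqinr code.
bases <- c("t", "c", "a", "g")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
std_code <- strsplit(paste0("FFLLSSSSYY**CC*W", "LLLLPPPPHHQQRRRR",
                            "IIIMTTTTNNKKSSRR", "VVVVAAAADDEEGGGG"), "")[[1]]
names(std_code) <- paste0(grid$first, grid$second, grid$third)

# Hypothetical helper: s is a vector of single lowercase bases.
sketch_translate <- function(s, frame = 0, sens = "F") {
  # Reverse direction: complement each base, then reverse the order.
  if (sens == "R") s <- rev(chartr("acgt", "tgca", s))
  # The frame is the number of leading bases skipped before reading codons.
  if (frame > 0) s <- s[-seq_len(frame)]
  n <- length(s) %/% 3            # trailing incomplete codon is dropped
  if (n == 0) return(character(0))
  codons <- apply(matrix(s[seq_len(3 * n)], nrow = 3), 2, paste0, collapse = "")
  unname(std_code[codons])        # codons not in the table give NA
}

toy <- strsplit("tctgagcaaataaatcgg", "")[[1]]
sketch_translate(toy)                 # "S" "E" "Q" "I" "N" "R"
sketch_translate(toy, frame = 1)      # "L" "S" "K" "*" "I"
sketch_translate(toy, sens = "R")     # "P" "I" "Y" "L" "L" "R"
```

Three frame offsets in each of two directions give the six possible reading frames.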

Usage

getTrans(object, sens = "F", NAstring = "X", ambiguous = FALSE, ...)
# S3 method for SeqAcnucWeb
getTrans(object, sens = "F", NAstring = "X", ambiguous = FALSE, ...,
 frame = "auto", numcode = "auto")
# S3 method for SeqFastadna
getTrans(object, sens = "F", NAstring = "X", ambiguous = FALSE, ...,
 frame = 0, numcode = 1)
# S3 method for SeqFrag
getTrans(object, sens = "F", NAstring = "X", ambiguous = FALSE, ...,
 frame = 0, numcode = 1)

Value

For a single sequence an object of class character containing the characters of the sequence, either of length 1 when as.string is TRUE, or of the length of the sequence when as.string is FALSE. For many sequences, a list of these.

Arguments

object

an object of class SeqAcnucWeb, SeqFastadna or SeqFrag, a list of such objects, or an object of class qaw created by query

numcode

The NCBI genetic code number used for translation. By default the standard genetic code is used, except for sequences coming from an ACNUC server, for which the relevant genetic code is used by default.

NAstring

The string used in place of an amino acid when a codon contains ambiguous bases and cannot be translated. The default is "X".

ambiguous

If TRUE, ambiguous bases are taken into account so that for instance GGN is translated to Gly in the standard genetic code.

frame

Frame(s) (0, 1, 2) to translate. By default frame 0 is used; for SeqAcnucWeb objects the default is "auto".

sens

Direction for translation: F for the direct strand and R for the reverse complementary strand.

...

further arguments passed to or from other methods
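The arguments above can be tried out on the toy sequences from the Examples section. This is a small sketch that assumes the seqinr package is installed. The variable names are illustrative only.

```r
library(seqinr)

toycds <- s2c("tctgagcaaataaatcgg")

# Defaults: direct strand, frame 0, standard genetic code
fwd0 <- getTrans(toycds)

# Skip the first base before reading codons
fwd1 <- getTrans(toycds, frame = 1)

# Translate the reverse complementary strand
rev0 <- getTrans(toycds, sens = "R")

# Codons with ambiguous bases: choose the placeholder string,
# or ask for them to be resolved where the code allows it
toycds2 <- s2c("tcngarcarathaaycgn")
masked <- getTrans(toycds2, NAstring = "?")
resolved <- getTrans(toycds2, ambiguous = TRUE)
```

With ambiguous = TRUE, a codon is translated only when all the codons it may stand for encode the same amino acid. Otherwise NAstring is returned for that position.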

Author

D. Charif, J.R. Lobry, L. Palmeira

Details

The following genetic codes are available. The number preceding each code is the value to use for numcode.

1    standard
2    vertebrate.mitochondrial
3    yeast.mitochondrial
4    protozoan.mitochondrial+mycoplasma
5    invertebrate.mitochondrial
6    ciliate+dasycladaceal
9    echinoderm+flatworm.mitochondrial
10   euplotid
11   bacterial+plantplastid
12   alternativeyeast
13   ascidian.mitochondrial
14   alternativeflatworm.mitochondrial
15   blepharisma
16   chlorophycean.mitochondrial
21   trematode.mitochondrial
22   scenedesmus.mitochondrial
23   thraustochytrium.mitochondrial
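As a quick illustration of how numcode changes the result, the sketch below translates three codons that are read differently by the standard code (1) and the vertebrate mitochondrial code (2). It assumes seqinr is installed. The name `probe` is illustrative only.

```r
library(seqinr)

# ata, tga and aga are read differently by codes 1 and 2
probe <- s2c("atatgaaga")

std  <- getTrans(probe, numcode = 1)   # "I" "*" "R"
mito <- getTrans(probe, numcode = 2)   # "M" "W" "*"
```

Choosing the wrong code can therefore introduce or hide stop codons, so it is worth checking numcode whenever the sequences are organellar or come from organisms with a non-standard code.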

References

citation("seqinr")

See Also

SeqAcnucWeb, SeqFastadna, SeqFrag
The genetic codes are stored in the object SEQINR.UTIL; the function tablecode displays them in a more human-readable form. Use aaa to get the three-letter code for amino acids.
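For instance, the one-letter output of getTrans can be passed to aaa to obtain three-letter codes. This is a short sketch that assumes seqinr is installed. The names `pep` and `pep3` are illustrative.

```r
library(seqinr)

# Translate the toy CDS, then convert one-letter to three-letter codes
pep  <- getTrans(s2c("tctgagcaaataaatcgg"))
pep3 <- aaa(pep)
pep3   # "Ser" "Glu" "Gln" "Ile" "Asn" "Arg"
```

Calling tablecode(numcode = 2) draws the corresponding codon table on the current graphics device, which is convenient for checking how a given code differs from the standard one.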

Examples

#
# List all available methods for getTrans generic function:
#
   methods(getTrans)
#
# Toy CDS example invented by Leonor Palmeira:
#
  toycds <- s2c("tctgagcaaataaatcgg")
  getTrans(toycds) # should be c("S", "E", "Q", "I", "N", "R")
#
# Toy CDS example with ambiguous bases:
#
  toycds2 <- s2c("tcngarcarathaaycgn")
  getTrans(toycds2) # should be c("X", "X", "X", "X", "X", "X")
  getTrans(toycds2, ambiguous = TRUE) # should be c("S", "E", "Q", "I", "N", "R")
  getTrans(toycds2, ambiguous = TRUE, numcode = 2) # should be c("S", "E", "Q", "X", "N", "R")
#
# Real CDS example:
#
  realcds <- read.fasta(file = system.file("sequences/malM.fasta",
                                           package = "seqinr"))[[1]]
  getTrans(realcds)
# Biologically correct, only one stop codon at the end
  getTrans(realcds, frame = 3, sens = "R", numcode = 6)
# Biologically meaningless, note the in-frame stop codons

# Read from an alignment as suggested by Dr. H. Suzuki
fasta.res    <- read.alignment(file = system.file("sequences/Anouk.fasta", package = "seqinr"),
 format = "fasta")

AA1 <- seqinr::getTrans(s2c(fasta.res$seq[[1]]))
AA2 <- seqinr::translate(s2c(fasta.res$seq[[1]]))
identical(AA1, AA2)

AA1 <- lapply(fasta.res$seq, function(x) seqinr::getTrans(s2c(x)))
AA2 <- lapply(fasta.res$seq, function(x) seqinr::translate(s2c(x)))
identical(AA1, AA2)

#
# Complex trans-splicing operations, the correct frame and the correct
# genetic code are automatically used for translation into protein for
# sequences coming from an ACNUC server:
#
if (FALSE) {
  # Need internet connection.
  # Translation of the following EMBL entry:
  #
  # FT   CDS             join(complement(153944..154157),complement(153727..153866),
  # FT                   complement(152185..153037),138523..138735,138795..138955)
  # FT                   /codon_start=1
  choosebank("emblTP")
  trans <- query("trans", "N=AE003734.PE35")
  getTrans(trans$req[[1]])
}
