These functions call their respective programs from R to align a set
of nucleotide or amino acid sequences of class "DNAbin" or "AAbin".
The applications must be installed separately, and it is highly
recommended to do this so that the executables are in a directory
located on the PATH of the system.
This version includes an experimental function, muscle5, which calls
MUSCLE5 (see the link to the documentation in the References below);
muscle still calls MUSCLE version 3. Note that the executable of
MUSCLE5 is also named "muscle" by the default compilation setting.
The functions efastats and letterconf require MUSCLE5.
clustal(x, y, guide.tree, pw.gapopen = 10, pw.gapext = 0.1,
        gapopen = 10, gapext = 0.2, exec = NULL, MoreArgs = "",
        quiet = TRUE, original.ordering = TRUE, file)
clustalomega(x, y, guide.tree, exec = NULL, MoreArgs = "",
             quiet = TRUE, original.ordering = TRUE, file)
muscle(x, y, guide.tree, exec, MoreArgs = "",
       quiet = TRUE, original.ordering = TRUE, file)
muscle5(x, exec = "muscle", MoreArgs = "", quiet = FALSE,
        file, super5 = FALSE, mc.cores = 1)
tcoffee(x, exec = "t_coffee", MoreArgs = "", quiet = TRUE,
        original.ordering = TRUE)
efastats(X, exec = "muscle", quiet = FALSE)
letterconf(X, exec = "muscle")
An object of class "DNAbin" or "AAbin" with the aligned sequences.
efastats returns a data frame.
letterconf opens the default Web browser.
x: an object of class "DNAbin" or "AAbin" (can be missing).
y: an object of class "DNAbin" or "AAbin" used for profile alignment (can be missing).
guide.tree: guide tree, an object of class "phylo" (can be missing).
pw.gapopen, pw.gapext: gap opening and gap extension penalties used by Clustal during pairwise alignments.
gapopen, gapext: the same penalties for the global alignment.
exec: a character string giving the name of the program, with its path if necessary. clustal tries to guess this argument depending on the operating system (see details).
MoreArgs: a character string giving additional options.
quiet: a logical: the default is to not print the messages from the external program on R's console.
original.ordering: a logical specifying whether to return the aligned sequences in the same order as in x (TRUE by default).
file: a file with its path if results should be stored (can be missing).
super5: a logical value. By default, the PPP algorithm is used.
mc.cores: the number of cores to be used by MUSCLE5.
X: a list with several alignments of the same sequences, all with the same row order.
Emmanuel Paradis, Franz Krah
It is highly recommended to install the executables properly so that
they are in a directory located on the PATH (i.e., accessible from any
other directory). Alternatively, the full path to the executable
may be given (e.g., exec = "~/muscle/muscle"), or a (symbolic)
link may be copied in the working directory. For Debian and its
derivatives (e.g., Ubuntu), it is recommended to use the binaries
distributed by Debian.
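As a quick check of the advice above, base R's Sys.which() reports whether an executable can be found on the PATH. The sketch below is only an illustration: the program names in it are the default names mentioned on this page and may differ on a given installation.

```r
# Default executable names taken from this page; adjust them to your setup.
programs <- c("clustalw", "clustalw2", "clustalo", "muscle", "t_coffee")

# Sys.which() returns the full path of each program found on the PATH,
# or an empty string for a program that cannot be found.
found <- Sys.which(programs)
on_path <- nzchar(found)

# Summarise the result; no external program is actually run here.
print(data.frame(program = programs,
                 on_path = unname(on_path),
                 path = unname(found),
                 row.names = NULL))
```

A program reported as not on the PATH can still be used by passing its full path through the exec argument.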
clustal tries to guess the name of the executable program
depending on the operating system. Specifically, the following names
are used: "clustalw" under Linux, "clustalw2" under macOS, and
"clustalw2.exe" under Windows. For clustalomega, "clustalo[.exe]" is
the default on all systems (with no specific path).
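The operating-system rule described above can be pictured with a small sketch. The helper guess_clustal_exec() is hypothetical and is not part of ape; it only mirrors the defaults listed in the text, and the actual implementation inside clustal may differ.

```r
# Hypothetical helper (not an ape function): map the operating system name,
# as reported by Sys.info(), to the default ClustalW executable name.
guess_clustal_exec <- function(os = Sys.info()[["sysname"]]) {
  switch(os,
         Linux   = "clustalw",
         Darwin  = "clustalw2",      # macOS reports "Darwin"
         Windows = "clustalw2.exe",
         stop("no default executable name known for OS: ", os))
}

# Name that would be used on the current machine, when it is one of the three.
os_here <- Sys.info()[["sysname"]]
if (os_here %in% c("Linux", "Darwin", "Windows")) {
  print(guess_clustal_exec(os_here))
}
```

When the guessed name does not match the installed executable, passing exec explicitly avoids relying on the guess.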
When called without arguments (i.e., clustal(), ...), the
function prints the options of the program which may be passed to
MoreArgs.
Since ape 5.1, clustal, clustalomega, and muscle
can align AA sequences as well as DNA sequences.
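A minimal sketch of passing amino acid sequences is given below. It assumes that ape is installed and that Clustal Omega is available on the PATH as "clustalo"; the block does nothing otherwise. The translation step uses ape's trans() with the vertebrate mitochondrial code and assumes the reading frame starts at the first site, so it serves only to produce an "AAbin" object for illustration.

```r
# Run only if ape is installed and a "clustalo" executable is on the PATH
# (both are assumptions about the local setup).
can_run <- requireNamespace("ape", quietly = TRUE) &&
  nzchar(Sys.which("clustalo"))

if (can_run) {
  library(ape)
  data(woodmouse)

  # Translate the DNA sequences into an object of class "AAbin".
  # code = 2 is the vertebrate mitochondrial genetic code; the reading
  # frame is assumed to start at site 1 (trailing sites may be dropped).
  aa <- suppressWarnings(trans(woodmouse, code = 2))

  # Align the amino acid sequences with the default executable name.
  aln <- clustalomega(aa)
  print(class(aln))
}
```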
Chenna, R., Sugawara, H., Koike, T., Lopez, R., Gibson, T. J., Higgins, D. G. and Thompson, J. D. (2003) Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Research, 31, 3497–3500. http://www.clustal.org/
Edgar, R. C. (2004) MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32, 1792–1797. http://www.drive5.com/muscle/muscle_userguide3.8.html
Notredame, C., Higgins, D. and Heringa, J. (2000) T-Coffee: A novel method for multiple sequence alignments. Journal of Molecular Biology, 302, 205–217. https://tcoffee.org/
Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Söding, J., Thompson, J. D. and Higgins, D. G. (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology, 7, 539. http://www.clustal.org/
image.DNAbin, del.gaps, all.equal.DNAbin, alex, alview, checkAlignment
if (FALSE) {
  ### display the options:
  clustal()
  clustalomega()
  muscle()
  tcoffee()

  data(woodmouse)
  ### open gaps more easily:
  clustal(woodmouse, pw.gapopen = 1, pw.gapext = 1)
  ### T-Coffee requires negative values (quite slow; muscle() is much faster):
  tcoffee(woodmouse, MoreArgs = "-gapopen=-10 -gapext=-2")
}