StaggerAlignment(myXStringSet, tree = NULL, threshold = 3, fullLength = FALSE, processors = 1, verbose = TRUE)
myXStringSet: An AAStringSet, DNAStringSet, or RNAStringSet object of aligned sequences.
tree: A dendrogram representing the evolutionary relationships between sequences, such as that created by IdClusters. The root should be the topmost node of the tree.
fullLength: Whether the sequences are full-length (TRUE), or terminal gaps should be treated as missing data (FALSE, the default). Either a single logical, a vector with one logical per sequence, or a list with right and left components containing logicals for the right and left sides of the alignment.
processors: The number of processors to use, or NULL to automatically detect and use all available processors.
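The three accepted forms of fullLength can be sketched as follows. This is a minimal illustration, not a recommendation: the subset of 20 sequences, the choice of which sequences count as full-length, and the particular left/right values are all hypothetical, chosen only to show the shape of each argument.

```r
library(DECIPHER)

# Example sequences shipped with DECIPHER; a small subset keeps the run short
db <- system.file("extdata", "Bacteria_175seqs.sqlite", package="DECIPHER")
dna <- SearchDB(db, remove="all", verbose=FALSE)
alignedDNA <- AlignSeqs(dna[1:20], verbose=FALSE)
n <- length(alignedDNA)

# Form 1: a single logical applied to every sequence
s1 <- StaggerAlignment(alignedDNA, fullLength=TRUE, verbose=FALSE)

# Form 2: one logical per sequence
# (hypothetical: only the first five sequences are treated as full-length)
perSeq <- rep(c(TRUE, FALSE), times=c(5, n - 5))
s2 <- StaggerAlignment(alignedDNA, fullLength=perSeq, verbose=FALSE)

# Form 3: a list with left and right components, one per side of the alignment
# (hypothetical: left ends complete, right ends treated as missing data)
sides <- list(left=TRUE, right=FALSE)
s3 <- StaggerAlignment(alignedDNA, fullLength=sides, verbose=FALSE)
```

Each call returns an alignment of the same sequences; only the treatment of terminal gaps differs.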
An XStringSet of aligned sequences.
StaggerAlignment creates a "staggered alignment", which splits regions of the alignment that are likely not homologous into separate regions. This re-balances the trade-off between true positives and false positives by decreasing the number of false homologies at the cost of some true homologies. The resulting alignment is less aesthetically pleasing because it is widened by the introduction of many gaps. However, in an evolutionary sense a staggered alignment is more correct because each aligned position represents a hypothesis about evolutionary events: overlapping characters between any two sequences represent positions common to their ancestor sequence that may have evolved through substitution.

The single parameter threshold controls the degree of staggering. Its value represents the ratio of insertions to deletions that must be crossed in order to stagger a region. A threshold of 1 would mean any region that could be better explained by separate insertions than deletions should be staggered. A higher value for threshold makes it more likely to stagger, and vice versa. A very high value would conservatively stagger most regions with gaps, resulting in few false homologies but also fewer true homologies. The default value (3) is intended to remove more false homologies than the number of true homologies it eliminates. It may be preferable to tailor the threshold to the purpose of the alignment, as some downstream procedures (such as tree building) may be more or less sensitive to false homologies.
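One rough way to see the effect of threshold is to stagger the same alignment at several values and compare the resulting alignment widths, since staggering widens the alignment by introducing gaps. The sketch below assumes the example database shipped with DECIPHER; the subset of 20 sequences and the threshold values 1, 3, and 10 are illustrative choices, and alignment width is only a crude proxy for the amount of staggering.

```r
library(DECIPHER)

# Example sequences shipped with DECIPHER; a small subset keeps the run short
db <- system.file("extdata", "Bacteria_175seqs.sqlite", package="DECIPHER")
dna <- SearchDB(db, remove="all", verbose=FALSE)
alignedDNA <- AlignSeqs(dna[1:20], verbose=FALSE)

# Stagger the same alignment at several thresholds (3 is the default)
thresholds <- c(1, 3, 10)
widths <- sapply(thresholds, function(t) {
  staggered <- StaggerAlignment(alignedDNA, threshold=t, verbose=FALSE)
  unique(width(staggered)) # all sequences in an alignment share one width
})
names(widths) <- thresholds

# Show the unstaggered width next to the staggered widths
c(original=unique(width(alignedDNA)), widths)
```

Which threshold is appropriate depends on the downstream use, so a comparison like this is best paired with a check of the analysis that consumes the alignment.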
See also: AdjustAlignment, AlignSeqs, IdClusters
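library(DECIPHER)
# locate the example sequence database shipped with DECIPHER
db <- system.file("extdata", "Bacteria_175seqs.sqlite", package="DECIPHER")
dna <- SearchDB(db, remove="all") # load the sequences with gaps removed
alignedDNA <- AlignSeqs(dna) # build the initial alignment
staggerDNA <- StaggerAlignment(alignedDNA) # stagger using the default threshold
BrowseSeqs(staggerDNA, highlight=1) # view in a web browser, highlighted relative to sequence 1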