Takes a character state tree as input and returns the corresponding adjacency matrix.
convert_state_tree_to_adjacency_matrix(state_tree)
A symmetric adjacency matrix indicating which states are linked (value 1) by an edge.
A text string describing a character state tree.
Graeme T. Lloyd graemetlloyd@gmail.com
There are multiple ways to define discrete character states for use in phylogenetic analysis or ancestral state reconstruction in terms of transitions between states. The two most common being "ordered" ((Wagner optimization; Farris, 1970) or "unordered" (Fitch optimization; Fitch, 1971). These can be sonsidered as representing simple Markov models with symmetric transition probabilities, with the only difference being whether any direct state-tostae transition is possible (unordered) or only immediately "adjacent" states can be transitioned too.
In practice, however, there may be reason to believe in more complex transition models. For an example, see various characters in Hooker (2014). If these are still symmetric (i.e., going from state X to state Y is just as probable as going from state Y to state X) then they can be expressed using a "character state tree". For example, the model:
0-1-2
|
3
This model has four states (labelled 0-3). However, it is clearly not an ordered model:
0-1-2-3
Where only adjacent state transitions are possible, e.g., to get from state 0 to state 2 requires passing through state 1.
Nor is it an unordered model:
0---1
|\ /|
| X |
|/ \|
2---3
Where any state-to-state transition is possible.
In other words we cannot simply label this model as "ordered" or "unordered". It can, however, be expressed directly as character state tree:
(3,(2)1)0
Superficially these appear very similar to Newick strings (ways or representing bifurcating or multifurcating trees of species). However, the major difference is that with Newick strings we often only label the "tips" (terminal nodes, or vertices of degree one). Here there are no unlabelled "internal nodes", also known as "hypothetical ancestors". Instead, numbers after parentheses indicate the label of that node (vertex).
Thus we can read the above tree as node 2 is only directly connected to node 1, 3 and 1 are directly connected to node zero.
This is a very compact way of representing a Markov model, but unfortunately it is hard to read and difficult to do anything quantitative with. Thus this function takes such text strings and converts them to a more easily usable graph representation: the adjacency matrix.
Adjacency matrices represent graphs by showing links between vertices as values of one in symmetric square matrices. These always have a diagonal of zero (you cannot be adjacent to yourself) and any other vertices that are not directly linked are also coded as zero.
Note that here such matrices are also considered symmetric and hence the diagonal is also a line of reflection.
In practice more complex relationships may be desired, including differential weighting of specific transitions (without additional intermediate states; where going from state X to state Y has a cost other than zero or one), or asymmetric models (where going from state X to state Y has a different "cost" than going from state Y to state X). These are still possible in Claddis, and phylogenetic analysis in general, but require "costmatrices" to define them. (Character state trees are insufficent.)
Thus, costmatrices (or their probabilistic equivalent, Q-matrices) are the only generalisable form of defining any Markov model.
The output from this function can also be represented as a costmatrix with convert_adjacency_matrix_to_costmatrix.
Farris, J. S., 1970. Methods for computing Wagner trees. Systematic Zoology, 19, 83-92.
Fitch, W. M., 1971. Towards defining the course of evolution: minimum change for a specified tree topology. Systematic Zoology, 20, 406-416.
Hooker, J. J., 2014. New postcranial bones of the extinct mammalian family Nyctitheriidae (Paleogene, UK): primitive euarchontans with scansorial locomotion. Palaeontologia Electronica, 17.3.47A, 1-82.
convert_adjacency_matrix_to_costmatrix, locate_bracket_positions
# Convert a simple state tree to an adjacency matrix:
convert_state_tree_to_adjacency_matrix(
state_tree = "(3,(2)1)0"
)
# Convert a more complex state tree to an adjacency matrix:
convert_state_tree_to_adjacency_matrix(
state_tree = "(((5)4)3,(2)1)0"
)
Run the code above in your browser using DataLab