find_shortest_costmatrix_path: Finds the shortest path between two states in a costmatrix

Description

Given a start and end state, returns the shortest path through a costmatrix.

Usage

find_shortest_costmatrix_path(costmatrix, start, end)

Value

A vector of states describing (one) of the optimal path(s) in the order start to end.

Arguments

costmatrix: An object of class costMatrix.
start: The start state for the requested path.
end: The end state of the requested path.

Author

Graeme T. Lloyd graemetlloyd@gmail.com

Details

A common problem in graph theory is identifying the shortest path to take between two vertices of a connected graph. A costmatrix also describes a graph, with transition costs representing edge weights that can be asymmetric (e.g., going from 0 to 1 can have a different weight than going from 1 to 0). Finding the shortest path between states - i.e., the path that minimises total weight (cost) - from a costmatrix has two main applications in Claddis: 1. to check a costmatrix is internally consistent (no cost is misidentified due to a "cheaper" route being available - solving a problem identified in Maddison and Maddison 2003), and 2. to identify the minimum cost a character could have on a tree (an essential aspect of various homoplasy metrics, see Hoyal Cuthill et al. 2010).

The function returns a vector describing (one) shortest (i.e., lowest cost) path between start and end. If the direct path is shortest this will be simply start and end, but if an indirect route is cheaper then other node(s) will appear between these values.

In operation the function is inspired by Dijkstra's algorithm (Dijkstra 1959) but differs in some aspects to deal with the special case of a cladistic-style costmatrix. Essentially multiple paths are considered with the algorithm continuing until either the destination node (end) is reached or the accumulated cost (path length) exceeds the direct cost (meaning the path cannot be more optimal, lower cost, than the direct one).

Note: that because infinite costs are allowed in costmatrices to convey that a particular transition is not allowed these are always likely to be replaced (meanng the path does become possible) unless they apply to entire rows or columns of a costmatrix.

Note: negative costs are not allowed in costmatrices as they will confound the halting criteria of the algorithm. (They also do not make logical sense in a costmatrix anyway!)

Note: if multiple equally optimal solutions are possible, the function will only return one of them. I.e., just because a solution is not presented it cannot be assumed it is suboptimal.

References

Dijkstra, E. W., 1959. A note on two problems in connexion with graphs. Numerische Mathematik, 1, 269–271.

Hoyal Cuthill, J. F., Braddy, S. J. and Donoghue, P. C. J., 2010. A formula for maximum possible steps in multistate characters: isolating matrix parameter effects on measures of evolutionary convergence. Cladistics, 26, 98-102.

Maddison, D. R. and Maddison, W. P., 2003. MacClade 4: Analysis of phylogeny and character evolution. Version 4.06. Sinauer Associates, Sunderland, Massachusetts.

Examples

Run this code


# Make a four-state Dollo costmatrix:
costmatrix <- make_costmatrix(
  min_state = 0,
  max_state = 3,
  character_type = "dollo",
  dollo_penalty = 100
)

# Find the shortest path from state 0 to state 3:
find_shortest_costmatrix_path(
  costmatrix = costmatrix,
  start = "0",
  end = "3"
)

# Show that this is directional by asking for the reverse path:
find_shortest_costmatrix_path(
  costmatrix = costmatrix,
  start = "3",
  end = "0"
)

Run the code above in your browser using DataLab