getClans(tree)
getClips(tree, all=TRUE)
getSlices(tree)
getDiversity(tree, x, norm=TRUE, var.names = NULL)
Slices are defined by a pair of splits or tripartitions, which are not clans. The number of distinguishable slices for a binary tree with $n$ tips is $2n^2 - 10n + 12$. A clip is a different type of partition: it defines groups of leaves that are related in terms of evolutionary distance and not only topology. Namely, clips are groups of leaves for which all pairwise path-length distances are smaller than a given threshold value (Lapointe et al. 2010). Different thresholds yield different numbers of clips, the largest (and trivial) clip being the whole tree. There is always a clip containing only the two leaves with the smallest pairwise distance.
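The two definitions above can be illustrated with a small base R sketch. It evaluates the slice-count formula and checks the distance criterion for a candidate group of leaves. The helper names `n_slices` and `is_within_threshold` and the toy distance matrix are illustrative assumptions, not part of phangorn, and the sketch does not reproduce what `getClips` does internally.

```r
# Number of distinguishable slices for a binary tree with n tips,
# using the formula stated above: 2n^2 - 10n + 12.
n_slices <- function(n) 2 * n^2 - 10 * n + 12

# Hypothetical helper: TRUE if all pairwise distances among `leaves`
# are smaller than `threshold` (the distance criterion for clips).
is_within_threshold <- function(D, leaves, threshold) {
  sub <- D[leaves, leaves, drop = FALSE]
  all(sub[upper.tri(sub)] < threshold)
}

# Toy symmetric matrix of path-length distances between four leaves
# (made-up values, chosen only for illustration).
D <- matrix(c(0, 1, 4, 5,
              1, 0, 3, 4,
              4, 3, 0, 2,
              5, 4, 2, 0), nrow = 4, byrow = TRUE,
            dimnames = list(paste0("t", 1:4), paste0("t", 1:4)))

n_slices(10)                                                # slice count for 10 tips
is_within_threshold(D, c("t1", "t2"), threshold = 2)        # closest pair of leaves
is_within_threshold(D, c("t1", "t2", "t3"), threshold = 2)  # t3 is too far away
```

In this toy matrix the closest pair (`t1`, `t2`) passes the threshold test, in line with the statement that the two leaves with the smallest pairwise distance always form a clip.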
Clans, slices and clips can be used to characterize how well a vector of categorical characters (natives/intruders) fits on a tree. We will follow the definitions of Lapointe et al. (2010). A complete clan is a clan that contains all leaves of a given state/color, but can also contain leaves of another state/color. A clan is homogeneous if it only contains leaves of one state/color.
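The difference between "complete" and "homogeneous" can be made concrete with a short base R sketch. The helpers `is_homogeneous` and `is_complete` and the toy state vector are hypothetical. They only test the state conditions on a given group of leaves and do not check whether that group is actually a clan of a tree.

```r
# Hypothetical helpers working on a named vector of tip states.

# Homogeneous: the group contains leaves of only one state.
is_homogeneous <- function(states, members) {
  length(unique(states[members])) == 1
}

# Complete for `state`: the group contains every leaf with that state
# (it may contain leaves of the other state as well).
is_complete <- function(states, members, state) {
  all(names(states)[states == state] %in% members)
}

# Toy assignment of five tips to the two supported states.
states <- c(t1 = "native", t2 = "native", t3 = "intruder",
            t4 = "intruder", t5 = "native")

small <- c("t3", "t4")        # exactly the intruders
large <- c("t2", "t3", "t4")  # the intruders plus one native

is_homogeneous(states, small)           # TRUE: intruders only
is_complete(states, small, "intruder")  # TRUE: all intruders included
is_homogeneous(states, large)           # FALSE: mixed states
is_complete(states, large, "intruder")  # TRUE: still contains all intruders
```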
getDiversity computes either the Shannon Diversity: $H = -\sum_{i=1}^{k} (N_i/N) \log(N_i/N)$, with $N = \sum_{i=1}^{k} N_i$, or the Equitability Index: $E = H / \log(N)$, where $N_i$ are the sizes of the $k$ largest homogeneous clans of intruders. If the categories of the data can be separated by an edge of the tree, then the E-value will be zero, and maximum equitability (E=1) is reached if all intruders are in separate clans. getDiversity computes these intruder indices for the whole tree, complete clans and complete slices. Additionally, the parsimony scores (p-scores) are reported. The p-score indicates whether the leaves contain only one color (p-score=0), whether the leaves can be separated by a single split (perfect clan, p-score=1) or by a pair of splits (perfect slice, p-score=2).
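The two indices can be worked through by hand with a few lines of base R. This is a minimal sketch of the formulas as written above, not the code used by `getDiversity`. The function names `shannon_H` and `equitability_E` are hypothetical, and returning 0 when $N = 1$ (where $H / \log(N)$ would be 0/0) is an assumption made here for convenience.

```r
# Shannon diversity H from a vector of homogeneous intruder clan sizes N_i.
shannon_H <- function(sizes) {
  p <- sizes / sum(sizes)  # N_i / N
  -sum(p * log(p))
}

# Equitability index E = H / log(N), with N the total number of intruders.
# Assumption: return 0 for N = 1, where the ratio would be 0/0.
equitability_E <- function(sizes) {
  N <- sum(sizes)
  if (N <= 1) return(0)
  shannon_H(sizes) / log(N)
}

# All four intruders sit in one homogeneous clan (separable by one edge).
equitability_E(4)

# Every intruder sits in its own clan: maximum equitability.
equitability_E(c(1, 1, 1, 1))

# Intermediate case: one clan of three intruders and one singleton.
equitability_E(c(3, 1))
```

The first two calls reproduce the limiting cases described above: a single clan holding all intruders gives E = 0, and intruders that are all in separate clans give E = 1.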
So far only two states are supported (native, intruder). However, it is also possible to recode several states into the native or intruder state using contrasts; for details see section 2 in vignette("phangorn-specials"). Furthermore, unknown character states are coded as ambiguous characters, which can act either as native or as intruder, minimizing the number of clans or changes (in parsimony analysis) needed to describe a tree for given data.
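A rough sketch of such a recoding is given below, assuming a contrast matrix in the style described in vignette("phangorn-specials"): rows are the observed states, columns are the two levels, and an ambiguous state has a 1 in both columns. The state names, the column labels, the column order (native first, intruder second) and the tiny data matrix are illustrative assumptions. The call to `phyDat` only runs if phangorn is installed.

```r
# Observed states; "green" is recoded as intruder, "violet" is ambiguous.
states <- c("red", "blue", "green", "violet")

# Contrast matrix: one row per observed state, one column per level.
contrast <- matrix(c(1, 0,   # red    -> native
                     0, 1,   # blue   -> intruder
                     0, 1,   # green  -> intruder (recoded)
                     1, 1),  # violet -> ambiguous, may act as either
                   ncol = 2, byrow = TRUE,
                   dimnames = list(states, c("native", "intruder")))
contrast

# Toy data: four tips, one character (made-up values).
X <- matrix(states, ncol = 1,
            dimnames = list(paste0("t", 1:4), "Var_1"))

# Build the phyDat object only when phangorn is available.
if (requireNamespace("phangorn", quietly = TRUE)) {
  x <- phangorn::phyDat(X, type = "USER", contrast = contrast)
  print(x)
}
```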
Wilkinson, M., McInerney, J.O., Hirt, R.P., Foster, P.G., Embley, T.M. (2007) Of clades and clans: terms for phylogenetic relationships in unrooted trees. Trends in Ecology and Evolution 22: 114-115
parsimony, Consistency index CI, Retention index RI, phyDat
library(phangorn)  # also attaches ape, which provides rtree() and rmtree()
set.seed(111)
tree = rtree(10)
getClans(tree)
getClips(tree, all=TRUE)
getSlices(tree)
set.seed(123)
trees = rmtree(10, 20)
X = matrix(sample(c("red", "blue", "violet"), 100, TRUE, c(.5, .4, .1)), ncol=5,
    dimnames=list(paste('t', 1:20, sep=""), paste('Var', 1:5, sep="_")))
# "violet" is treated as an ambiguous state
x = phyDat(X, type = "USER", levels = c("red", "blue"), ambiguity="violet")
plot(trees[[1]], "u", tip.color = X[trees[[1]]$tip.label, 1]) # intruders are blue
getDiversity(trees, x, var.names=colnames(X))