From a set of quartet concordance factors obtained from genetic data (the proportion of loci that truly have a given quartet) and from a guide tree, this function uses a stepwise search to find the best resolution of that guide tree. Any unresolved edge corresponds to ancestral panmixia, on which the coalescent process is assumed.
stepwise.test.tree(cf, guidetree, search="both", method="PLL", kbest=5,
maxiter=100, startT="panmixia", shape.correction=TRUE)
Number of edges kept resolved in the guide tree. Other edges are collapsed to model ancestral panmixia.
Indices of edges kept resolved in the guide tree.
Indices of edges collapsed in the guide tree, to model ancestral panmixia.
estimated \(\alpha\) parameter.
Negative pseudo log-likelihood of the final estimated population tree.
Chi-square statistic, from comparing the counts of outlier p-values (in outlier.table) to the expected counts.
p-value from the chi-square test, obtained by comparing the X2 value to a chi-square distribution with 3 df.
character string. If the chi-square test is significant, this statement says if there is an excess (or deficit) of outlier 4-taxon sets.
Table with 2 rows (observed and expected counts) and 4 columns: number of 4-taxon sets with p-values \(p\leq 0.01\), \(0.01<p\leq 0.05\), \(0.05<p\leq 0.10\) or \(p>0.10\).
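The chi-square test described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the p-values are simulated, the variable names are hypothetical, and the expected counts are assumed to be proportional to the bin widths (0.01, 0.04, 0.05, 0.90), as they would be if outlier p-values were uniform under the null hypothesis.

```r
# Hypothetical outlier p-values, one per 4-taxon set (simulated for illustration).
set.seed(1)
pvals <- runif(500)

# Bins: p <= 0.01, 0.01 < p <= 0.05, 0.05 < p <= 0.10, p > 0.10
breaks <- c(0, 0.01, 0.05, 0.10, 1)
observed <- as.vector(table(cut(pvals, breaks = breaks, include.lowest = TRUE)))

# Assumption: expected counts follow the bin widths under uniform p-values.
expected <- length(pvals) * diff(breaks)

# Table with 2 rows (observed, expected) and 4 columns, as described above.
outlier_table <- rbind(observed, expected)

# Chi-square statistic, compared to a chi-square distribution with 3 df.
X2 <- sum((observed - expected)^2 / expected)
chisq_pval <- pchisq(X2, df = 3, lower.tail = FALSE)
```

A small `chisq_pval` would indicate that the observed counts of outlier 4-taxon sets depart from what is expected; comparing the first columns of `outlier_table` then shows whether there is an excess or a deficit.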
Vector of outlier p-values, with as many entries as there are rows in cf, one for each set of 4 taxa.
Matrix of concordance factors expected from the estimated population tree, with as many rows as in cf (one row for each 4-taxon set) and 3 columns (one for each of the 3 possible quartet trees).
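To make the expected concordance factors concrete, here is a small sketch based on the standard multispecies coalescent result for 4 taxa: if the quartet's internal branch has length t in coalescent units, the quartet matching the tree is expected in a proportion 1 - (2/3)exp(-t) of loci, and each of the two other quartets in a proportion (1/3)exp(-t). The function name `expected_cf` is hypothetical and is not part of the package.

```r
# Expected concordance factors for one 4-taxon set under the coalescent.
# t: internal branch length in coalescent units (t = 0 models panmixia).
expected_cf <- function(t) {
  minor <- exp(-t) / 3
  c(major = 1 - 2 * minor, minor1 = minor, minor2 = minor)
}

expected_cf(0)    # collapsed edge: the three quartets are equally likely
expected_cf(1.5)  # resolved edge: the quartet matching the tree dominates
```

With t = 0, all three values equal 1/3, which is why collapsing an edge of the guide tree amounts to assuming panmixia for the 4-taxon sets it separates.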
data frame containing one row for each 4-taxon set and containing taxon names in columns 1-4, and concordance factors in columns 5-7.
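As an illustration of the layout described above, the following sketch builds a small data frame with one row per 4-taxon set, taxon names in columns 1-4 and concordance factors in columns 5-7. The taxon names, column names and values are all hypothetical; only the column positions follow the description.

```r
# Five hypothetical taxa give choose(5, 4) = 5 four-taxon sets.
taxa <- c("A", "B", "C", "D", "E")
sets <- t(combn(taxa, 4))

cf <- data.frame(
  taxon1 = sets[, 1], taxon2 = sets[, 2],
  taxon3 = sets[, 3], taxon4 = sets[, 4],
  # Made-up concordance factors for the 3 possible quartet trees of each set.
  CF12.34 = c(0.80, 0.40, 0.34, 0.70, 0.50),
  CF13.24 = c(0.10, 0.30, 0.33, 0.15, 0.25),
  CF14.23 = c(0.10, 0.30, 0.33, 0.15, 0.25)
)

# Columns 5-7 are proportions of loci, so each row should sum to 1 here.
rowSums(cf[, 5:7])
```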
tree of class phylo on the same taxon set as in cf, with branch lengths in coalescent units.
one of "both" (stepwise search both forwards and backwards at each step), or "heuristic" (heuristic shallow search: not recommended).
Only "PLL" is implemented. The scoring criterion to rank population trees is the pseudo log-likelihood (ignored if search="heuristic").
Number of candidate population trees to consider at each step for the forward and for the backward phase (separately). Use a lower value for faster but less thorough search.
Maximum number of iterations. One iteration consists of considering multiple candidate population trees, using both a forward step and a backward step.
starting population tree. One of "panmixia", "fulltree", or a numeric vector of edge numbers to keep resolved. The other edges are collapsed for panmixia.
boolean. If true, the shapes of all Dirichlet distributions used to test the adequacy of a population tree are corrected to be greater than or equal to 1. This correction avoids Dirichlet densities that become unbounded near 0 or 1. It is applied both when the \(\alpha\) parameter is estimated and when the outlier p-values are calculated.
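The motivation for keeping shapes at or above 1 can be seen with a Beta distribution, which is the marginal distribution of one component of a Dirichlet. The sketch below only illustrates the boundary behaviour of the density; it does not reproduce how the package computes or applies the correction, and the shape values are arbitrary.

```r
# Density very close to 0 for two choices of the first shape parameter.
x <- 1e-6

dens_small_shape <- dbeta(x, shape1 = 0.5, shape2 = 2)  # shape below 1
dens_unit_shape  <- dbeta(x, shape1 = 1,   shape2 = 2)  # shape equal to 1

# With a shape below 1 the density grows without bound as x approaches 0;
# with a shape of 1 or more it stays bounded.
c(small = dens_small_shape, unit = dens_unit_shape)
```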
Cécile Ané
Stenz, Noah W. M., Bret Larget, David A. Baum and Cécile Ané (2015). Exploring tree-like and non-tree-like patterns using genome sequences: An example using the inbreeding plant species Arabidopsis thaliana (L.) Heynh. Systematic Biology, 64(5):809-823.
test.one.species.tree.
data(quartetCF)
data(guidetree)
resF <- stepwise.test.tree(quartetCF,guidetree,startT="fulltree") # takes ~ 1 min
resF[1:9]