divisive_ftree: Hierarchical divisive clustering of components

Description

We proceed by division, varying the number of functional groups of components from 1 to the number of components. All components are initially gathered into a single, large, trivial functional group. At each step, one of the functional groups is split into two new functional groups: the split retained is the one that minimizes the Residual Sum of Squares (RSS) of the clustering. The process stops when each component is isolated in a singleton, that is, when there are as many clusters as components. As a whole, the process generates a hierarchical divisive tree of component clustering, whose RSS decreases monotonically with the number of functional groups.

Usage

divisive_ftree(fobs, mOccur, xpr, opt.mean, opt.model, opt.nbMax)

Arguments

fobs

a numeric vector. The vector fobs contains the quantitative performances of assemblages.

mOccur

a matrix of occurrence (occurrence of elements). Its first dimension equals length(fobs). Its second dimension equals the number of elements.

xpr

a numeric vector of length length(fobs). The vector xpr contains the weight of each experiment, and its names (names(xpr)) give the labels of the different experiments. The weight of each experiment is used in the computation of the Residual Sum of Squares in the function rss_clustering. The formula used is rss if all experiments have the same weight, and wrss (barycenter of the RSS of each experiment) if experiments have different weights. All assemblages that belong to a given experiment should therefore have the same weight. Each experiment is identified by its name (names(xpr)) and the RSS of each experiment is weighted by the values of xpr. The vector xpr is generated by the function stats::setNames.
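To make the role of xpr concrete, here is a minimal sketch of how such a vector can be built with stats::setNames and how a weighted RSS could be derived from it. The data, the modelled performances fprd and the exact weighting formula (a weighted average of per-experiment RSS) are assumptions made for illustration; the formula actually used by rss_clustering may differ.

```r
# Hypothetical data: six assemblages from two experiments, "A" and "B".
fobs <- c(1.0, 2.0, 3.0, 2.0, 4.0, 6.0)   # observed performances
fprd <- c(1.5, 1.5, 3.0, 3.0, 3.0, 6.0)   # hypothetical modelled performances

# One weight per assemblage, named by experiment label.
# All assemblages of a given experiment share the same weight.
xpr <- stats::setNames(c(1, 1, 1, 3, 3, 3),
                       c("A", "A", "A", "B", "B", "B"))

# Plain RSS, ignoring experiments.
rss <- sum((fobs - fprd)^2)

# Assumed weighted form: weighted average of the RSS of each experiment.
rss_by_xpr <- tapply((fobs - fprd)^2, names(xpr), sum)   # RSS per experiment
w          <- tapply(xpr, names(xpr), function(v) v[1])  # weight per experiment
wrss       <- sum(w * rss_by_xpr) / sum(w)
```

With these hypothetical values, experiment "B" contributes three times as much as experiment "A" to the weighted criterion.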

opt.mean

a character string equal to "amean" or "gmean". Switches to the arithmetic formula if opt.mean = "amean", and to the geometric formula if opt.mean = "gmean".

Modelled performances are computed using the arithmetic mean (opt.mean = "amean") or the geometric mean (opt.mean = "gmean"), according to opt.model.
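The difference between the two options can be sketched as follows. The helper names arith_mean and geom_mean are hypothetical and are not functions of the package; the geometric mean shown here assumes strictly positive performances.

```r
# Hypothetical helpers illustrating the two averaging rules.
arith_mean <- function(x) mean(x)
geom_mean  <- function(x) exp(mean(log(x)))   # requires x > 0

x <- c(1, 4, 16)        # hypothetical performances
am <- arith_mean(x)     # arithmetic mean: (1 + 4 + 16) / 3
gm <- geom_mean(x)      # geometric mean: (1 * 4 * 16)^(1/3)
```

The geometric mean is less sensitive to a few very large performances, which is one possible reason to prefer opt.mean = "gmean" for strongly skewed data.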

opt.model

a character string equal to "bymot" or "byelt". Switches to a simple mean by assembly motif if opt.model = "bymot", and to a linear model with assembly motif if opt.model = "byelt".

If opt.model = "bymot", the modelled performance of an assemblage is the mean performance of all assemblages that share the same assembly motif.

If opt.model = "byelt", modelled performances are the average of the mean performances of assemblages that share the same assembly motif and that contain the same components as the assemblage to predict. This procedure corresponds to a linear model within each assembly motif based on the component occurrence in each assemblage. If no assemblage contains a component belonging to the assemblage to predict, the performance is the mean performance of all assemblages, as in opt.model = "bymot".
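For the opt.model = "bymot" case, a minimal sketch is given below. It assumes that the assembly motif of an assemblage is the combination of functional groups represented by its components; the data and the grouping are hypothetical, and the package's own implementation may be organised differently.

```r
# Hypothetical occurrence matrix: 5 assemblages (rows) x 4 components (columns).
mOccur <- matrix(c(1, 1, 0, 0,
                   1, 0, 0, 0,
                   0, 0, 1, 1,
                   0, 1, 1, 0,
                   1, 0, 0, 1),
                 nrow = 5, byrow = TRUE,
                 dimnames = list(NULL, c("a", "b", "c", "d")))
fobs   <- c(2, 4, 10, 7, 5)              # hypothetical observed performances
groups <- c(a = 1, b = 1, c = 2, d = 2)  # hypothetical functional groups

# Assumed definition: motif = set of functional groups present in the assemblage.
motif <- apply(mOccur, 1, function(row)
  paste(sort(unique(groups[row == 1])), collapse = "+"))

# "bymot": each assemblage is predicted by the mean of its motif.
fprd <- ave(fobs, motif)
```

Here the assemblages fall into three motifs ("1", "2" and "1+2"), and every assemblage of a motif receives the same modelled performance.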

opt.nbMax

an integer, between 1 and nbElt (the number of components), that indicates the last level of the hierarchical tree to compute. This option is very useful to shorten computing time in the test functions ftest_components, ftest_assemblages, ftest_performances, fboot_assemblages, fboot_performances or ftest, where the function fit_ftree is run many times.

Value

Returns an object "tree", that is a list containing (i) tree$aff: an integer square matrix of component assignment to functional groups, (ii) tree$cor: a numeric vector of coefficients of determination.

Details

At each hierarchical level of the divisive tree, the division of the existing functional groups into new functional groups proceeds as follows. Each existing functional group is successively split into two new functional groups. To do that, each component of the functional group is in turn isolated in a singleton: the singleton component that minimizes RSS is selected as the nucleus of the new functional group. Each of the other components belonging to the existing functional group is then successively moved to the new functional group: the component clustering that minimizes RSS is kept. Moving components into the new functional group continues as long as the new component clustering decreases RSS.
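The splitting of a single functional group can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the helper names rss_bymot and split_group are hypothetical, the RSS is the unweighted one, performances are modelled as in opt.model = "bymot" with an arithmetic mean, and the motif of an assemblage is assumed to be the combination of functional groups it contains.

```r
# Unweighted RSS of a clustering, with "bymot" modelled performances.
rss_bymot <- function(groups, mOccur, fobs) {
  motif <- apply(mOccur, 1, function(row)
    paste(sort(unique(groups[row == 1])), collapse = "+"))
  sum((fobs - ave(fobs, motif))^2)
}

# Greedily split group g into g and a new group (hypothetical helper).
split_group <- function(groups, g, mOccur, fobs) {
  members <- which(groups == g)
  if (length(members) < 2) return(NULL)        # a singleton cannot be split
  new_g <- max(groups) + 1

  # Step 1: try each component as a singleton; keep the best as nucleus.
  rss_nucleus <- sapply(members, function(i) {
    trial <- groups; trial[i] <- new_g
    rss_bymot(trial, mOccur, fobs)
  })
  best <- groups
  best[members[which.min(rss_nucleus)]] <- new_g
  best_rss <- min(rss_nucleus)

  # Step 2: move further components to the new group while RSS decreases.
  repeat {
    remaining <- which(best == g)
    if (length(remaining) < 2) break           # keep the old group non-empty
    rss_move <- sapply(remaining, function(i) {
      trial <- best; trial[i] <- new_g
      rss_bymot(trial, mOccur, fobs)
    })
    if (min(rss_move) >= best_rss) break       # no further improvement
    best[remaining[which.min(rss_move)]] <- new_g
    best_rss <- min(rss_move)
  }
  list(groups = best, rss = best_rss)
}

# Hypothetical data: 5 assemblages x 4 components, all in one group.
mOccur <- matrix(c(1, 1, 0, 0,
                   1, 0, 0, 0,
                   0, 0, 1, 1,
                   0, 1, 1, 0,
                   1, 0, 0, 1), nrow = 5, byrow = TRUE)
fobs <- c(2, 4, 10, 7, 5)
res  <- split_group(c(1, 1, 1, 1), g = 1, mOccur = mOccur, fobs = fobs)
# res$groups is 2 2 1 1 and res$rss is 4 (down from 37.2 with a single group)
```

In the full procedure described above, such a split would be attempted for every existing functional group, and only the split that gives the lowest RSS would be retained at that level of the tree.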