Fits a primary tree of component clustering to observed assemblage performances, prunes the primary tree for its predictive ability and its parsimony, then retains a validated secondary tree together with the corresponding predictions, statistics and other information.
fclust(dat, nbElt,
       weight = rep(1, dim(dat)[2] - nbElt - 1),
       opt.na = FALSE,
       opt.repeat = FALSE,
       opt.method = "divisive",
       affectElt = rep(1, nbElt),
       opt.mean = "amean",
       opt.model = "byelt",
       opt.jack = FALSE, jack = c(3, 4))
dat
: a data.frame or matrix that brings together:
a vector of assemblage identities,
a matrix of occurrence of components within the system,
one or more vectors of observed performances.
Consequently, the dimensions of the data.frame or matrix are:
dim(dat)[1] = the number of observed assemblages,
dim(dat)[2] = 1 + the number of system components + the number of observed performances.
The first line (colnames) contains: the assemblage identity,
the list of components identified by their names,
the list of performances identified by their names.
Each following line (one line per assemblage) contains:
the name of the assemblage (read as character),
a sequence of 0 (absence) and 1 (presence), one value per component
(this is the matrix of occurrence of components within the system),
a sequence of numeric values, one for each observed performance
(this is the set of observed performances).
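As an illustration of this layout, the following sketch builds a small data.frame by hand. The component names (sp1 to sp3), the performance names (biomass, cover) and all values are invented for the example; only the column order follows the description above.

```r
# Toy data set (made-up values): 4 assemblages, 3 components, 2 performances.
nbElt <- 3

dat <- data.frame(
  assemblage = c("A1", "A2", "A3", "A4"),  # assemblage identity
  sp1 = c(1, 0, 1, 1),                     # occurrence of component 1
  sp2 = c(0, 1, 1, 1),                     # occurrence of component 2
  sp3 = c(0, 0, 0, 1),                     # occurrence of component 3
  biomass = c(2.1, 1.4, 3.0, 3.8),         # observed performance 1
  cover   = c(0.30, 0.22, 0.41, 0.55),     # observed performance 2
  stringsAsFactors = FALSE
)

# Columns 2 .. nbElt + 1 hold the matrix of occurrence,
# the remaining columns hold the observed performances.
mOccur <- as.matrix(dat[, 1 + seq_len(nbElt)])
fobs   <- as.matrix(dat[, (nbElt + 2):ncol(dat)])
nbXpr  <- ncol(dat) - nbElt - 1            # number of observed performances

dim(dat)
```

A data set built this way could then be passed as `fclust(dat, nbElt)`.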
nbElt
: an integer that specifies the number of components
belonging to the interactive system.
nbElt determines the dimensions of the matrix of occurrence.
weight
: a numeric vector
that specifies the weight of each performance.
By default, each performance is equally weighted.
If weight is specified, it must have the same length
as the number of observed performances.
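The default value of weight shown in the usage can be reproduced by hand. In this sketch the dimensions and the custom weights are made up, and the variable names are illustrative; how the weights enter the computation is not described here and is therefore not shown.

```r
nbElt <- 3    # number of components (illustrative)
nbCol <- 6    # dim(dat)[2]: 1 identity column + 3 components + 2 performances

nbXpr <- nbCol - nbElt - 1           # number of observed performances
weight_default <- rep(1, nbXpr)      # same expression as the default argument

# A hypothetical custom weighting: the first performance counts twice as much.
weight_custom <- c(2, 1)

# The length must match the number of observed performances.
stopifnot(length(weight_custom) == nbXpr)
```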
opt.na
: a logical.
The records for an assemblage can contain NA
in the matrix of occurrence or in the observed assemblage performances.
If opt.na = FALSE (by default), an error is returned.
If opt.na = TRUE, the records with NA are ignored.
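The effect of opt.na = TRUE can be pictured with base R. This is only a sketch of the behaviour described above, not the package's internal code; the data are invented.

```r
# Toy data with missing values (made-up): 3 assemblages, 2 components, 1 performance.
dat <- data.frame(
  assemblage = c("A1", "A2", "A3"),
  sp1   = c(1, NA, 1),     # NA in the matrix of occurrence
  sp2   = c(0, 1, 1),
  yield = c(2.0, 1.5, NA), # NA in the observed performance
  stringsAsFactors = FALSE
)

# Flag the records that contain at least one NA outside the identity column.
has_na <- !complete.cases(dat[, -1])
if (any(has_na)) {
  message("Records with NA: ", paste(dat$assemblage[has_na], collapse = ", "))
}

# Ignoring these records leaves only the complete assemblages.
dat_clean <- dat[!has_na, , drop = FALSE]
```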
opt.repeat
: a logical.
In any case, the function looks for
different assemblages with identical elemental composition.
Messages report these identical assemblages.
If opt.repeat = FALSE (by default),
their performances are averaged.
If opt.repeat = TRUE, nothing is done,
and the data are processed as they are.
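Detecting and averaging repeated compositions can be sketched in base R as follows. The data and the helper variable key are invented for the illustration, and the package may implement this step differently.

```r
nbElt <- 2

# A1 and A2 are different assemblages with the same composition (sp1 only).
dat <- data.frame(
  assemblage = c("A1", "A2", "A3"),
  sp1   = c(1, 1, 0),
  sp2   = c(0, 0, 1),
  yield = c(2.0, 3.0, 1.5),
  stringsAsFactors = FALSE
)

# Encode each composition as a string of 0/1, e.g. "10" for sp1 alone.
key <- apply(dat[, 1 + seq_len(nbElt)], 1, paste, collapse = "")

# Compositions observed more than once.
dup_keys <- unique(unname(key[duplicated(key)]))

# With opt.repeat = FALSE, performances of identical compositions are averaged.
avg <- tapply(dat$yield, key, mean)
avg[["10"]]   # mean of 2.0 and 3.0
```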
opt.method
: a string that specifies the method to use:
opt.method = c("divisive", "agglomerative", "apriori").
The three methods generate hierarchical trees.
Each tree is complete, running from a unique trunk
to as many leaves as components.
If opt.method = "divisive", the components are clustered
by using a divisive method,
from the trivial clustering where all components are together,
towards the clustering where each component is a cluster.
This method gives the best results for several reasons,
explained in detail in the accompanying vignettes (see "The options of fclust").
If opt.method = "agglomerative", the components are clustered
by using an agglomerative method,
from the trivial clustering where each component is a cluster,
towards the clustering where all components are brought together.
If not all possible assemblages are observed
(which is generally the case in practice),
the first clustering of a few components can have no effect
on the convergence criterion, inducing a non-optimal result.
If opt.method = "apriori", the user knows and provides
an "a priori" partitioning of the system components under study.
The partition is arbitrary, with any number of clusters of components,
but it must be specified (see the option affectElt below).
The tree is then built:
(i) by using opt.method = "divisive"
from the defined component clustering towards as many leaves as components;
(ii) by using opt.method = "agglomerative"
from the defined component clustering towards the trunk of the tree.
affectElt
: a vector of characters or integers,
as long as the number of components nbElt,
that indicates the label of the functional cluster
to which each component belongs.
Each functional cluster is labelled by a character or an integer, and
each component must be identified by its name in names(affectElt).
The number of functional clusters defined in affectElt
determines an a priori level of component clustering
(level <- length(unique(affectElt))).
If affectElt = NULL (by default),
the option opt.method must be specified.
If affectElt is specified,
the option opt.method switches to "apriori".
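A minimal sketch of an a priori partition, assuming four components named sp1 to sp4 and three invented cluster labels. It only checks the constraints stated above (length and names) and computes the a priori level.

```r
nbElt <- 4

# Named vector: names are the component names, values are the cluster labels.
# Component and cluster names are made up for the example.
affectElt <- c(sp1 = "grass", sp2 = "grass", sp3 = "legume", sp4 = "forb")

# Constraints stated in the description of affectElt.
stopifnot(length(affectElt) == nbElt, !is.null(names(affectElt)))

# A priori level of component clustering.
level <- length(unique(affectElt))

# Components grouped by functional cluster.
split(names(affectElt), affectElt)

# With the package loaded and a matching data set `dat`, the call would be:
# res <- fclust(dat, nbElt, opt.method = "apriori", affectElt = affectElt)
```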
opt.mean
: a character, equal to "amean" or "gmean".
If opt.mean = "amean",
means are computed using an arithmetic formula;
if opt.mean = "gmean",
means are computed using a geometric formula.
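The two formulas differ as follows. The helper functions below are illustrative stand-ins written for this example, not the package's own functions; the geometric mean as written assumes strictly positive values.

```r
# Arithmetic mean: sum of the values divided by their number.
amean <- function(x) sum(x) / length(x)

# Geometric mean: exponential of the mean of the logarithms
# (defined here for strictly positive values only).
gmean <- function(x) exp(mean(log(x)))

x <- c(1, 4, 16)
amean(x)   # 7
gmean(x)   # 4, up to floating-point rounding
```

The geometric mean is less sensitive to a few very large performances, which can matter when performances span several orders of magnitude.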
opt.model
: a character, equal to "bymot" or "byelt".
If opt.model = "bymot",
the modelled performances are the means
of performances of all assemblages
that share the same assembly motif.
If opt.model = "byelt",
the modelled performances are the average
of mean performances of assemblages
that share the same assembly motif
and that contain the same components
as the assemblage to predict.
This procedure corresponds to a linear model within each assembly motif
based on the component occurrence in each assemblage.
If no assemblage contains a component belonging to the assemblage to predict,
the performance is the mean performance of all assemblages,
as in opt.model = "bymot".
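The difference between the two models can be sketched on a toy example. Two assumptions are made here and are not stated on this page: an assembly motif is taken to be the set of functional clusters present in an assemblage, and "byelt" is read as "for each component of the assemblage, average the performances of the same-motif assemblages that contain it, then average over components". All data are invented.

```r
# Occurrence matrix (5 assemblages x 3 components) and one performance.
mOccur <- rbind(A1 = c(1, 0, 0), A2 = c(0, 1, 0), A3 = c(1, 0, 1),
                A4 = c(0, 1, 1), A5 = c(1, 1, 1))
colnames(mOccur) <- c("sp1", "sp2", "sp3")
fobs <- c(A1 = 2, A2 = 4, A3 = 5, A4 = 7, A5 = 9)

# Assumed clustering of components into two functional clusters.
affectElt <- c(sp1 = "a", sp2 = "a", sp3 = "b")
clusters  <- sort(unique(affectElt))

# Assumed definition: motif = combination of clusters present in the assemblage.
motif <- apply(mOccur, 1, function(occ) {
  present <- unique(affectElt[occ == 1])
  paste(clusters[clusters %in% present], collapse = "+")
})

# "bymot": mean performance of all assemblages sharing the motif.
bymot <- ave(fobs, motif)

# "byelt" (one reading): average, over the components of the assemblage,
# of the mean performance of same-motif assemblages containing that component.
byelt <- sapply(names(fobs), function(a) {
  same_motif <- names(motif)[motif == motif[[a]]]
  elts <- colnames(mOccur)[mOccur[a, ] == 1]
  elt_means <- sapply(elts, function(e) {
    holders <- same_motif[mOccur[same_motif, e] == 1]
    mean(fobs[holders])
  })
  mean(elt_means)
})

rbind(bymot, byelt)
```

In this toy case A3, A4 and A5 share one motif, so "bymot" gives them the same value, whereas "byelt" separates them according to the components they contain.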
opt.jack
: a logical that switches the cross-validation method.
If opt.jack = FALSE (by default), a leave-one-out method is used:
predicted performances are computed
as the mean of performances of assemblages
that share the same assembly motif,
experiment by experiment,
excluding the assemblage to predict.
If opt.jack = TRUE, a jackknife method is used:
the set of assemblages belonging to the same assembly motif is divided
into jack[2] subsets of jack[1] assemblages.
Predicted performances of each subset of jack[1] assemblages
are computed, experiment by experiment,
by using the other (jack[2] - 1) subsets of assemblages.
If the total number of assemblages belonging
to the assembly motif is lower than jack[1]*jack[2],
predictions are computed by the leave-one-out method.
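The leave-one-out prediction within a motif can be sketched as follows, for a single performance and the simple "mean by motif" model. The motif labels and performances are invented, and the code is an illustration of the principle rather than the package's implementation.

```r
# One observed performance and a motif label per assemblage (made-up values).
fobs  <- c(A1 = 2, A2 = 4, A3 = 5, A4 = 7, A5 = 9)
motif <- c(A1 = "a", A2 = "a", A3 = "a+b", A4 = "a+b", A5 = "a+b")

# Predict each assemblage from the other assemblages of the same motif.
loo <- sapply(names(fobs), function(a) {
  others <- setdiff(names(fobs)[motif == motif[[a]]], a)
  mean(fobs[others])   # NaN if the motif contains this assemblage only
})

loo
```

An assemblage that is alone in its motif has no other assemblage to learn from, so its leave-one-out prediction is undefined in this sketch.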
jack
: an integer vector of length 2.
The vector specifies the parameters of the jackknife method.
The first integer jack[1] specifies the size of each subset,
the second integer jack[2] specifies the number of subsets.
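The role of the two integers can be sketched for one motif containing exactly jack[1]*jack[2] assemblages. How the package assigns assemblages to subsets (order, random draw, handling of extra assemblages) is not specified here; this sketch simply takes them in consecutive order, and the performances are invented.

```r
jack <- c(3, 4)   # 4 subsets of 3 assemblages

# Twelve assemblages of a single motif, with made-up performances 1 .. 12.
fobs <- setNames(as.numeric(1:12), paste0("A", 1:12))
n <- length(fobs)

if (n < jack[1] * jack[2]) {
  # Too few assemblages: fall back to leave-one-out.
  pred <- (sum(fobs) - fobs) / (n - 1)
} else {
  # Subset index of each assemblage: 1,1,1,2,2,2,3,3,3,4,4,4.
  grp <- rep(seq_len(jack[2]), each = jack[1], length.out = n)
  # Each assemblage is predicted from the other (jack[2] - 1) subsets.
  pred <- sapply(seq_len(n), function(i) mean(fobs[grp != grp[i]]))
  names(pred) <- names(fobs)
}

pred
```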
Returns a list containing the primary tree of component clustering, the predictions of assemblage performances and the statistics computed by using the primary and secondary trees of component clustering.
Recall of inputs:
nbElt, nbAss, nbXpr
: the number of components that belong to the interactive system,
the number of assemblages and the number of performances observed,
respectively.
opt.method, opt.mean, opt.model, opt.jack, jack, opt.na, opt.repeat, affectElt
: the options used for computing the resulting clustering trees.
fobs, mOccur, xpr
: the vector or matrix of observed performances of assemblages,
the binary matrix of occurrence of components, and
the vector of weights of the different performances,
respectively.
Primary and secondary (fitted and validated) trees of component clustering, and associated statistics:
tree.I, tree.II, nbOpt
: the primary tree of component clustering,
the validated secondary tree of component clustering,
and the optimum number of functional clusters,
respectively.
A tree is a list made of a square matrix of dimensions
nbLev * nbElt (with nbLev = nbElt),
and of a vector of coefficients of determination (of length nbLev).
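To make the shape of such a tree concrete, here is a hand-written nbLev * nbElt matrix for four components. It assumes that row k holds the cluster label of each component when the components are split into k clusters; this reading, the labels and the object names are assumptions made for the illustration.

```r
nbElt <- 4

# Row k: cluster label of each component at the level with k clusters
# (hypothetical labels, written by hand).
tree_mat <- rbind(
  c(1, 1, 1, 1),   # trunk: all components together
  c(1, 1, 2, 2),
  c(1, 3, 2, 2),
  c(1, 3, 2, 4)    # leaves: one cluster per component
)
dimnames(tree_mat) <- list(seq_len(nbElt), paste0("sp", seq_len(nbElt)))

# A complete tree has k distinct clusters at level k.
n_clusters <- apply(tree_mat, 1, function(lab) length(unique(lab)))

# A hierarchical tree is nested: each cluster of a finer level
# lies inside a single cluster of the coarser level above it.
is_refinement <- function(coarse, fine) {
  all(tapply(coarse, fine, function(v) length(unique(v)) == 1))
}
nested <- vapply(2:nbElt, function(k) {
  is_refinement(tree_mat[k - 1, ], tree_mat[k, ])
}, logical(1))
```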
mCal, mPrd, tCal, tPrd
: the numeric matrices of modelled values
and of values predicted by cross-validation,
using the primary tree (mCal and mPrd)
or the secondary tree (tCal and tPrd), respectively.
All matrices have the same dimensions nbLev * nbAss.
rownames contains the number of component clusters,
that is from 1 to nbElt clusters.
colnames contains the names of assemblages.
mMotifs, tNbcl
: the matrix
of assignment of assemblages to the different assembly motifs,
coded as integers, and the matrix of the last tree levels
used for predicting assemblage performances.
Both matrices have the same dimensions nbLev * nbAss.
rownames contains the number of component clusters,
that is from 1 to nbElt clusters.
colnames contains the names of assemblages.
mStats, tStats
: the matrices of associated statistics.
rownames contains the number of component clusters,
that is from 1 to nbElt clusters.
colnames = c("missing", "R2cal", "R2prd", "AIC", "AICc").
See the vignette "The options of fclust".
Jaillard, B., Richon, C., Deleporte, P., Loreau, M. and Violle, C. (2018) An a posteriori species clustering for quantifying the effects of species interactions on ecosystem functioning. Methods in Ecology and Evolution, 9:704-715. https://doi.org/10.1111/2041-210X.12920.
Jaillard, B., Deleporte, P., Loreau, M. and Violle, C. (2018) A combinatorial analysis using observational data identifies species that govern ecosystem functioning. PLoS ONE 13(8): e0201135. https://doi.org/10.1371/journal.pone.0201135.
fclust
: build a functional clustering,
fclust_plot
: plot the results of a functional clustering,
fclust_write
: save the results of a functional clustering,
fclust_read
: read the results of a functional clustering.
# NOT RUN {
# Enable verbose messages (the previous setting is restored at the end)
oldOption <- getOption("verbose")
if (!oldOption) options(verbose = TRUE)
nbElt <- 16 # number of components
# index = Identity, Occurrence of components, a Performance
index <- c(1, 1 + 1:nbElt, 1 + nbElt + 1)
dat.2004 <- CedarCreek.2004.2006.dat[ , index]
res <- fclust(dat.2004, nbElt)
names(res)
res$tree.II
options(verbose = oldOption)
# }