dbFD
implements a flexible distance-based framework to compute multidimensional functional diversity (FD) indices. dbFD
returns the three FD indices of Villéger et al. (2008): functional richness (FRic), functional evenness (FEve), and functional divergence (FDiv), as well as functional dispersion (FDis; Laliberté and Legendre 2010), Rao's quadratic entropy (Q) (Botta-Dukát 2005), a posteriori functional group richness (FGR) (Petchey and Gaston 2006), and the community-level weighted means of trait values (CWM; e.g. Lavorel et al. 2008). Some of these FD indices consider species abundances. dbFD
includes several options for flexibility.
dbFD(x, a, w, w.abun = TRUE, stand.x = TRUE,
ord = c("podani", "metric"), asym.bin = NULL,
corr = c("sqrt", "cailliez", "lingoes", "none"),
calc.FRic = TRUE, m = "max", stand.FRic = FALSE,
scale.RaoQ = FALSE, calc.FGR = FALSE, clust.type = "ward",
km.inf.gr = 2, km.sup.gr = nrow(x) - 1, km.iter = 100,
km.crit = c("calinski", "ssi"), calc.CWM = TRUE,
CWM.type = c("dom", "all"), calc.FDiv = TRUE, dist.bin = 2,
print.pco = FALSE, messages = TRUE)
matrix or data frame of functional traits. Traits can be numeric
, ordered
, or factor
. Binary traits should be numeric
and only contain 0 and 1. character
traits will be converted to factor
. NA
s are tolerated.
x
can also be a species-by-species distance matrix of class dist
, in which case NAs
are not allowed.
When there is only one trait, x
can also be a numeric
vector, an ordered
factor, or an unordered factor
.
In all cases, species labels are required.
matrix containing the abundances of the species in x
(or presence-absence, i.e. 0 or 1). Rows are sites and species are columns. Can be missing, in which case dbFD
assumes that there is only one community with equal abundances of all species. NAs
will be replaced by 0.
The number of species (columns) in a
must match the number of species (rows) in x
. In addition, the species labels in a
and x
must be identical and in the same order.
vector listing the weights for the traits in x
. Can be missing, in which case all traits have equal weights.
logical; should FDis, Rao's Q, FEve, FDiv, and CWM be weighted by the relative abundances of the species?
vector listing the asymmetric binary variables in x
. See gowdis
for more details.
character string specifying the correction method to use when the species-by-species distance matrix cannot be represented in a Euclidean space. Options are "sqrt"
, "cailliez"
, "lingoes"
, or "none"
. Can be abbreviated. Default is "sqrt"
. See ‘details’ section.
logical; should FRic be computed?
the number of PCoA axes to keep as ‘traits’ for calculating FRic (when FRic is measured as the convex hull volume) and FDiv. Options are: any integer \(>1\), "min"
(maximum number of traits that allows the \(s \geq 2^t\) condition to be met, where \(s\) is the number of species and \(t\) the number of traits), or "max"
(maximum number of axes that allows the \(s > t\) condition to be met). See ‘details’ section.
logical; should FRic be standardized by the ‘global’ FRic that includes all species, so that FRic is constrained between 0 and 1?
logical; should Rao's Q be scaled by its maximal value over all frequency distributions? See divc
.
logical; should FGR be computed?
character string specifying the clustering method to be used to create the dendrogram of species for FGR. Options are "ward"
, "single"
, "complete"
, "average"
, "mcquitty"
, "median"
, "centroid"
, and "kmeans"
. For "kmeans"
, other arguments also apply (km.inf.gr
, km.sup.gr
, km.iter
, and km.crit
). See hclust
and cascadeKM
for more details.
the number of groups for the partition with the smallest number of groups of the cascade (min). Only applies if calc.FGR
is TRUE
and clust.type
is "kmeans"
. See cascadeKM
for more details.
the number of groups for the partition with the largest number of groups of the cascade (max). Only applies if calc.FGR
is TRUE
and clust.type
is "kmeans"
. See cascadeKM
for more details.
the number of random starting configurations for each value of \(K\). Only applies if calc.FGR
is TRUE
and clust.type
is "kmeans"
. See cascadeKM
for more details.
criterion used to select the best partition. The default value is "calinski"
(Calinski and Harabasz 1974). The simple structure index "ssi"
is also available. Only applies if calc.FGR
is TRUE
and clust.type
is "kmeans"
. Can be abbreviated. See cascadeKM
for more details.
logical; should the community-level weighted means of trait values (CWM) be calculated? Can be abbreviated. See functcomp
for more details.
character string indicating how nominal, binary and ordinal traits should be handled for CWM. See functcomp
for more details.
logical; should FDiv be computed?
only applies when x
is a single unordered factor
, in which case x
is coded using dummy variables. dist.bin
is an integer between 1 and 10 specifying the appropriate distance measure for binary data. 2 (the default) refers to the simple matching coefficient (Sokal and Michener 1958). See dist.binary
for the other options.
logical; should the eigenvalues and PCoA axes be returned?
logical; should warning messages be printed in the console?
vector listing the number of species in each community
vector listing the number of functionally singular species in each community. If all species are functionally different, sing.sp
will be identical to nbsp
.
vector listing the FRic of each community
quality of the reduced-space representation required to compute FRic and FDiv.
vector listing the FEve of each community
vector listing the FDiv of each community. Only returned if calc.FDiv
is TRUE
.
vector listing the FDis of each community
vector listing the Rao's quadratic entropy (Q) of each community
vector listing the FGR of each community. Only returned if calc.FGR
is TRUE
.
vector specifying functional group membership for each species. Only returned if calc.FGR
is TRUE
.
matrix containing the abundances of each functional group in each community. Only returned if calc.FGR
is TRUE
.
data frame containing the community-level weighted trait means (CWM). Only returned if calc.CWM
is TRUE
.
eigenvalues from the PCoA. Only returned if print.pco
is TRUE
.
PCoA axes. Only returned if print.pco
is TRUE
.
Users often report that dbFD
crashed during their analysis. Generally this occurs under Windows, and is almost always due to the computation of convex hull volumes. Possible solutions are to choose calc.FRic = FALSE
, or to reduce the dimensionality of the trait matrix using the m
argument.
Typical usage is
dbFD(x, a, ...)
If x
is a matrix or a data frame that contains only continuous traits, no NAs
, and no weights are specified (i.e. w
is missing), a species-species Euclidean distance matrix is computed via dist
. Otherwise, a Gower dissimilarity matrix is computed via gowdis
. If x
is a distance matrix, it is taken as is.
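The choice between the two distance computations can be sketched in base R. The helpers `simple_gower` and `choose_distance` below are hypothetical names introduced for illustration; they handle numeric traits only, apply no standardization, and are not the code used by dbFD or gowdis.

```r
# Gower-type dissimilarity for numeric traits only (illustrative sketch):
# range-scaled absolute differences, averaged over the traits observed
# for both species, with optional trait weights.
simple_gower <- function(x, w = rep(1, ncol(x))) {
  x <- as.matrix(x)
  n <- nrow(x)
  rng <- apply(x, 2, function(v) diff(range(v, na.rm = TRUE)))
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      ok <- !is.na(x[i, ]) & !is.na(x[j, ])
      diffs <- abs(x[i, ok] - x[j, ok]) / rng[ok]
      d[i, j] <- sum(w[ok] * diffs) / sum(w[ok])
    }
  }
  as.dist(d)
}

# Euclidean distance only when the data are complete and unweighted;
# otherwise fall back to the Gower-type dissimilarity.
choose_distance <- function(x, w = NULL) {
  if (!anyNA(x) && is.null(w)) {
    dist(x)
  } else {
    simple_gower(x, if (is.null(w)) rep(1, ncol(x)) else w)
  }
}

traits <- matrix(c(1, 2, 4,
                   10, NA, 30),
                 ncol = 2,
                 dimnames = list(c("sp1", "sp2", "sp3"), c("t1", "t2")))

d_gower <- choose_distance(traits)                     # NA present: Gower-type
d_eucl  <- choose_distance(traits[, 1, drop = FALSE])  # complete: Euclidean
```

In this sketch, the pair sp1 and sp2 is compared on t1 only, because sp2 has no value for t2.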
When x
is a single trait, species with NAs
are first excluded to avoid NAs
in the distance matrix. If x
is a single continuous trait (i.e. of class numeric
), a species-species Euclidean distance matrix is computed via dist
. If x
is a single ordinal trait (i.e. of class ordered
), gowdis
is used and argument ord
applies. If x
is a single nominal trait (i.e. an unordered factor
), the trait is converted to dummy variables and a distance matrix is computed via dist.binary
, following argument dist.bin
.
Once the species-species distance matrix is obtained, dbFD
checks whether it is Euclidean. This is done via is.euclid
. PCoA axes corresponding to negative eigenvalues are imaginary axes that cannot be represented in a Euclidean space, but simply ignoring these axes would lead to biased estimations of FD. Hence in dbFD
one of four correction methods is used, following argument corr
. "sqrt"
simply takes the square root of the distances. However, this approach does not always work for all coefficients, in which case dbFD
will stop and tell the user to select another correction method. "cailliez"
refers to the approach described by Cailliez (1983) and is implemented via cailliez
. "lingoes"
refers to the approach described by Lingoes (1971) and is implemented via lingoes
. "none"
creates a distance matrix with only the positive eigenvalues of the Euclidean representation via quasieuclid
. See Legendre and Legendre (1998) and Legendre and Anderson (1999) for more details on these corrections.
Principal coordinates analysis (PCoA) is then performed (via dudi.pco
) on the corrected species-species distance matrix. The resulting PCoA axes are used as the new ‘traits’ to compute the three indices of Villéger et al. (2008): FRic, FEve, and FDiv. For FEve, there is no limit on the number of traits that can be used, so all PCoA axes are used. On the other hand, FRic and FDiv both rely on finding the minimum convex hull that includes all species (Villéger et al. 2008). This requires more species than traits. To circumvent this problem, dbFD
takes only a subset of the PCoA axes as traits via argument m
. This, however, comes at a cost of loss of information. The quality of the resulting reduced-space representation is returned by qual.FRic
, which is computed as described by Legendre and Legendre (1998) and can be interpreted as an \(R^2\)-like ratio.
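The trade-off between the number of axes kept and the quality of the reduced space can be sketched with base R's cmdscale. The data are made up, and the ratio below is the simple form that applies when all eigenvalues are non-negative; it is an assumption that this matches qual.FRic in that case, and the value dbFD reports may be computed differently when negative eigenvalues are present.

```r
# Number of axes permitted for s species under the two rules for 'm'
s <- 10
m_min <- floor(log2(s))  # largest t satisfying s >= 2^t
m_max <- s - 1           # largest t satisfying s > t

# Made-up trait matrix: 6 species, 3 continuous traits
traits <- matrix(c(1, 2, 3, 4, 5, 6,
                   2, 1, 4, 3, 6, 5,
                   0.5, 0.1, 0.9, 0.3, 0.7, 0.2), ncol = 3)

# PCoA on Euclidean distances, keeping only the first two axes
pco <- cmdscale(dist(traits), k = 2, eig = TRUE)
eig <- pco$eig

# R^2-like ratio: share of the total (positive) eigenvalue sum
# that is represented by the two retained axes
quality <- sum(eig[1:2]) / sum(eig[eig > 0])
```

A quality close to 1 means that little information was lost by dropping the remaining axes.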
In dbFD
, FRic is generally measured as the convex hull volume, but when there is only one continuous trait it is measured as the range (or the range of the ranks for an ordinal trait). Conversely, when only nominal and ordinal traits are present, FRic is measured as the number of unique trait value combinations in a community. FEve and FDiv, but not FRic, can account for species relative abundances, as described by Villéger et al. (2008).
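The two simpler definitions of FRic can be sketched directly in base R. The trait values, species names, and abundances below are made up, and the code only illustrates the definitions given above rather than reproducing the internals of dbFD.

```r
# One continuous trait for four species (made-up values)
trait <- c(sp1 = 2.0, sp2 = 3.5, sp3 = 7.0, sp4 = 4.0)

# Two communities; a species is present when its abundance is > 0
abun <- matrix(c(5, 1, 0, 0,
                 0, 2, 3, 1),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("com1", "com2"), names(trait)))

# FRic as the range of the trait among the species present
fric_range <- apply(abun, 1, function(a) diff(range(trait[a > 0])))

# Only nominal traits: FRic as the number of unique trait combinations
nominal <- data.frame(growth = c("tree", "shrub", "tree", "shrub"),
                      n_fix  = c("no", "no", "no", "no"),
                      row.names = names(trait))
fric_combos <- apply(abun, 1, function(a) {
  nrow(unique(nominal[a > 0, , drop = FALSE]))
})
```

Note that abundances enter only through presence or absence, which is consistent with FRic not accounting for relative abundances.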
Functional dispersion (FDis; Lalibert<e9> and Legendre 2010) is computed from the uncorrected species-species distance matrix via fdisp
. Axes with negative eigenvalues are corrected following the approach of Anderson (2006). When all species have equal abundances (i.e. presence-absence data), FDis is simply the average distance to the centroid (i.e. multivariate dispersion) as originally described by Anderson (2006). Multivariate dispersion has been proposed as an index of beta diversity (Anderson et al. 2006), and Laliberté and Legendre (2010) extended it to an FD index. FDis can account for relative abundances by shifting the position of the centroid towards the most abundant species, and then computing a weighted average distance to this new centroid, again using the relative abundances as weights (Laliberté and Legendre 2010). FDis has no upper limit and requires at least two species to be computed. For communities composed of only one species, dbFD
returns an FDis value of 0. FDis is by construction unaffected by species richness; it can be computed from any distance or dissimilarity measure (Anderson et al. 2006), can handle any number and type of traits (including more traits than species), and is not strongly influenced by outliers.
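For the special case of continuous traits and Euclidean distances, the weighted centroid and the weighted mean distance to it can be computed directly in trait space. The function `fdis_euclid` below is a hypothetical helper for this special case only; fdisp works from a distance matrix and handles negative eigenvalues, which this sketch does not.

```r
# FDis for one community, assuming Euclidean distances on raw traits.
# traits: species-by-traits numeric matrix; abun: abundance per species.
fdis_euclid <- function(traits, abun) {
  traits <- as.matrix(traits)
  p <- abun / sum(abun)                  # relative abundances
  centroid <- colSums(traits * p)        # abundance-weighted centroid
  dists <- sqrt(rowSums(sweep(traits, 2, centroid)^2))
  sum(p * dists)                         # weighted mean distance to centroid
}

# Made-up data: three species along a single trait
traits <- matrix(c(0, 2, 4), ncol = 1,
                 dimnames = list(c("sp1", "sp2", "sp3"), "t1"))

fdis_equal  <- fdis_euclid(traits, c(1, 1, 1))  # centroid at 2
fdis_skewed <- fdis_euclid(traits, c(8, 1, 1))  # centroid pulled towards sp1
fdis_single <- fdis_euclid(traits, c(1, 0, 0))  # one species only
```

With equal abundances the result is the plain average distance to the centroid; when sp1 dominates, the centroid shifts towards it and the distant species receive little weight.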
Rao's quadratic entropy (Q) is computed from the uncorrected species-species distance matrix via divc
. See Botta-Dukát (2005) for details. Rao's Q is conceptually similar to FDis, and simulations (via simul.dbFD
) have shown high positive correlations between the two indices (Laliberté and Legendre 2010). Still, one potential advantage of FDis over Rao's Q is that in the unweighted case (i.e. with presence-absence data), it opens possibilities for formal statistical tests for differences in FD between two or more communities through a distance-based test for homogeneity of multivariate dispersions (Anderson 2006); see betadisper
for more details.
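As a rough illustration, one common form of Rao's quadratic entropy is the expected distance between two individuals drawn at random, with each species pair counted once: \(Q = \sum_{i<j} d_{ij} p_i p_j\). The helper `rao_q` below is a hypothetical name implementing that form on made-up data. divc may follow a different convention (for instance regarding squared distances or scaling), so the values are not expected to match its output.

```r
# Rao's Q for one community: sum over species pairs of d_ij * p_i * p_j.
# d: species-by-species distances; abun: abundance per species.
rao_q <- function(d, abun) {
  p <- abun / sum(abun)
  D <- as.matrix(d)
  # The full double sum counts every pair twice, hence the division by 2
  sum(outer(p, p) * D) / 2
}

# Made-up data: three species along a single trait
trait <- c(sp1 = 0, sp2 = 2, sp3 = 4)
d <- dist(trait)

q_equal  <- rao_q(d, c(1, 1, 1))
q_single <- rao_q(d, c(1, 0, 0))
```

Like FDis, this quantity is zero for a community with a single species.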
Functional group richness (FGR) is based on the classification of the species by the user from visual inspection of a dendrogram. Method "kmeans"
is also available by calling cascadeKM
. In that case, the Calinski-Harabasz (1974) criterion or the simple structure index (SSI) can be used to estimate the number of functional groups; see cascadeKM
for more details. FGR returns the number of functional groups per community, as well as the abundance of each group in each community.
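The steps behind FGR (cluster the species, cut the tree into groups, then count and sum per community) can be sketched with base R. The data are made up and the number of groups is fixed at 2 rather than chosen interactively; the method name "ward.D" is what current versions of hclust call the option formerly named "ward". This is an illustration of the idea, not the code of dbFD.

```r
# Made-up single trait for four species
traits <- c(A = 1, B = 1.2, C = 5, D = 5.5)

abun <- matrix(c(2, 3, 0, 0,
                 1, 0, 4, 1),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("com1", "com2"), names(traits)))

# Cluster species and cut the dendrogram into two functional groups
tree <- hclust(dist(traits), method = "ward.D")
groups <- cutree(tree, k = 2)

# Abundance of each functional group in each community
gr_abun <- t(rowsum(t(abun), group = groups))

# FGR: number of functional groups present in each community
fgr <- rowSums(gr_abun > 0)
```

Here species A and B fall in one group and C and D in the other, so com1 contains a single functional group and com2 contains both.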
The community-level weighted mean of trait values (CWM) is an index of functional composition (Lavorel et al. 2008), and is computed via functcomp
. Species with NAs
for a given trait are excluded for that trait.
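For continuous traits, the CWM and the per-trait exclusion of missing values can be sketched with base R's weighted.mean. The data are made up, and the code covers numeric traits only; functcomp also handles other trait types, which are not shown here.

```r
# Made-up traits: sp3 has no height measurement
traits <- data.frame(height = c(10, 20, NA),
                     sla    = c(5, 15, 25),
                     row.names = c("sp1", "sp2", "sp3"))

abun <- matrix(c(1, 1, 2,
                 3, 1, 0),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("com1", "com2"), rownames(traits)))

# Abundance-weighted mean of each trait in each community.
# na.rm = TRUE drops a species (and its weight) only for the trait
# in which its value is missing.
cwm <- t(apply(abun, 1, function(a) {
  sapply(traits, function(v) weighted.mean(v, a, na.rm = TRUE))
}))
```

In com1, sp3 contributes to the mean of sla but not to the mean of height.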
Anderson, M. J. (2006) Distance-based tests for homogeneity of multivariate dispersions. Biometrics 62:245-253.
Anderson, M. J., K. E. Ellingsen and B. H. McArdle (2006) Multivariate dispersion as a measure of beta diversity. Ecology Letters 9:683-693.
Botta-Dukát, Z. (2005) Rao's quadratic entropy as a measure of functional diversity based on multiple traits. Journal of Vegetation Science 16:533-540.
Cailliez, F. (1983) The analytical solution of the additive constant problem. Psychometrika 48:305-310.
Calinski, T. and J. Harabasz (1974) A dendrite method for cluster analysis. Communications in Statistics 3:1-27.
Gower, J. C. (1971) A general coefficient of similarity and some of its properties. Biometrics 27:857-871.
Laliberté, E. and P. Legendre (2010) A distance-based framework for measuring functional diversity from multiple traits. Ecology 91:299-305.
Lavorel, S., K. Grigulis, S. McIntyre, N. S. G. Williams, D. Garden, J. Dorrough, S. Berman, F. Quétier, A. Thebault and A. Bonis (2008) Assessing functional diversity in the field - methodology matters! Functional Ecology 22:134-147.
Legendre, P. and M. J. Anderson (1999) Distance-based redundancy analysis: testing multispecies responses in multifactorial ecological experiments. Ecological Monographs 69:1-24.
Legendre, P. and L. Legendre (1998) Numerical Ecology. 2nd English edition. Amsterdam: Elsevier.
Lingoes, J. C. (1971) Some boundary conditions for a monotone analysis of symmetric matrices. Psychometrika 36:195-203.
Podani, J. (1999) Extending Gower's general coefficient of similarity to ordinal characters. Taxon 48:331-340.
Sokal, R. R. and C. D. Michener (1958) A statistical method for evaluating systematic relationships. The University of Kansas Scientific Bulletin 38:1409-1438.
Villéger, S., N. W. H. Mason and D. Mouillot (2008) New multidimensional functional diversity indices for a multifaceted framework in functional ecology. Ecology 89:2290-2301.
gowdis
, functcomp
, fdisp
, simul.dbFD
, divc
, treedive
, betadisper
# NOT RUN {
# mixed trait types, NA's
ex1 <- dbFD(dummy$trait, dummy$abun)
ex1
# add variable weights
# 'cailliez' correction is used because 'sqrt' does not work
w <- c(1, 5, 3, 2, 5, 2, 6, 1)
ex2 <- dbFD(dummy$trait, dummy$abun, w, corr="cailliez")
# if 'x' is a distance matrix
trait.d <- gowdis(dummy$trait)
ex3 <- dbFD(trait.d, dummy$abun)
ex3
# one numeric trait, one NA
num1 <- dummy$trait[,1] ; names(num1) <- rownames(dummy$trait)
ex4 <- dbFD(num1, dummy$abun)
ex4
# one ordered trait, one NA
ord1 <- dummy$trait[,5] ; names(ord1) <- rownames(dummy$trait)
ex5 <- dbFD(ord1, dummy$abun)
ex5
# one nominal trait, one NA
fac1 <- dummy$trait[,3] ; names(fac1) <- rownames(dummy$trait)
ex6 <- dbFD(fac1, dummy$abun)
ex6
# example with real data from New Zealand short-tussock grasslands
# 'lingoes' correction used because 'sqrt' does not work in that case
ex7 <- dbFD(tussock$trait, tussock$abun, corr = "lingoes")
# }
# NOT RUN {
# calc.FGR = TRUE, 'ward'
ex7 <- dbFD(dummy$trait, dummy$abun, calc.FGR = TRUE)
ex7
# calc.FGR = TRUE, 'kmeans'
ex8 <- dbFD(dummy$trait, dummy$abun, calc.FGR = TRUE,
            clust.type = "kmeans")
ex8
# ward clustering to compute FGR
ex9 <- dbFD(tussock$trait, tussock$abun,
corr = "cailliez", calc.FGR = TRUE, clust.type = "ward")
# choose 'g' for number of groups
# 6 groups seems to make good ecological sense
ex9
# however, the calinski criterion in 'kmeans' suggests
# that 6 groups may not be optimal
ex10 <- dbFD(tussock$trait, tussock$abun, corr = "cailliez",
calc.FGR = TRUE, clust.type = "kmeans", km.sup.gr = 10)
# }