The function computes and tests Temporal Beta-diversity Indices (TBI) between multivariate observations (frequency or presence-absence data) forming pairs observed at time 1 (T1) and time 2 (T2). The data matrices may contain abundance or presence-absence data, or other types of frequency-like data (e.g. biomasses). TBI are dissimilarity indices that measure beta differentiation through time. The indices are computed between T1 and T2 for each site. The difference between species (or abundances-per-species) gains (C/den) and losses (B/den) can be printed out and tested for significance.
TBI(
mat1,
mat2,
method = "%difference",
pa.tr = FALSE,
nperm = 99,
BCD = TRUE,
replace = FALSE,
test.BC = TRUE,
test.t.perm = FALSE,
save.BC = FALSE,
seed. = NULL,
clock = FALSE
)
Function TBI returns a list containing the following results:
TBI
The vector of Temporal Beta-diversity Indices
(TBI) between observations at times T1 and T2 for each object.
p.TBI
A corresponding vector of p-values. Significant p-values
(e.g. p.TBI <= 0.05) indicate exceptional objects for the difference of
their species composition.
p.adj
The p-values are corrected for multiple testing using
function p.adjust of stats
. The adjustment is done using
method="holm"
, which is the default option of the p.adjust
function.
BCD.mat
An output table with four columns: B/den, C/den,
D=(B+C)/den, and Change. The value den is the denominator of the index,
i.e. (2A+B+C) for the percentage difference index and (A+B+C) for the
Ruzicka index. The decomposition is such that D = B/den + C/den. Columns B
and C indicate which of the D values are associated with large B (losses)
or large C values (gains), before proceeding to the analysis and
interpretation of the D values, using environmental or spatial explanatory
variables, through regression or classification tree analysis. When B > C,
the site has lost species or abundances-per-species between time 1 and time
2; this is indicated by a "-" sign in column Change. On the contrary, if B
< C, the site has gained species or abundances-per-species between time 1
and time 2; this is indicated by a "+" sign in that column. Sites with
equal amounts of losses and gains are marked with a "0". - The B/den and
C/den values can be plotted in B-C plots, which are informative about the
changes that occurred in the data set between the two surveys under study.
- If pa.tr
is TRUE, the B and C components are the numbers of
spepcies losses and gains, and D is either the Sorensen or the Jaccard
dissimilarity. - If BCD=FALSE
, that table is not produced. No table
is (or can be) computed for indices other than the Ruzicka and percentage
difference indices or their binary forms.
BCD.summary
An output table with six columns: mean(B/den);
mean(C/den); mean(D); B/(B+C) (which is mean(B/den) divided by mean(D));
C/(B+C) (which is mean(C/den) divided by mean(D)). These values indicate,
over all sites, which of the processes dominated (loss or gain of species
or abundances-per-species) when site compositions changed between time 1
and time 2. Change has the same meaning as in table BCD.mat
; the
sign indicates the direction of the mean change over all sites.
t.test_B.C
The results of a paired t-test (parametric) of
significance of the difference between columns C/den and B/den of the
BCD.mat
table. If test.t.perm=TRUE
, the difference between
species gains (C/den) and losses (B/den) is also tested in a permutational
paired t-test and the permutational p-value is shown in the output table.
This result provides an overall test of the direction of change over all
sites. It helps confirm the asymmetry between species (or
abundances-per-species) gains (C/den) and species (or
abundances-per-species) losses (B/den). A star in column p<=0.05 indicates
a significant result of the parametric test at the 0.05 level.
BC
An output table with two columns: B and C. In this table,
the B and C statistics are not divided by a denominator, contrary to the
values B/den and C/den found in the output table BCD.mat
.
Two multivariate community composition or gene frequency data matrices (class data.frame or matrix) with the same number of rows and columns. The rows must correspond to the same objects (e.g. sites) and the columns to the same variables, e.g. species or alleles. – The input files are checked for having equal numbers of rows and columns, for rows that are empty in both mat1 and mat2, and for the presence of negative values, which cannot be frequencies.
One of the following dissimilarity coefficients:
"%difference"
(aka Bray-Curtis), "ruzicka"
, "chord"
,
"hellinger"
, "log.chord"
, "sorensen"
,
"jaccard"
, "ochiai"
, "euclidean"
. See Details. Names
can be abbreviated to a non-ambiguous set of first letters. Default:
method="%difference"
.
If pa.tr=TRUE
, the data are transformed to binary (i.e.
presence-absence, or pa) form. If pa.tr=FALSE
, they are not.
Number of permutations for the tests of significance of the
temporal beta indices and the permutation test of the B-C difference. Use
nperm
= 999 or 9999 in real studies.
If BCD=TRUE
, the B and C components of the
percentage difference (method="%difference"
) and Ruzicka
(method"ruzicka"
) indices are computed and presented in an output
table with three columns: B/den, C/den, D=(B+C)/den, where den is the
denominator of the index, i.e. (2A+B+C) for the percentage difference index
and (A+B+C) for the Ruzicka index. See Details and Value. If
pa.tr=TRUE
, the B and C components are the numbers of species lost
or gained, and D is either the Sorensen or the Jaccard dissimilarity. In
the BCD output table, column B contains B/den, C/den, D=(B+C)/den, as in
the case of the percentage difference and Ruzicka indices.
If BCD=FALSE
, that table is not produced. No table can be computed for
indices other than the Ruzicka and percentage difference or their binary
forms.
If FALSE
(default value),
sampling is done without replacement, producing a regular permutation test.
If TRUE
, sampling is done with replacement for the
test of significance; the method is then bootstrapping.
If TRUE
, the difference between
species (or abundances-per-species) gains (C/den) and species (or
abundances-per-species) losses (B/den) is tested in a parametric paired
t-test computed by function t.test
of stats
. If
FALSE
, the test is not computed.
If TRUE
, the difference
between species (or abundances-per-species) gains (C/den) and species (or
abundances-per-species) losses (B/den) is also tested in a permutational
paired t-test computed by function t.paired.perm
.If
FALSE
, the test is not computed.
If TRUE
, the original B and C
values are returned in a matrix called BC
, without division by den
as in the output matrix BCD.mat
. If FALSE
, BC
will get the value NA in the output list.
Seed for random number generator. If NULL
, the random
number generator keeps going from the point it had reached in previous
calculations. If seed.
is an integer value instead of NULL,
the random number generator is reset using that value. This allows users to
repeat exactly a previous calculation launched with the same value of
seed.
; the sequence of generated random numbers will be exactly the
same.
If clock=TRUE
, the computation time is printed. This
option is useful to predict the calculation time when n and
nperm are large.
Pierre Legendre pierre.legendre@umontreal.ca
For each object, the function tests the hypothesis (H0) that a species assemblage is not exceptionally different between T1 and T2, compared to assemblages that could have been observed at this site at T1 and T2 under conditions corresponding to H0. If H0 is rejected, the object is recognized as exceptionally different from the other objects for its difference between T1 and T2.
To fix ideas, an example in palaeoecology - A researcher is studying ancient and modern diatom communities in sediment cores. If a site displays an exceptional difference between T1 and T2, the researcher is justified to examine the reason for that difference. It could, for example, be caused by a change in land use at that site, which has caused the difference to be larger than at the other sites, compared to the differences caused by climate change at all sites.
The temporal beta diversity indices available in this function belong to four groups, computed in different ways.
Method
"%difference"
computes the percentage difference index, erroneously
called the Bray-Curtis index in some software packages ; it is the
quantitative form of the Sorensen index. Method "ruzicka"
computes the
Ruzicka dissimilarity; this is one of the quantitative coefficients
corresponding to the Jaccard dissimilarity for binary data. When these
indices are used to compute ordinations by principal coordinate analysis, it
is recommended to take the square root of the dissimilarities before the
ordination analysis because these indices do not have the property of being
Euclidean. However, that precaution is not important here; the results of
permutation tests will be the same for these dissimilarities, square-rooted
or not. If pa.tr=TRUE
, either the Sorensen or the Jaccard coefficient
are obtained by computing these two coefficients.
Methods
"chord"
(chord distance), "hellinger"
(Hellinger distance) and
"log.chord"
(log.chord distance) are obtained by transformation of the
species data, as described by Legendre & Borcard (2018), followed by
calculation of the Euclidean distance. For the Hellinger distance, the data
are square-rooted, then subjected to the chord transformation and the
Euclidean distance. For the log.chord distance, the data are transformed by
y' = log(y+1) using function log1p() of R, then subjected to the chord
transformation and the Euclidean distance. These three distances have the
Euclidean property (Legendre & Legendre 2012, Legendre & De Caceres 2013). If
pa.tr=TRUE
, the Ochiai distance for binary data,
sqrt(2)*sqrt(1-Ochiai similarity), is obtained from these three coefficients.
Methods "jaccard"
, "sorensen"
, "ochiai"
implement
the Jaccard, Sorensen and Ochiai dissimilarities. For these coefficients, the
data are first transformed to presence-absence (pa.tr
receives the
value TRUE
), then the dissimilarities are computed using the
corresponding quantitative coefficients (Ruzicka, percentage difference, and
chord).
The Euclidean distance is also available in this function. It is not recommended for community composition or allele frequency data. One can compute it for log-transformed abundance data that do not contain zeros, or very few zeros (short gradients).
The temporal beta indices are tested for significance using permutation tests. The hypotheses are the following:
H0: the site under study (e.g. a species assemblage) is not exceptionally different between T1 and T2, compared to assemblages that could have been observed at this site at T1 and T2 under conditions corresponding to H0. The differences between T1 and T2 all belong to the same statistical population of differences.
H1: the site under study is exceptionally different between times T1 and T2.
In the decomposition of the Ruzicka and percentage difference dissimilarities or their presence-absence forms (Jaccard, Sorensen), the components B and C are computed as follows:
bj is the part of the abundance of species j that is higher at time 1 than at time 2: bj = (y1j - y2j) if y1j > y2j ; bj = 0 otherwise. B is the sum of the bj values for all species in the group of species under study. It is the unscaled sum of species losses between time 1 and time 2. In the BCD output table BCD.mat, column 1 contains B/den where den is the denominator of the index, i.e. (2A+B+C) for the percentage difference index and (A+B+C) for the Ruzicka index.
cj is the part of the abundance of species j that is higher at time 2 than at time 1: cj = (y2j - y1j) if y2j > y1j ; cj = 0 otherwise. C is the sum of the cj values for all species in the group of species under study. It is the unscaled sum of species gains between time 1 and time 2. In the BCD output table BCD.mat, column 2 contains C/den where den is the denominator of the index, i.e. (2A+B+C) for the percentage difference index and (A+B+C) for the Ruzicka index.
The original values of B and C for each site, without denominator, are also available in the output table BC.
Warning - In real ecological studies, when the TBI test is applied to data where some sites are highly impoverished due to pollution or other extreme environmental situations, this situation may produce sites with very few species (i.e. very low richness) and no species in common for the T1-T2 comparisons due to sampling variation at these impoverished sites. The TBI dissimilarity will be high and the test may indicate a significant T1-T2 difference if most other sites have higher species richness. This would be a correct statistical outcome for the test. When users of the method identify sites showing significant TBI tests in data, they should check the species richness of these sites at T1 and T2. Interpretation of the test results should be done with caution when high and significant TBI indices are associated with very low richness and no species in common between T1 and T2.
Legendre, P. 2019. A temporal beta-diversity index to identify exceptional sites in space-time surveys. Ecology and Evolution (in press).
Legendre, P. & M. De Caceres. 2013. Beta diversity as the variance of community data: dissimilarity coefficients and partitioning. Ecology Letters 16: 951-963.
Legendre, P. & D. Borcard. 2018. Box-Cox-chord transformations for community composition data prior to beta diversity analysis. Ecography 41: 1820-1824.
Legendre, P. & L. Legendre. 2012. Numerical Ecology. 3rd English edition. Elsevier Science BV, Amsterdam.
van den Brink, P. J. & C. J. F. ter Braak. 1999. Principal response curves: analysis of time-dependent multivariate responses of biological community to stress. Environmental Toxicology and Chemistry 18: 138-148.
plot.TBI
if(require("vegan", quietly = TRUE)) {
## Example 1 -
## Invertebrate communities subjected to insecticide treatment.
## As an example in their paper on Principal Response Curves (PRC method), van den
## Brink & ter Braak (1999) used observations on the abundances of 178 invertebrate
## species (macroinvertebrates and zooplankton) subjected to treatments in 12 mesocosms by
## the insecticide chlorpyrifos. The mesocosms were sampled at 11 occasions. The data,
## available in the {vegan} package, are log-transformed species abundances, ytranformed =
## log(10*y+1).
## The data of survey #4 will be compared to those of survey #11 in this example.
## Survey #4 was carried out one week after the insecticide treatment, whereas the fauna
## of the mesocosms was considered by the authors to have fully recovered from the
## insecticide treatment at survey #11.
data(pyrifos)
## The mesocosms had originally been attributed at random to the treatments. However,
## to facilitate presentation of the results, they will be listed here in order of
## increased insecticide doses: {0, 0, 0, 0, 0.1, 0.1, 0.9, 0.9, 6, 6, 44, 44} micro g/L.
## Select the 12 data rows of surveys 4 and 11 from the data file and reorder them
ord4 = c(38,39,41,47,37,44,40,46,43,48,42,45)
ord11 = c(122,123,125,131,121,128,124,130,127,132,126,129)
## Run the TBI function
res1 <- TBI(pyrifos[ord4,], pyrifos[ord11,], method = "%diff", nperm = 0, test.t.perm = FALSE)
res1$BCD.mat
## Example 2 -
## This example uses the mite data available in vegan. Let us pretend that sites 1-20
## represent T1 and sites 21-40 represent T2.
data(mite)
# Run the TBI function
res2 <- TBI(mite[1:20,], mite[21:40,], method = "%diff", nperm = 0, test.t.perm = FALSE)
summary(res2)
res2$BCD.mat
}
Run the code above in your browser using DataLab