dispweight: Dispersion-based weighting of species counts

Description

Transform abundance data by downweighting species that are overdispersed relative to Poisson error.

Usage

dispweight(comm, groups, nsimul = 999, nullmodel = "c0_ind",
    plimit = 0.05)
gdispweight(formula, data, plimit = 0.05)
# S3 method for dispweight
summary(object, ...)

Value

Function returns transformed data with the following new attributes:

D: Dispersion statistic.
df: Degrees of freedom for each species.
p: \(p\)-value of the Dispersion statistic \(D\).
weights: weights applied to community data.
nsimul: Number of simulations used to assess the \(p\)-value, or NA when simulations were not performed.
nullmodel: The name of commsim null model, or NA when simulations were not performed.

Arguments

comm: Community data matrix.
groups: Factor describing the group structure. If missing, all sites are regarded as belonging to one group. NA values are not allowed.
nsimul: Number of simulations.
nullmodel: The nullmodel used in commsim within groups. The default follows Clarke et al. (2006).
plimit: Downweight species if their \(p\)-value is at or below this limit.
formula, data: Formula where the left-hand side is the community data frame and right-hand side gives the explanatory variables. The explanatory variables are found in the data frame given in data or in the parent frame.
object: Result object from dispweight or gdispweight.
...: Other parameters passed to functions.

Author

Eduard Szöcs eduardszoesc@gmail.com wrote the original dispweight, Jari Oksanen significantly modified the code, provided support functions and developed gdispweight.

Details

The dispersion index (\(D\)) is calculated as the ratio of variance to expected value for each species. If the species abundances follow a Poisson distribution, the expected dispersion is \(E(D) = 1\), and if \(D > 1\), the species is overdispersed. The inverse \(1/D\) can be used to downweight species abundances. Species are only downweighted when overdispersion is judged to be statistically significant (Clarke et al. 2006).
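The ratio described above can be illustrated with a minimal base R sketch. The counts below are made up for illustration, and the sketch treats all sites as one group and skips the significance test, so it is not what dispweight itself does:

```r
# Hypothetical counts of two species at six sites (illustrative only)
counts <- cbind(
  even    = c(4, 5, 6, 5, 4, 6),
  clumped = c(0, 0, 25, 0, 1, 4)
)

# Dispersion index per species: variance divided by mean
D <- apply(counts, 2, function(x) var(x) / mean(x))

# Assumption for this sketch: downweight whenever D > 1,
# without testing whether the overdispersion is significant
w <- ifelse(D > 1, 1 / D, 1)

# Multiply each species column by its weight
weighted <- sweep(counts, 2, w, "*")
print(round(D, 2))
print(w)
```

Both species have the same mean, but only the clumped one has a variance well above its mean, so only that species is downweighted.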

Function dispweight implements the original procedure of Clarke et al. (2006). Only one factor can be used to group the sites and to find the species means. The significance of overdispersion is assessed by freely distributing individuals of each species within factor levels. This is achieved with nullmodel "c0_ind", which follows Clarke et al. (2006); other nullmodels can be used, though they may not be meaningful (see commsim for alternatives). If a species is absent from a factor level, that whole level is ignored when calculating its overdispersion, so the number of degrees of freedom can vary among species. The reduced number of degrees of freedom is used as the divisor for overdispersion \(D\), so such species have higher dispersion and hence lower weights in the transformation.
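The grouped calculation can be sketched for a single species in base R. This is an illustration under stated assumptions, not the package code: the counts are made up, the free redistribution of individuals within a group is approximated with equal-probability multinomial draws (rmultinom) rather than the "c0_ind" nullmodel itself, and the p-value uses a common permutation-test convention, (number of simulated statistics at least as large as the observed one + 1) / (nsimul + 1).

```r
set.seed(1)

# Hypothetical counts of one species at eight sites in two groups;
# the species is absent from group "a"
x <- c(0, 0, 0, 0, 1, 12, 0, 7)
g <- factor(rep(c("a", "b"), each = 4))

# Group means expanded to one fitted value per site
group_mean <- tapply(x, g, mean)
fitted <- group_mean[as.character(g)]

# Pearson-type statistic; sites in all-zero groups give 0/0 and are dropped
stat <- sum((x - fitted)^2 / fitted, na.rm = TRUE)

# Degrees of freedom come only from groups where the species is present
n <- as.vector(table(g))
df <- sum((n - 1)[group_mean > 0])
D <- stat / df

# Simulated statistics: redistribute individuals freely within each group
nsimul <- 999
sim_stat <- numeric(nsimul)
for (lev in levels(g)) {
  xi <- x[g == lev]
  if (sum(xi) == 0) next   # absent from this group: contributes nothing
  sims <- rmultinom(nsimul, size = sum(xi), prob = rep(1, length(xi)))
  m <- mean(xi)            # the group total, and hence the mean, stays fixed
  sim_stat <- sim_stat + colSums((sims - m)^2) / m
}

# Assumed permutation-style p-value and the resulting weight
p <- (sum(sim_stat >= stat) + 1) / (nsimul + 1)
weight <- if (p <= 0.05) 1 / D else 1
print(c(stat = stat, df = df, D = D, p = p, weight = weight))
```

Because group "a" is ignored, the divisor is 3 rather than 6, which makes \(D\) larger and the weight smaller than it would be with the full degrees of freedom.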

Function gdispweight is a generalized parametric version of dispweight. It is based on glm with the quasipoisson error family. Any glm model can be used, including several factors or continuous covariates. Function gdispweight uses the same test statistic as dispweight (Pearson Chi-square), but it does not ignore factor levels where a species is absent, and the number of degrees of freedom is equal for all species. Therefore transformation weights can be higher than in dispweight. The gdispweight function evaluates the significance of overdispersion parametrically from the Chi-square distribution (pchisq).
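The parametric approach can be sketched for one species with base R alone. The data frame, its variable names (y, shrub, water) and the values are hypothetical, and the weight rule simply applies \(1/D\) when the test is significant, as described above; the package function may differ in detail.

```r
# Hypothetical counts of one species with one factor and one covariate
dat <- data.frame(
  y     = c(0, 3, 1, 14, 2, 0, 9, 22, 1, 5),
  shrub = factor(rep(c("none", "many"), each = 5)),
  water = c(3.5, 4.2, 3.8, 6.1, 4.0, 3.0, 5.2, 7.0, 3.3, 4.5)
)

# Quasi-Poisson GLM with a factor and a continuous covariate
fit <- glm(y ~ shrub + water, family = quasipoisson, data = dat)

# Pearson Chi-square statistic and residual degrees of freedom
pearson <- sum(residuals(fit, type = "pearson")^2)
df <- df.residual(fit)

# Dispersion and its parametric p-value from the Chi-square distribution
D <- pearson / df
p <- pchisq(pearson, df, lower.tail = FALSE)

# Assumed weight rule for this sketch
weight <- if (p <= 0.05) 1 / D else 1
print(c(D = D, df = df, p = p, weight = weight))
```

The value of D computed here is the same quantity that summary(fit) reports as the dispersion parameter of a quasipoisson model.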

Functions dispweight and gdispweight transform data, and they add information on overdispersion and weights as attributes of the result. The summary method can be used to extract and print that information.
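Besides summary, the attributes listed under Value can be read directly with attr. The following sketch assumes the vegan package is installed and uses its mite data; the object names w and nm are arbitrary.

```r
library(vegan)
data(mite, mite.env)

# Dispersion weighting with a small number of simulations
mite.dw <- dispweight(mite, mite.env$Shrub, nsimul = 99)

# One weight per species; the smallest weights mark the most downweighted species
w <- attr(mite.dw, "weights")
print(head(sort(w)))

# Name of the null model used for the simulations
nm <- attr(mite.dw, "nullmodel")
print(nm)
```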

References

Clarke, K. R., M. G. Chapman, P. J. Somerfield, and H. R. Needham. 2006. Dispersion-based weighting of species counts in assemblage analyses. Marine Ecology Progress Series, 320, 11–27.

Examples

library(vegan)
data(mite, mite.env)
## dispweight and its summary
mite.dw <- with(mite.env, dispweight(mite, Shrub, nsimul = 99))
## IGNORE_RDIFF_BEGIN
summary(mite.dw)
## IGNORE_RDIFF_END
## generalized dispersion weighting
mite.dw <- gdispweight(mite ~ Shrub + WatrCont, data = mite.env)
rda(mite.dw ~ Shrub + WatrCont, data = mite.env)
