metaMDS: Nonmetric Multidimensional Scaling with Stable Solution from Random Starts, Axis Scaling and Species Scores

Description

Function metaMDS performs Nonmetric Multidimensional Scaling (NMDS), and tries to find a stable solution using several random starts. In addition, it standardizes the scaling in the result, so that the configurations are easier to interpret, and adds species scores to the site ordination. The metaMDS function does not provide actual NMDS, but it calls another function for the purpose. Currently monoMDS is the default choice, and it is also possible to call the isoMDS (MASS package).

Usage

metaMDS(comm, distance = "bray", k = 2, try = 20, trymax = 20, 
    engine = c("monoMDS", "isoMDS"), autotransform =TRUE,
    noshare = (engine == "isoMDS"), wascores = TRUE, expand = TRUE, 
    trace = 1, plot = FALSE, previous.best,  ...)
# S3 method for metaMDS
plot(x, display = c("sites", "species"), choices = c(1, 2),
    type = "p", shrink = FALSE, ...)
# S3 method for metaMDS
points(x, display = c("sites", "species"),
    choices = c(1,2), shrink = FALSE, select, ...)
# S3 method for metaMDS
text(x, display = c("sites", "species"), labels, 
    choices = c(1,2), shrink = FALSE, select, ...)
# S3 method for metaMDS
scores(x, display = c("sites", "species"), shrink = FALSE, 
    choices, tidy = FALSE, ...)
metaMDSdist(comm, distance = "bray", autotransform = TRUE, 
    noshare = TRUE, trace = 1, commname, zerodist = "ignore", 
    distfun = vegdist, ...)
metaMDSiter(dist, k = 2, try = 20, trymax = 20, trace = 1, plot = FALSE, 
    previous.best, engine = "monoMDS", maxit = 200,
    parallel = getOption("mc.cores"), ...)   
initMDS(x, k=2)
postMDS(X, dist, pc=TRUE, center=TRUE, halfchange, threshold=0.8,
    nthreshold=10, plot=FALSE, ...)
metaMDSredist(object, ...)

Value

Function metaMDS returns an object of class

metaMDS. The final site ordination is stored in the item

points, and species ordination in the item species, and the stress in item stress (NB, the scaling of the stress depends on the engine: isoMDS uses percents, and monoMDS proportions in the range \(0 \ldots 1\)). The other items store the information on the steps taken and the items returned by the engine function. The object has

print, plot, points and text methods. Functions metaMDSdist and metaMDSredist return

vegdist objects. Function initMDS returns a random configuration which is intended to be used within

isoMDS only. Functions metaMDSiter and

postMDS returns the result of NMDS with updated configuration.

Arguments

comm: Community data. Alternatively, dissimilarities either as a dist structure or as a symmetric square matrix. In the latter case all other stages are skipped except random starts and centring and pc rotation of axes.
distance: Dissimilarity index used in vegdist.
k: Number of dimensions. NB., the number of points \(n\) should be \(n > 2k + 1\), and preferably higher in global non-metric MDS, and still higher in local NMDS.
try, trymax: Minimum and maximum numbers of random starts in search of stable solution. After try has been reached, the iteration will stop when similar solutions were repeated or trymax was reached.
engine: The function used for MDS. The default is to use the monoMDS function in vegan, but for backward compatibility it is also possible to use isoMDS of MASS.
autotransform: Use simple heuristics for possible data transformation of typical community data (see below). If you do not have community data, you should probably set autotransform = FALSE.
noshare: Triggering of calculation step-across or extended dissimilarities with function stepacross. The argument can be logical or a numerical value greater than zero and less than one. If TRUE, extended dissimilarities are used always when there are no shared species between some sites, if FALSE, they are never used. If noshare is a numerical value, stepacross is used when the proportion of site pairs with no shared species exceeds noshare. The number of pairs with no shared species is found with no.shared function, and noshare has no effect if input data were dissimilarities instead of community data.
wascores: Calculate species scores using function wascores.
expand: Expand weighted averages of species in wascores.
trace: Trace the function; trace = 2 or higher will be more voluminous.
plot: Graphical tracing: plot interim results. You may want to set par(ask = TRUE) with this option.
previous.best: Start searches from a previous solution.
x: metaMDS result (or a dissimilarity structure for initMDS).
choices: Axes shown.
type: Plot type: "p" for points, "t" for text, and "n" for axes only.
display: Display "sites" or "species".
shrink: Shrink back species scores if they were expanded originally.
tidy: Return scores that are compatible with ggplot2: all scores are in a single data.frame, score type is identified by factor variable code ("sites" or "species"), the names by variable label. These scores are incompatible with conventional plot functions, but they can be used in ggplot2.
labels: Optional test to be used instead of row names.
select: Items to be displayed. This can either be a logical vector which is TRUE for displayed items or a vector of indices of displayed items.
X: Configuration from multidimensional scaling.
commname: The name of comm: should not be given if the function is called directly.
zerodist: Handling of zero dissimilarities: either "fail" or "add" a small positive value, or "ignore". monoMDS accepts zero dissimilarities and the default is zerodist = "ignore", but with isoMDS you may need to set zerodist = "add".
distfun: Dissimilarity function. Any function returning a dist object and accepting argument method can be used (but some extra arguments may cause name conflicts).
maxit: Maximum number of iterations in the single NMDS run; passed to the engine function monoMDS or isoMDS.
parallel: Number of parallel processes or a predefined socket cluster. If you use pre-defined socket clusters (say, clus), you must issue clusterEvalQ(clus, library(vegan)) to make available internal vegan functions. With parallel = 1 uses ordinary, non-parallel processing. The parallel processing is done with parallel package.
dist: Dissimilarity matrix used in multidimensional scaling.
pc: Rotate to principal components.
center: Centre the configuration.
halfchange: Scale axes to half-change units. This defaults TRUE when dissimilarities are known to have a theoretical maximum value (ceiling). Function vegdist will have that information in attribute maxdist, and for other distfun this is interpreted in a simple test (that can fail), and the information may not available when input data are distances. If FALSE, the ordination dissimilarities are scaled to the same range as the input dissimilarities.
threshold: Largest dissimilarity used in half-change scaling. If dissimilarities have a known (or inferred) ceiling, threshold is relative to that ceiling (see halfchange).
nthreshold: Minimum number of points in half-change scaling.
object: A result object from metaMDS.
...: Other parameters passed to functions. Function metaMDS passes all arguments to its component functions metaMDSdist, metaMDSiter, postMDS, and to distfun and engine.

Results Could Not Be Repeated

Non-linear optimization is a hard task, and the best possible solution (“global optimum”) may not be found from a random starting configuration. Most software solve this by starting from the result of metric scaling (cmdscale). This will probably give a good result, but not necessarily the “global optimum”. Vegan does the same, but metaMDS tries to verify or improve this first solution (“try 0”) using several random starts and seeing if the result can be repeated or improved and the improved solution repeated. If this does not succeed, you get a message that the result could not be repeated. However, the result will be at least as good as the usual standard strategy of starting from metric scaling or it may be improved. You may not need to do anything after such a message, but you can be satisfied with the result. If you want to be sure that you probably have a “global optimum” you may try the following instructions.

With default engine = "monoMDS" the function will tabulate the stopping criteria used, so that you can see which criterion should be made more stringent. The criteria can be given as arguments to metaMDS and their current values are described in monoMDS. In particular, if you reach the maximum number of iterations, you should increase the value of maxit. You may ask for a larger number of random starts without losing the old ones giving the previous solution in argument previous.best.

In addition to slack convergence criteria and too low number of random starts, wrong number of dimensions (argument k) is the most common reason for not being able to repeat similar solutions. NMDS is usually run with a low number dimensions (k=2 or k=3), and for complex data increasing k by one may help. If you run NMDS with much higher number of dimensions (say, k=10 or more), you should reconsider what you are doing and drastically reduce k. For very heterogeneous data sets with partial disjunctions, it may help to set stepacross, but for most data sets the default weakties = TRUE is sufficient.

Please note that you can give all arguments of other metaMDS* functions and NMDS engine (default monoMDS) in your metaMDS command,and you should check documentation of these functions for details.

Common Wrong Claims

NMDS is often misunderstood and wrong claims of its properties are common on the Web and even in publications. It is often claimed that the NMDS configuration is non-metric which means that you cannot fit environmental variables or species onto that space. This is a false statement. In fact, the result configuration of NMDS is metric, and it can be used like any other ordination result. In NMDS the rank orders of Euclidean distances among points in ordination have a non-metric monotone relationship to any observed dissimilarities. The transfer function from observed dissimilarities to ordination distances is non-metric (Kruskal 1964a, 1964b), but the ordination result configuration is metric and observed dissimilarities can be of any kind (metric or non-metric).

The ordination configuration is usually rotated to principal components in metaMDS. The rotation is performed after finding the result, and it only changes the direction of the reference axes. The only important feature in the NMDS solution are the ordination distances, and these do not change in rotation. Similarly, the rank order of distances does not change in uniform scaling or centring of configuration of points. You can also rotate the NMDS solution to external environmental variables with MDSrotate. This rotation will also only change the orientation of axes, but will not change the configuration of points or distances between points in ordination space.

Function stressplot displays the method graphically: it plots the observed dissimilarities against distances in ordination space, and also shows the non-metric monotone regression.

Author

Jari Oksanen

Warning

metaMDS uses monoMDS as its NMDS engine from vegan version 2.0-0, when it replaced the isoMDS function. You can set argument engine to select the old engine.

Details

Non-metric Multidimensional Scaling (NMDS) is commonly regarded as the most robust unconstrained ordination method in community ecology (Minchin 1987). Function metaMDS is a wrapper function that calls several other functions to combine Minchin's (1987) recommendations into one command. The complete steps in metaMDS are:

Transformation: If the data values are larger than common abundance class scales, the function performs a Wisconsin double standardization (wisconsin). If the values look very large, the function also performs sqrt transformation. Both of these standardizations are generally found to improve the results. However, the limits are completely arbitrary (at present, data maximum 50 triggers sqrt and \(>9\) triggers wisconsin). If you want to have a full control of the analysis, you should set autotransform = FALSE and standardize and transform data independently. The autotransform is intended for community data, and for other data types, you should set autotransform = FALSE. This step is perfomed using metaMDSdist, and the step is skipped if input were dissimilarities.
Choice of dissimilarity: For a good result, you should use dissimilarity indices that have a good rank order relation to ordering sites along gradients (Faith et al. 1987). The default is Bray-Curtis dissimilarity, because it often is the test winner. However, any other dissimilarity index in vegdist can be used. Function rankindex can be used for finding the test winner for you data and gradients. The default choice may be bad if you analyse other than community data, and you should probably select an appropriate index using argument distance. This step is performed using metaMDSdist, and the step is skipped if input were dissimilarities.
Step-across dissimilarities: Ordination may be very difficult if a large proportion of sites have no shared species. In this case, the results may be improved with stepacross dissimilarities, or flexible shortest paths among all sites. The default NMDS engine is monoMDS which is able to break tied values at the maximum dissimilarity, and this often is sufficient to handle cases with no shared species, and therefore the default is not to use stepacross with monoMDS. Function isoMDS does not handle tied values adequately, and therefore the default is to use stepacross always when there are sites with no shared species with engine = "isoMDS". The stepacross is triggered by option noshare. If you do not like manipulation of original distances, you should set noshare = FALSE. This step is skipped if input data were dissimilarities instead of community data. This step is performed using metaMDSdist, and the step is skipped always when input were dissimilarities.
NMDS with random starts: NMDS easily gets trapped into local optima, and you must start NMDS several times from random starts to be confident that you have found the global solution. The strategy in metaMDS is to first run NMDS starting with the metric scaling (cmdscale which usually finds a good solution but often close to a local optimum), or use the previous.best solution if supplied, and take its solution as the standard (Run 0). Then metaMDS starts NMDS from several random starts (minimum number is given by try and maximum number by trymax). These random starts are generated by initMDS. If a solution is better (has a lower stress) than the previous standard, it is taken as the new standard. If the solution is better or close to a standard, metaMDS compares two solutions using Procrustes analysis (function procrustes with option symmetric = TRUE). If the solutions are very similar in their Procrustes rmse and the largest residual is very small, the solutions are regarded as repeated and the better one is taken as the new standard. The conditions are stringent, and you may have found good and relatively similar solutions although the function is not yet satisfied. Setting trace = TRUE will monitor the final stresses, and plot = TRUE will display Procrustes overlay plots from each comparison. This step is performed using metaMDSiter. This is the first step performed if input data (comm) were dissimilarities. Random starts can be run with parallel processing (argument parallel).
Scaling of the results: metaMDS will run postMDS for the final result. Function postMDS provides the following ways of “fixing” the indeterminacy of scaling and orientation of axes in NMDS: Centring moves the origin to the average of the axes; Principal components rotate the configuration so that the variance of points is maximized on first dimension (with function MDSrotate you can alternatively rotate the configuration so that the first axis is parallel to an environmental variable); Half-change scaling scales the configuration so that one unit means halving of community similarity from replicate similarity. Half-change scaling is based on closer dissimilarities where the relation between ordination distance and community dissimilarity is rather linear (the limit is set by argument threshold). If there are enough points below this threshold (controlled by the parameter nthreshold), dissimilarities are regressed on distances. The intercept of this regression is taken as the replicate dissimilarity, and half-change is the distance where similarity halves according to linear regression. Obviously the method is applicable only for dissimilarity indices scaled to \(0 \ldots 1\), such as Kulczynski, Bray-Curtis and Canberra indices. If half-change scaling is not used, the ordination is scaled to the same range as the original dissimilarities. Half-change scaling is skipped by default if input were dissimilarities, but can be turned on with argument halfchange = TRUE. NB., The PC rotation only changes the directions of reference axes, and it does not influence the configuration or solution in general.
Species scores: Function adds the species scores to the final solution as weighted averages using function wascores with given value of parameter expand. The expansion of weighted averages can be undone with shrink = TRUE in plot or scores functions, and the calculation of species scores can be suppressed with wascores = FALSE. This step is skipped if input were dissimilarities and community data were unavailable. However, the species scores can be added or replaced with sppscores.

References

Faith, D. P, Minchin, P. R. and Belbin, L. (1987). Compositional dissimilarity as a robust measure of ecological distance. Vegetatio 69, 57--68.

Kruskal, J.B. (1964a). Multidimensional scaling by optimizing goodness-of-fit to a nonmetric hypothesis. Psychometrika 29, 1--28.

Kruskal, J.B. (1964b). Nonmetric multidimensional scaling: a numerical method. Psychometrika 29, 115--129.

Minchin, P.R. (1987). An evaluation of relative robustness of techniques for ecological ordinations. Vegetatio 69, 89--107.

Examples

Run this code

## The recommended way of running NMDS (Minchin 1987)
##
data(dune)
## IGNORE_RDIFF_BEGIN
## Global NMDS using monoMDS
sol <- metaMDS(dune)
sol
plot(sol, type="t")
## Start from previous best solution
sol <- metaMDS(dune, previous.best = sol)
## Local NMDS and stress 2 of monoMDS
sol2 <- metaMDS(dune, model = "local", stress=2)
sol2
## Use Arrhenius exponent 'z' as a binary dissimilarity measure
sol <- metaMDS(dune, distfun = betadiver, distance = "z")
sol
## IGNORE_RDIFF_END

Run the code above in your browser using DataLab