TRAMP: TRFLP Analysis and Matching Program

Description

Determine if TRFLP profiles may match those in a database of knowns. The resulting object can be used to produce a presence/absence matrix of known profiles in environmental samples.

The TRAMPR package contains a vignette, which includes a worked example; type vignette("TRAMPRdemo") to view it.

Usage

TRAMP(samples, knowns, accept.error=1.5, min.comb=4, method="maximum")

Arguments

samples

A TRAMPsamples object, containing unidentified samples.

knowns

A TRAMPknowns object, containing identified TRFLP patterns.

accept.error

The largest acceptable difference (in base pairs) between any peak in the sample data and the knowns database (see Details; interpretation will depend on the value of method).

min.comb

Minimum number of enzyme/primer combinations required before presence will be tested. The default (4) should be reasonable in most cases. Setting min.comb to NA will require that all enzyme/primer combinations in the knowns database are present in the samples.

method

Method used in calculating the difference between samples and knowns; may be one of "maximum", "euclidian" or "manhattan" (or any unambiguous abbreviation).

Value

A TRAMP object, with elements:

presence

Presence/absence matrix. Rows are different samples (with rownames from labels(samples)) and columns are different knowns (with colnames from labels(knowns)). Do not access the presence/absence matrix directly, but use summary.TRAMP, which provides options for labelling knowns, grouping knowns, and excluding “ignored” matches.

error

Matrix of distances between the samples and knowns, calculated by the method selected with the method argument (see Details). Rows correspond to different samples, and columns correspond to different knowns. The matrix dimension names are set to the values sample.pk and knowns.pk for the samples and knowns, respectively.

n

A two-dimensional matrix (same dimensions as error), recording the number of enzyme/primer combinations present for each combination of samples and knowns.

diffsmatrix

Three-dimensional array of output from create.diffsmatrix.

enzyme.primer

Different enzyme/primer combinations present in the data, in the order of the third dimension of diffsmatrix (see create.diffsmatrix for details).

samples, knowns, accept.error, min.comb, method

The input data objects and arguments, unmodified.

In addition, an element presence.ign is included to allow matches to be ignored. However, this interface is experimental and its current format should not be relied on; use remove.TRAMP.match rather than interacting directly with presence.ign.

Note

Matching is based only on peak size (in base pairs), and does not consider peak heights.

Details

TRAMP attempts to determine which species in the ‘knowns’ database may be present in a collection of samples.

A sample matches a known if it has a peak that is “close enough” to every peak in the known for every enzyme/primer combination that they share. The default is to accept matches where the largest distance between a peak in the knowns database and the sample is less than accept.error base pairs (default 1.5), and where at least min.comb enzyme/primer combinations are shared between a sample and a known (default 4).
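
The acceptance rule can be illustrated in base R with made-up numbers. This is only a sketch of the rule as worded above, not the package's internal code: the matrices err and n.shared are hypothetical stand-ins for a distance matrix and a matrix of shared enzyme/primer counts.

```r
## Hypothetical distances (base pairs): 2 samples (rows) x 3 knowns (columns).
err <- matrix(c(0.4, 2.1, 1.0,
                3.0, 0.9, 1.2),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("s1", "s2"), c("k1", "k2", "k3")))

## Hypothetical counts of shared enzyme/primer combinations.
n.shared <- matrix(c(4, 4, 3,
                     4, 5, 4),
                   nrow = 2, byrow = TRUE, dimnames = dimnames(err))

accept.error <- 1.5
min.comb <- 4

## A known is scored as present only if both conditions hold.
present <- err < accept.error & n.shared >= min.comb
present
```

In this toy case, known k3 is close enough to sample s1 (distance 1.0) but is still scored as absent, because only three combinations are shared.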

The three-dimensional matrix of match errors is generated by create.diffsmatrix. In the resulting array, m[i,j,k] is the difference (in base pairs) between the ith sample and the jth known for the kth enzyme/primer combination.
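
The indexing convention can be seen on a small hand-built array. The values below are invented, and using NA to mark an enzyme/primer combination that a sample/known pair does not share is an assumption made for this sketch; see create.diffsmatrix for the actual behaviour.

```r
## Hypothetical differences (base pairs):
## 2 samples x 2 knowns x 3 enzyme/primer combinations.
## array() fills the first dimension fastest (column-major order).
m <- array(c(0.5, 1.0, 2.0, 0.2,    # combination k = 1
             1.5, NA,  0.3, 0.4,    # combination k = 2
             0.1, 0.7, NA,  0.9),   # combination k = 3
           dim = c(2, 2, 3))

## Difference for sample 2, known 1, combination 3.
m[2, 1, 3]

## Number of non-missing combinations for each sample/known pair.
n.shared <- apply(!is.na(m), c(1, 2), sum)
n.shared
```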

If $p_k$ and $q_k$ are the sizes of peaks for the $k$th enzyme/primer combination for a sample and known (respectively), then the maximum distance is defined as

$$\max_k |p_k - q_k|$$

Euclidian distance is defined as

$$\frac{1}{n}\sqrt{\sum_k (p_k - q_k)^2}$$

and Manhattan distance is defined as

$$\frac{1}{n}\sum_k |p_k - q_k|$$

where $n$ is the number of shared enzyme/primer combinations, since this may vary across sample/known combinations. For Euclidian and Manhattan distances, accept.error then becomes the mean distance, rather than the total distance.
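
As a worked illustration, the three distances can be computed by hand in base R for a single sample/known pair. The peak sizes below are invented, and the code simply transcribes the formulas above; it is not taken from the package source.

```r
## Hypothetical peak sizes (base pairs) for one sample/known pair across
## three shared enzyme/primer combinations.
p <- c(102.0, 250.5, 311.0)   # sample
q <- c(101.0, 250.0, 313.0)   # known

d <- p - q        # per-combination differences
n <- length(d)    # number of shared combinations

dist.maximum   <- max(abs(d))
dist.euclidian <- sqrt(sum(d^2)) / n
dist.manhattan <- sum(abs(d)) / n

c(maximum = dist.maximum, euclidian = dist.euclidian,
  manhattan = dist.manhattan)
```

With these numbers the maximum distance is 2, while the Euclidian and Manhattan values are both below 1.5, which shows why the interpretation of accept.error depends on method.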

Examples

# NOT RUN {
data(demo.knowns)
data(demo.samples)

res <- TRAMP(demo.samples, demo.knowns)

## The resulting object can be interrogated with methods:

## The goodness of fit of the sample with sample.pk=101 (see
## ?plot.TRAMP).
plot(res, 101)

# }
# NOT RUN {
## To see all plots (this produces many figures), one after another.
op <- par(ask=TRUE)
plot(res)
par(op)
# }
# NOT RUN {
## Produce a presence/absence matrix (see ?summary.TRAMP).
m <- summary(res)
head(m)
# }
