chem16S (version 1.2.0)

get_metric_byrank: Chemical metrics for taxa aggregated to a given rank

Description

Calculates a single chemical metric for taxa in each sample aggregated to a specified rank.

Usage

get_metric_byrank(RDP, map, refdb = "GTDB_220", taxon_AA = NULL,
    groups = NULL, zero_AA = NULL, metric = "Zc", rank = "genus")

Value

A data frame of numeric values with row names corresponding to samples and column names corresponding to taxa.

Arguments

RDP

data frame, taxonomic abundances produced by read_RDP or ps_taxacounts

map

data frame, taxonomic mapping produced by map_taxa

refdb

character, name of reference database (GTDB_220 or RefSeq_206)

taxon_AA

data frame, amino acid compositions of taxa, used to bypass refdb specification

groups

list of indexing vectors, samples to be aggregated into groups

zero_AA

character, three-letter abbreviation(s) of amino acid(s) to assign zero counts for calculating chemical metrics

metric

character, chemical metric to calculate

rank

character, amino acid compositions of all lower-ranking taxa (to genus) are aggregated to this rank

Details

This function adds up amino acid compositions of taxa up to the specified rank and returns a data frame with samples on the rows and taxa on the columns. Because amino acid compositions for genera have been precomputed from species-level genomes in a reference database, chemical metrics for genera are constant across samples. In contrast, chemical metrics for higher-level taxa are variable because they depend on the reference genomes as well as the relative abundances of child taxa in each sample.

The value for rank should be one of rootrank, domain, phylum, class, order, family, or genus. For all ranks other than genus, the amino acid compositions of all lower-ranking taxa are weighted by taxonomic abundance and summed in order to calculate the chemical metric at the specified rank. If the rank is genus, then no aggregation is done (because it is the lowest-level rank available in the classifications), and the values of the metric for all genera in each sample are returned. If the rank is rootrank, then the results are equivalent to those for community reference proteomes (i.e., the output of get_metrics).
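
The abundance-weighted aggregation described above can be illustrated with a minimal base R sketch. It does not call chem16S and is not the package's implementation. The genus names, amino acid counts, and read counts are hypothetical toy values, only three amino acids are used, and calc_Zc is a helper written for this illustration. The carbon numbers and carbon oxidation states of alanine, glycine, and leucine follow from their chemical formulas.

```r
# Carbon number and carbon oxidation state of each amino acid
# (from the formulas C3H7NO2, C2H5NO2, and C6H13NO2)
nC <- c(Ala = 3, Gly = 2, Leu = 6)
Zc_AA <- c(Ala = 0, Gly = 1, Leu = -1)

# Hypothetical amino acid counts for two genera in the same phylum
genus_AA <- rbind(
  genusA = c(Ala = 10, Gly = 30, Leu = 10),
  genusB = c(Ala = 10, Gly = 10, Leu = 30)
)

# Hypothetical read counts: samples on rows, genera on columns
counts <- rbind(
  sample1 = c(genusA = 90, genusB = 10),
  sample2 = c(genusA = 10, genusB = 90)
)

# Illustrative helper: Zc of an amino acid composition,
# computed as a carbon-weighted mean over amino acids
calc_Zc <- function(AA) sum(AA * nC * Zc_AA) / sum(AA * nC)

# Genus level: no aggregation, so each genus has a single value
genus_Zc <- apply(genus_AA, 1, calc_Zc)

# Phylum level: weight genus compositions by abundance and sum them,
# then compute Zc from the summed composition of each sample
phylum_AA <- counts %*% genus_AA
phylum_Zc <- apply(phylum_AA, 1, calc_Zc)

print(genus_Zc)
print(phylum_Zc)
```

In this toy case the genus-level values are the same no matter which sample is considered, while the phylum-level value is lower in sample2, where the genus with the lower Zc is more abundant.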

The RDP, map, refdb, and groups arguments are the same as described in get_metrics. See calc_metrics for available metrics.

References

Dick JM, Shock E. 2013. A metastable equilibrium model for the relative abundances of microbial phyla in a hot spring. PLOS One 8: e72395. doi:10.1371/journal.pone.0072395

See Also

get_metrics

Examples

# Plot similar to Fig. 1 in Dick and Shock (2013)
# Read example dataset
RDPfile <- system.file("extdata/RDP-GTDB_220/SMS+12.tab.xz", package = "chem16S")
RDP <- read_RDP(RDPfile)
# Get mapping to reference database
map <- map_taxa(RDP)
# Calculate phylum-level Zc
phylum_Zc <- get_metric_byrank(RDP, map, rank = "phylum")
# Keep phyla present in more than two samples
n_values <- colSums(!sapply(phylum_Zc, is.na))
phylum_Zc <- phylum_Zc[n_values > 2]
# Swap first two samples to get them in the right location
# (MG-RAST accession numbers for these samples are not in spatial order)
phylum_Zc <- phylum_Zc[c(2, 1, 3, 4, 5), ]
matplot(phylum_Zc, type = "b", xlab = "Sampling site (hot -> cool)", ylab = "Zc")
title("Phylum-level Zc at Bison Pool hot spring")