
microcontax (version 1.2)

getDomain: Extractor functions for ConTax data

Description

Extracting taxonomic information from ConTax data sets.

Usage

getDomain(header)
getPhylum(header)
getClass(header)
getOrder(header)
getFamily(header)
getGenus(header)
getTag(header)
getTaxonomy(header)

Arguments

header

A character vector, typically the Header column of a FASTA table, where each element contains taxonomy information in the format described under Details.

Value

A vector containing the sub-text extracted from each header text. The exception is getTaxonomy, which returns a table with the full taxonomy, one row for each input header.

Details

The ConTax data sets are tables in the FASTA format (see readFasta), where the Header column contains texts according to a strict format.

The header always starts with a short text, a Tag, which is a unique identifier for every sequence. The function getTag will extract this from the header.
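As an illustration only, the Tag can be isolated with a base R regular expression. This is a minimal sketch and not the getTag implementation: the headers below are invented, and it assumes the Tag is separated from the following tokens by whitespace.

```r
# Hypothetical headers; the whitespace after the Tag is an assumption
headers <- c(
  "TAG_1 k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;",
  "TAG_2 k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;"
)

# Drop everything from the first whitespace character onwards
tags <- sub("[[:space:]].*$", "", headers)
print(tags)

# A Tag is meant to be unique for every sequence
anyDuplicated(tags) == 0
```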

After the Tag follows one or more tokens. One of these tokens must be a string with the following format:

"k__<...>;p__<...>;c__<...>;o__<...>;f__<...>;g__<...>;"

where <...> is some proper text. Here is an example of a proper string:

"k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;"
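To make the format concrete, the sketch below pulls a single rank out of such a string using base R only. The helper extract_rank is hypothetical and is not part of microcontax; it assumes each rank prefix occurs once in the string.

```r
tax <- "k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;"

# Hypothetical helper: return the text between "<prefix>__" and the next ";",
# or NA when the string has no such field
extract_rank <- function(x, prefix) {
  pattern <- paste0("^.*", prefix, "__([^;]*);.*$")
  hit <- grepl(pattern, x)
  out <- rep(NA_character_, length(x))
  out[hit] <- sub(pattern, "\\1", x[hit])
  out
}

extract_rank(tax, "g")   # genus field
extract_rank(tax, "p")   # phylum field
```

Returning NA for strings without the field is a design choice of this sketch; how the package functions treat malformed headers is not stated here.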

The functions getDomain, ..., getGenus extract the corresponding information from the header. getTaxonomy runs all the taxonomy extractors, combines the results in a table, and imputes missing taxa with parent taxa.
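The idea of combining the extractors and imputing from parent taxa can be sketched in base R as follows. This is not the getTaxonomy source: the headers are invented, the column names are chosen for illustration, and a taxon is assumed to be "missing" when the text after its prefix is empty.

```r
# Hypothetical headers; the second one lacks order, family and genus
headers <- c(
  "TAG_1 k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;",
  "TAG_2 k__Bacteria;p__Firmicutes;c__Bacilli;o__;f__;g__;"
)

# Rank prefixes, ordered from parent to child (column names are an assumption)
ranks <- c(Domain = "k", Phylum = "p", Class = "c",
           Order = "o", Family = "f", Genus = "g")

# One column per rank: the text between "<prefix>__" and the next ";"
tax <- do.call(cbind, lapply(ranks, function(p) {
  sub(paste0("^.*", p, "__([^;]*);.*$"), "\\1", headers)
}))

# Treat empty fields as missing, then fill each one from its parent rank,
# moving left to right so that filled values propagate downwards
tax[tax == ""] <- NA
for (j in seq_len(ncol(tax))[-1]) {
  missing <- is.na(tax[, j])
  tax[missing, j] <- tax[missing, j - 1]
}

tab <- data.frame(Tag = sub("[[:space:]].*$", "", headers), tax,
                  stringsAsFactors = FALSE)
print(tab)
```

In this sketch the second row ends up with "Bacilli" in the Order, Family and Genus columns, because the class is its closest known parent taxon.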

See Also

contax.trim, medoids.

Examples

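# NOT RUN {
library(microcontax)             # provides contax.trim and the extractor functions
data(contax.trim)                # load the ConTax data set
getTag(contax.trim$Header)       # unique identifier of each sequence
getGenus(contax.trim$Header)     # genus of each sequence
getPhylum(contax.trim$Header)    # phylum of each sequence

# }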
