phangorn (version 2.8.1)

phyDat: Conversion among Sequence Formats

Description

These functions transform several DNA formats into the phyDat format. allSitePattern generates an alignment of all possible site patterns.

Usage

phyDat(data, type = "DNA", levels = NULL, return.index = TRUE, ...)

as.phyDat(x, ...)

# S3 method for factor
as.phyDat(x, ...)

# S3 method for DNAbin
as.phyDat(x, ...)

# S3 method for alignment
as.phyDat(x, type = "DNA", ...)

phyDat2alignment(x)

# S3 method for MultipleAlignment
as.phyDat(x, ...)

# S3 method for phyDat
as.MultipleAlignment(x, ...)

acgt2ry(obj)

# S3 method for phyDat
as.character(x, allLevels = TRUE, ...)

# S3 method for phyDat
as.data.frame(x, ...)

# S3 method for phyDat
as.DNAbin(x, ...)

# S3 method for phyDat
as.AAbin(x, ...)

baseFreq(obj, freq = FALSE, all = FALSE, drop.unused.levels = FALSE)

# S3 method for phyDat
subset(x, subset, select, site.pattern = TRUE, ...)

# S3 method for phyDat
x[i, j, ..., drop = FALSE]

# S3 method for phyDat
unique(x, incomparables = FALSE, identical = TRUE, ...)

removeUndeterminedSites(x, ...)

allSitePattern(n, levels = c("a", "c", "g", "t"), names = NULL)

genlight2phyDat(x, ambiguity = NA)

# S3 method for phyDat
image(x, ...)

Arguments

data

An object containing sequences.

type

Type of sequences ("DNA", "AA", "CODON" or "USER").

levels

Level attributes.

return.index

If TRUE, returns an index of the site patterns.

...

further arguments passed to or from other methods.

x

An object containing sequences.

obj

an object of class phyDat.

allLevels

return original data.

freq

logical; if TRUE, frequencies (counts) are returned, otherwise proportions.

all

a logical; if all = TRUE, all counts of bases, ambiguous codes, missing data, and alignment gaps are returned as defined in the contrast.

drop.unused.levels

logical, drop unused levels

subset

a subset of taxa.

select

a subset of characters.

site.pattern

select site pattern or sites.

i, j

indices of the rows and/or columns to select or to drop. They may be numeric, logical, or character (in the same way as for standard R objects).

drop

for compatibility with the generic (unused).

incomparables

for compatibility with unique.

identical

if TRUE (default), sequences have to be identical; if FALSE, sequences are considered duplicates if the distance between them is zero (this happens frequently with ambiguous sites).

n

Number of sequences.

names

Names of sequences.

ambiguity

character used to code ambiguous states when no contrast is provided.

Value

The functions return an object of class phyDat.

Details

If type is "USER", a vector has to be given to levels. For example, c("a", "c", "g", "t", "-") would create a data object that can be used in phylogenetic analysis with gaps as a fifth state. There is a more detailed example for specifying "USER" defined data formats in the vignette "phangorn-specials".
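
A minimal sketch of the "USER" case described above, assuming phangorn is installed. The toy alignment `aln` and the object name `gaps5` are made up for illustration and are not part of the package.

```r
library(phangorn)

# Hypothetical toy alignment: 3 taxa (rows) x 5 sites (columns)
aln <- matrix(c("a", "c", "g", "t", "-",
                "a", "c", "g", "-", "-",
                "a", "t", "g", "t", "a"),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("t1", "t2", "t3"), NULL))

# Treat the gap character "-" as a fifth state rather than as missing data
gaps5 <- phyDat(aln, type = "USER", levels = c("a", "c", "g", "t", "-"))

attr(gaps5, "levels")  # the five states passed via levels
attr(gaps5, "nc")      # number of states
```

Whether gaps should count as a state depends on the analysis; with type = "DNA" the same matrix would instead treat "-" as ambiguous.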

allSitePattern returns all possible site patterns and can be useful in simulation studies. For further details see the vignette phangorn-specials.
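
As a rough illustration of the paragraph above, the following sketch builds every site pattern for three tips, assuming phangorn is installed. The variable name `pats` is illustrative only.

```r
library(phangorn)

# All site patterns for 3 tips with the default levels a, c, g, t
pats <- allSitePattern(3)

length(pats)      # number of sequences (tips)
attr(pats, "nr")  # number of site patterns, 4^3 for nucleotides
```

The number of patterns grows as 4^n for nucleotides, so this is only practical for a small number of tips.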

The generic function c can be used to combine sequences and unique to get all unique sequences or unique haplotypes.
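
A small sketch of unique on a phyDat object, assuming phangorn is installed. The toy matrix `seqs` is invented for the example and contains one exact duplicate; the example does not cover c.

```r
library(phangorn)

# Hypothetical toy data: t1 and t2 are identical, t3 differs at the last site
seqs <- matrix(c("a", "c", "g", "t",
                 "a", "c", "g", "t",
                 "a", "c", "g", "a"),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("t1", "t2", "t3"), NULL))
x <- phyDat(seqs, type = "DNA")

# Keep one representative per distinct sequence (haplotype)
haplotypes <- unique(x)

length(x)           # sequences before
length(haplotypes)  # sequences after removing the duplicate
```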

acgt2ry converts a phyDat object of nucleotides into a binary ry-coded dataset.
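
A short sketch of ry-coding (purines coded as r, pyrimidines as y) applied to the Laurasiatherian data set that ships with phangorn, assuming the package is installed. The name `ry` is illustrative.

```r
library(phangorn)
data(Laurasiatherian)

# Recode the nucleotide alignment into two states: r (a, g) and y (c, t)
ry <- acgt2ry(Laurasiatherian)

attr(ry, "levels")  # the two states of the recoded data
length(ry)          # same number of sequences as the input
```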

See Also

DNAbin, as.DNAbin, read.dna, read.aa, read.nexus.data, chapter 1 of vignette("phangorn-specials", package="phangorn"), and the example of pmlMix for the use of allSitePattern.

Examples

# NOT RUN {
library(phangorn)
data(Laurasiatherian)
class(Laurasiatherian)
Laurasiatherian
# base frequencies
baseFreq(Laurasiatherian)
baseFreq(Laurasiatherian, all=TRUE)
baseFreq(Laurasiatherian, freq=TRUE)
# subsetting phyDat objects
# the first 5 sequences
subset(Laurasiatherian, subset=1:5)
# the first 5 characters
subset(Laurasiatherian, select=1:5, site.pattern = FALSE)
# subsetting with []
Laurasiatherian[1:5, 1:20]
# short for
subset(Laurasiatherian, subset=1:5, select=1:20, site.pattern = FALSE)
# the first 5 site patterns (often more than 5 characters)
subset(Laurasiatherian, select=1:5, site.pattern = TRUE)
# transform into old ape format
LauraChar <- as.character(Laurasiatherian)
# and back
Laura <- phyDat(LauraChar)
all.equal(Laurasiatherian, Laura)
# Compute all possible site patterns
# for nucleotides there are 4^(number of tips) patterns
allSitePattern(5)

# }
