phylobase (version 0.8.12)

formatData: Format data for use in phylo4d objects

Description

Associates data with tree nodes and applies consistent formatting rules.

Usage

formatData(
  phy,
  dt,
  type = c("tip", "internal", "all"),
  match.data = TRUE,
  rownamesAsLabels = FALSE,
  label.type = c("rownames", "column"),
  label.column = 1,
  missing.data = c("fail", "warn", "OK"),
  extra.data = c("warn", "OK", "fail"),
  keep.all = TRUE
)

Value

formatData returns a data frame with node numbers as row names, dimensioned to match the phylo4 object provided.

Arguments

phy

a valid phylo4 object

dt

a data frame, matrix, vector, or factor

type

type of nodes the data are attached to: tips ("tip", the default), internal nodes ("internal"), or both ("all")

match.data

(logical) should the row names of the data frame be matched against tip and internal node identifiers? See Details.

rownamesAsLabels

(logical) should the row names of the data be matched only against labels (TRUE), or should all-digit row names be matched against node numbers (FALSE, the default)?

label.type

character, either "rownames" or "column": should the labels be taken from the row names of dt or from the label.column column of dt?

label.column

if label.type=="column", column specifier (number or name) of the column containing tip labels

missing.data

action to take if data are missing, that is, if some nodes of the tree have no matching row in the data: "fail" (the default), "warn", or "OK"

extra.data

action to take if there are extra data, that is, if some labels in the data match no node of the tree: "warn" (the default), "OK", or "fail"

keep.all

(logical) should the returned data have rows for all nodes, with NA values in the internal node rows when type = "tip" and vice versa (TRUE, the default), or only the rows corresponding to the type argument (FALSE)?
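The missing.data and extra.data arguments can be illustrated with a small sketch. It assumes the ape package is installed for reading a Newick string; the three-tip tree and the trait values are made up for illustration. Because formatData is not meant to be called directly, the arguments are passed through the phylo4d constructor instead.

```r
library(ape)        # read.tree()
library(phylobase)  # phylo4d(), tdata()

# Toy three-tip tree (illustrative only).
phy <- as(read.tree(text = "((a,b),c);"), "phylo4")

# Data covering only two of the three tips.
partial <- data.frame(size = c(1.2, 2.5), row.names = c("a", "b"))

# Default missing.data = "fail": the tip without data triggers an error.
res <- tryCatch(phylo4d(phy, tip.data = partial),
                error = function(e) "error")

# missing.data = "OK": the tip without data is filled with NA instead.
ok <- phylo4d(phy, tip.data = partial, missing.data = "OK")
tdata(ok, "tip")

# Data with a row ("d") that matches no node in the tree.
surplus <- data.frame(size = c(1.2, 2.5, 3.1, 9.9),
                      row.names = c("a", "b", "c", "d"))

# extra.data = "OK": the unmatched row is dropped silently.
trimmed <- phylo4d(phy, tip.data = surplus, extra.data = "OK")
tdata(trimmed, "tip")
```

Setting either argument to "warn" should keep the same result while signalling the mismatch as a warning rather than an error.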

Author

Francois Michonneau

Details

formatData is an internal function that should not be called directly by the user. It is used to format data provided by the user before associating it with a tree, and is called internally by the phylo4d, tdata, and addData methods. However, users may pass additional arguments to these methods in order to control how the data are matched to nodes.

Rules for matching rows of data to tree nodes are determined jointly by the match.data and rownamesAsLabels arguments. If match.data is TRUE, data frame rows will be matched exclusively against tip and node labels if rownamesAsLabels is also TRUE, whereas any all-digit row names will be matched against tip and node numbers if rownamesAsLabels is FALSE (the default). If match.data is FALSE, rownamesAsLabels has no effect, and row matching is purely positional with respect to the order returned by nodeId(phy, type).
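The matching rules above can be sketched with a toy example. It assumes the ape package is available to parse a Newick string, and that ape numbers the tips 1 to 3 in the order they appear in that string; the tree and trait values are invented for illustration. The arguments are passed through phylo4d, which forwards them to formatData.

```r
library(ape)        # read.tree()
library(phylobase)  # phylo4d(), tdata()

# Toy three-tip tree; tips are expected to be numbered a = 1, b = 2, c = 3.
phy <- as(read.tree(text = "((a,b),c);"), "phylo4")

# Trait data deliberately listed out of tip order.
dat <- data.frame(size = c(3.1, 1.2, 2.5), row.names = c("c", "a", "b"))

# match.data = TRUE (default): row names are matched to tip labels.
by_label <- phylo4d(phy, tip.data = dat)

# match.data = FALSE: rows are assigned by position,
# following the order of nodeId(phy, "tip").
by_position <- phylo4d(phy, tip.data = dat, match.data = FALSE)

# rownamesAsLabels = FALSE (default): all-digit row names
# are interpreted as node numbers.
dat_num <- data.frame(size = c(3.1, 1.2, 2.5), row.names = c("3", "1", "2"))
by_number <- phylo4d(phy, tip.data = dat_num)

tdata(by_label, "tip")
tdata(by_position, "tip")
tdata(by_number, "tip")
```

Comparing the three printed data frames shows how the same values end up on different tips depending on whether matching is by label, by position, or by node number.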

formatData (1) converts labels provided in the data into node numbers, (2) makes sure that the data are appropriately matched against tip and/or internal nodes, (3) checks for differences between data and tree, and (4) creates a data frame with the correct dimensions given a tree.
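The four steps can be mimicked in base R to make the logic concrete. This is a conceptual sketch only, not the phylobase implementation: the vectors node_ids, node_labels, and tip_ids are hypothetical stand-ins for information that would normally come from a phylo4 object.

```r
# Hypothetical stand-ins for a tree with three tips and two internal nodes.
node_ids    <- 1:5
node_labels <- c("a", "b", "c", NA, NA)  # internal nodes 4 and 5 are unlabelled
tip_ids     <- 1:3

# Tip data for two of the three tips, out of order.
dat <- data.frame(size = c(3.1, 1.2), row.names = c("c", "a"))

# (1) Convert the labels in the data into node numbers.
ids_in <- match(rownames(dat), node_labels)

# (2) Make sure tip data are only matched against tips.
stopifnot(all(ids_in[!is.na(ids_in)] %in% tip_ids))

# (3) Check for differences between data and tree.
missing_nodes <- setdiff(tip_ids, ids_in)      # tips without data
extra_rows    <- rownames(dat)[is.na(ids_in)]  # rows without a node

# (4) Build a data frame with one row per node, in node order.
keep <- !is.na(ids_in)
out  <- dat[keep, , drop = FALSE][match(node_ids, ids_in[keep]), , drop = FALSE]
rownames(out) <- node_ids
out
```

Here tip 2 ends up in missing_nodes, and the result has one row per node with NA wherever no data were supplied, which corresponds to the keep.all = TRUE behaviour described under Arguments.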

See Also

The phylo4d-methods constructor and the phylo4d class. See coerce-methods for translation functions.