bdiv_table: Distance / dissimilarity between samples.

Description

Distance / dissimilarity between samples.

Usage

bdiv_table(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  md = ".all",
  within = NULL,
  between = NULL,
  delta = ".all",
  transform = "none",
  ties = "random",
  seed = 0,
  cpus = NULL
)
bdiv_matrix(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none",
  ties = "random",
  seed = 0,
  cpus = NULL,
  underscores = FALSE
)
bdiv_distmat(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none",
  cpus = NULL
)

Value

bdiv_matrix() - An R matrix of samples x samples.

bdiv_distmat() - A dist-class distance matrix.

bdiv_table() - A tibble data.frame with column names .sample1, .sample2, .weighted, .bdiv, .distance, and any fields requested by md. Numeric metadata fields will be returned as abs(x - y); categorical metadata fields as "x", "y", or "x vs y".
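
The three return shapes hold the same pairwise values in different layouts. The sketch below illustrates that relationship with base R only: it uses stats::dist() on a made-up count matrix rather than calling rbiom, so the toy data, the Manhattan metric, and the variable names are assumptions for illustration, not rbiom's implementation.

```r
# Toy counts: 3 taxa (rows) x 3 samples (columns). Values are made up.
counts <- matrix(
  c(10, 0, 5,    # sample S1
     4, 6, 0,    # sample S2
     0, 3, 9),   # sample S3
  nrow = 3,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2", "S3"))
)

# dist() compares rows, so transpose to put samples in rows.
dm  <- dist(t(counts), method = "manhattan")  # dist object, the shape bdiv_distmat() returns
mtx <- as.matrix(dm)                          # samples x samples, the shape bdiv_matrix() returns

# Long format: one row per unordered sample pair, similar in shape to bdiv_table().
idx  <- which(lower.tri(mtx), arr.ind = TRUE)
long <- data.frame(
  .sample1  = rownames(mtx)[idx[, "col"]],
  .sample2  = rownames(mtx)[idx[, "row"]],
  .distance = mtx[idx]
)
long
```

The long layout is convenient for plotting and statistics, while the dist object is what functions such as hclust() expect.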

Arguments

biom: An rbiom object, such as from as_rbiom(). Any value accepted by as_rbiom() can also be given here.
bdiv: Beta diversity distance algorithm(s) to use. Options are: "Bray-Curtis", "Manhattan", "Euclidean", "Jaccard", and "UniFrac". For "UniFrac", a phylogenetic tree must be present in biom or explicitly provided via tree=. Multiple/abbreviated values allowed. Default: "Bray-Curtis"
weighted: Take relative abundances into account. When weighted=FALSE, only presence/absence is considered. Multiple values allowed. Default: TRUE
normalized: Only changes the "Weighted UniFrac" calculation. Divides result by the total branch weights. Default: TRUE
tree: A phylo object representing the phylogenetic relationships of the taxa in biom. Only required when computing UniFrac distances. Default: NULL, in which case biom$tree is used
md: Dataset field(s) to include in the output data frame, or '.all' to include all metadata fields. Default: '.all'
within, between: Dataset field(s) used to limit comparisons to sample pairs within the same group (within) or between different groups (between). Alternatively, dataset field names given elsewhere can be prefixed with '==' or '!=' to assign them to within or between, respectively. Default: NULL
delta: For numeric metadata, report the absolute difference in values for the two samples, for instance 2 instead of "10 vs 12". Default: ".all"
transform: Transformation to apply. Options are: c("none", "rank", "log", "log1p", "sqrt", "percent"). "rank" is useful for correcting for non-normal distributions before applying regression statistics. Default: "none"
ties: When transform="rank", how to rank identical values. Options are: c("average", "first", "last", "random", "max", "min"). See rank() for details. Default: "random"
seed: Random seed for permutations. Must be a non-negative integer. Default: 0
cpus: The number of CPUs to use. Set to NULL to use all available, or to 1 to disable parallel processing. Default: NULL
underscores: When parsing the tree, should underscores be kept as is? By default they will be converted to spaces (unless the entire ID is quoted). Default: FALSE
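
To make the weighted and ties arguments more concrete, here is a minimal base-R sketch. It uses the textbook Bray-Curtis formula and base rank(); the helper bray_curtis() and the toy vectors are hypothetical, and rbiom's own implementation may differ in details such as how counts are scaled.

```r
# Textbook Bray-Curtis dissimilarity between two abundance vectors.
# bray_curtis() is a hypothetical helper, not an rbiom function.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

a <- c(OTU1 = 10, OTU2 = 0, OTU3 = 5)
b <- c(OTU1 = 4,  OTU2 = 6, OTU3 = 0)

bc_weighted   <- bray_curtis(a, b)          # abundances used, in the spirit of weighted = TRUE
bc_unweighted <- bray_curtis(a > 0, b > 0)  # presence/absence only, in the spirit of weighted = FALSE

# How base rank() treats tied values, which is what the ties argument selects.
d <- c(0.2, 0.5, 0.5, 0.9)
r_avg <- rank(d, ties.method = "average")   # ties share the mean rank: 1.0 2.5 2.5 4.0
r_min <- rank(d, ties.method = "min")       # ties share the lowest rank: 1 2 2 4

# With "random", tied values receive ranks 2 and 3 in random order,
# so fixing a seed makes the result reproducible.
set.seed(0)
r_rnd <- rank(d, ties.method = "random")
```

Here the two samples share only one of three taxa, so the presence/absence value (0.5) differs from the abundance-based one (0.68).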

Metadata Comparisons

Prefix metadata fields with == or != to limit comparisons to within or between groups, respectively. For example, md = '==Sex' will run calculations only for intra-group comparisons, returning "Male" and "Female", but NOT "Female vs Male". Similarly, setting md = '!=Body Site' will only show the inter-group comparisons, such as "Saliva vs Stool", "Anterior nares vs Buccal mucosa", and so on.

The same effect can be achieved by using the within and between parameters. md = '==Sex' is equivalent to md = 'Sex', within = 'Sex'.
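
The labelling and filtering described above can be sketched in base R. This is an illustration under assumptions (a made-up four-sample metadata table and hypothetical variable names), not rbiom's internal code.

```r
# Made-up metadata for four samples.
md <- data.frame(
  row.names = c("A", "B", "C", "D"),
  Sex = c("Male", "Male", "Female", "Female"),
  Age = c(10, 12, 30, 31),
  stringsAsFactors = FALSE
)

# All unordered sample pairs.
pairs <- data.frame(
  .sample1 = c("A", "A", "A", "B", "B", "C"),
  .sample2 = c("B", "C", "D", "C", "D", "D"),
  stringsAsFactors = FALSE
)

s1 <- md[pairs$.sample1, ]
s2 <- md[pairs$.sample2, ]

# Categorical field: "x" when both samples agree, otherwise "x vs y" (sorted).
same <- s1$Sex == s2$Sex
pairs$Sex <- ifelse(
  same, s1$Sex,
  ifelse(s1$Sex < s2$Sex,
         paste(s1$Sex, "vs", s2$Sex),
         paste(s2$Sex, "vs", s1$Sex))
)

# Numeric field: absolute difference, e.g. 2 instead of "10 vs 12".
pairs$Age <- abs(s1$Age - s2$Age)

within_sex  <- pairs[same, ]    # the kind of rows kept by '==Sex'
between_sex <- pairs[!same, ]   # the kind of rows kept by '!=Sex'
```

In this toy data, within_sex holds the "Male" and "Female" pairs, and between_sex holds only "Female vs Male" pairs.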

Examples

    library(rbiom)
    
    # Subset to four samples
    biom <- hmp50$clone()
    biom$counts <- biom$counts[,c("HMP18", "HMP19", "HMP20", "HMP21")]
    
    # Return in long format with metadata
    bdiv_table(biom, 'unifrac', md = ".all")
    
    # Only look at distances among the stool samples
    bdiv_table(biom, 'unifrac', md = c("==Body Site", "Sex"))
    
    # Or between males and females
    bdiv_table(biom, 'unifrac', md = c("Body Site", "!=Sex"))
    
    # All-vs-all matrix
    bdiv_matrix(biom, 'unifrac')
    
    # All-vs-all distance matrix
    dm <- bdiv_distmat(biom, 'unifrac')
    dm
    plot(hclust(dm))