Learn R Programming

rbiom (version 2.1.2)

bdiv_table: Distance / dissimilarity between samples.

Description

Distance / dissimilarity between samples.

Usage

bdiv_table(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  md = ".all",
  within = NULL,
  between = NULL,
  delta = ".all",
  transform = "none",
  ties = "random",
  seed = 0,
  cpus = NULL
)

bdiv_matrix( biom, bdiv = "Bray-Curtis", weighted = TRUE, normalized = TRUE, tree = NULL, within = NULL, between = NULL, transform = "none", ties = "random", seed = 0, cpus = NULL, underscores = FALSE )

bdiv_distmat( biom, bdiv = "Bray-Curtis", weighted = TRUE, normalized = TRUE, tree = NULL, within = NULL, between = NULL, transform = "none", cpus = NULL )

Value

bdiv_matrix() -

An R matrix of samples x samples.

bdiv_distmat() -

A dist-class distance matrix.

bdiv_table() -

A tibble data.frame with columns names .sample1, .sample2, .weighted, .bdiv, .distance, and any fields requested by md. Numeric metadata fields will be returned as abs(x - y); categorical metadata fields as "x", "y", or "x vs y".

Arguments

biom

An rbiom object, such as from as_rbiom(). Any value accepted by as_rbiom() can also be given here.

bdiv

Beta diversity distance algorithm(s) to use. Options are: "Bray-Curtis", "Manhattan", "Euclidean", "Jaccard", and "UniFrac". For "UniFrac", a phylogenetic tree must be present in biom or explicitly provided via tree=. Multiple/abbreviated values allowed. Default: "Bray-Curtis"

weighted

Take relative abundances into account. When weighted=FALSE, only presence/absence is considered. Multiple values allowed. Default: TRUE

normalized

Only changes the "Weighted UniFrac" calculation. Divides result by the total branch weights. Default: TRUE

tree

A phylo object representing the phylogenetic relationships of the taxa in biom. Only required when computing UniFrac distances. Default: biom$tree

md

Dataset field(s) to include in the output data frame, or '.all' to include all metadata fields. Default: '.all'

within, between

Dataset field(s) for intra- or inter- sample comparisons. Alternatively, dataset field names given elsewhere can be prefixed with '==' or '!=' to assign them to within or between, respectively. Default: NULL

delta

For numeric metadata, report the absolute difference in values for the two samples, for instance 2 instead of "10 vs 12". Default: TRUE

transform

Transformation to apply. Options are: c("none", "rank", "log", "log1p", "sqrt", "percent"). "rank" is useful for correcting for non-normally distributions before applying regression statistics. Default: "none"

ties

When transform="rank", how to rank identical values. Options are: c("average", "first", "last", "random", "max", "min"). See rank() for details. Default: "random"

seed

Random seed for permutations. Must be a non-negative integer. Default: 0

cpus

The number of CPUs to use. Set to NULL to use all available, or to 1 to disable parallel processing. Default: NULL

underscores

When parsing the tree, should underscores be kept as is? By default they will be converted to spaces (unless the entire ID is quoted). Default FALSE

Metadata Comparisons

Prefix metadata fields with == or != to limit comparisons to within or between groups, respectively. For example, stat.by = '==Sex' will run calculations only for intra-group comparisons, returning "Male" and "Female", but NOT "Female vs Male". Similarly, setting stat.by = '!=Body Site' will only show the inter-group comparisons, such as "Saliva vs Stool", "Anterior nares vs Buccal mucosa", and so on.

The same effect can be achieved by using the within and between parameters. stat.by = '==Sex' is equivalent to stat.by = 'Sex', within = 'Sex'.

See Also

Other beta_diversity: bdiv_boxplot(), bdiv_clusters(), bdiv_corrplot(), bdiv_heatmap(), bdiv_ord_plot(), bdiv_ord_table(), bdiv_stats(), distmat_stats()

Examples

Run this code
    library(rbiom)
    
    # Subset to four samples
    biom <- hmp50$clone()
    biom$counts <- biom$counts[,c("HMP18", "HMP19", "HMP20", "HMP21")]
    
    # Return in long format with metadata
    bdiv_table(biom, 'unifrac', md = ".all")
    
    # Only look at distances among the stool samples
    bdiv_table(biom, 'unifrac', md = c("==Body Site", "Sex"))
    
    # Or between males and females
    bdiv_table(biom, 'unifrac', md = c("Body Site", "!=Sex"))
    
    # All-vs-all matrix
    bdiv_matrix(biom, 'unifrac')
    
    # All-vs-all distance matrix
    dm <- bdiv_distmat(biom, 'unifrac')
    dm
    plot(hclust(dm))

Run the code above in your browser using DataLab