parafac4microbiome

Overview

The parafac4microbiome package provides R users with an easy way to create Parallel Factor Analysis (PARAFAC) models for longitudinal microbiome data.

  • processDataCube() can be used to process the microbiome count data appropriately for a multi-way data array.
  • parafac() allows the user to create a Parallel Factor Analysis model of the multi-way data array.
  • assessModelQuality() helps the user select the appropriate number of components by randomly initializing many PARAFAC models and inspecting various metrics of interest.
  • assessModelStability() helps the user select the appropriate number of components by bootstrapping or jack-knifing samples and inspecting if the model outcome is similar.
  • plotPARAFACmodel() helps visually inspect the PARAFAC model.
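
To make concrete what a PARAFAC model of a multi-way data array looks like, here is a minimal base-R sketch of the trilinear structure. It does not use the package and is not its implementation; the names `modeA`, `modeB`, `modeC` and `reinflate` are hypothetical and chosen only for illustration.

```r
# Base-R sketch of the PARAFAC model structure (illustrative only).
# A three-way array X (nI x nJ x nK) is approximated by nComp components:
#   X[i, j, k] ~ sum over r of modeA[i, r] * modeB[j, r] * modeC[k, r]
set.seed(1)
nI <- 4; nJ <- 5; nK <- 3; nComp <- 2

modeA <- matrix(rnorm(nI * nComp), nI, nComp)  # hypothetical mode-1 loadings (e.g. subjects)
modeB <- matrix(rnorm(nJ * nComp), nJ, nComp)  # hypothetical mode-2 loadings (e.g. features)
modeC <- matrix(rnorm(nK * nComp), nK, nComp)  # hypothetical mode-3 loadings (e.g. time points)

# Hypothetical helper: rebuild the array from the three loading matrices.
reinflate <- function(A, B, C) {
  Xhat <- array(0, dim = c(nrow(A), nrow(B), nrow(C)))
  for (r in seq_len(ncol(A))) {
    # Triple outer product of the loading vectors of component r
    Xhat <- Xhat + outer(outer(A[, r], B[, r]), C[, r])
  }
  Xhat
}

Xhat <- reinflate(modeA, modeB, modeC)
dim(Xhat)  # one entry per (mode 1, mode 2, mode 3) combination
```

Each component contributes one loading vector per mode, which is why the model can be inspected mode by mode.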

This package also comes with three example datasets: Fujita2023, Shao2019 and vanderPloeg2024.

Documentation

A basic introduction to the package is given in vignette("PARAFAC_introduction"). The modelling of the example datasets is elaborated in their respective vignettes: vignette("Fujita2023_analysis"), vignette("Shao2019_analysis") and vignette("vanderPloeg2024_analysis").

These vignettes and all function documentation can be found here.

Installation

The parafac4microbiome package can be installed from CRAN using:

install.packages("parafac4microbiome")

Development version

You can install the development version of parafac4microbiome from GitHub with:

# install.packages("devtools")
devtools::install_github("GRvanderPloeg/parafac4microbiome")

Citation

Please use the following citation when using this package:

  • van der Ploeg, G. R., Westerhuis, J., Heintz-Buschart, A., & Smilde, A. (2024). parafac4microbiome: Exploratory analysis of longitudinal microbiome data using Parallel Factor Analysis. bioRxiv, 2024-05.

Usage

library(parafac4microbiome)
set.seed(123)

# Process the data cube
processedFujita = processDataCube(Fujita2023,
                                  sparsityThreshold=0.99,
                                  CLR=TRUE,
                                  centerMode=1,
                                  scaleMode=2)

# Make a PARAFAC model
model = parafac(processedFujita$data, nfac=3, nstart=10, output="best", verbose=FALSE)

# Sign flip components to make figure interpretable and comparable to the paper.
# This has no effect on the model or the fit.
model$Fac[[1]][,2] = -1 * model$Fac[[1]][,2] # sign flip mode 1 component 2
model$Fac[[2]][,1] = -1 * model$Fac[[2]][,1] # sign flip mode 2 component 1
model$Fac[[2]][,3] = -1 * model$Fac[[2]][,3] # sign flip mode 2 component 3
model$Fac[[3]] = -1 * model$Fac[[3]]         # sign flip all of mode 3

# Plot the PARAFAC model using some metadata
plotPARAFACmodel(model$Fac, processedFujita,
                 numComponents = 3,
                 colourCols = c("", "Genus", ""),
                 legendTitles = c("", "Genus", ""),
                 xLabels = c("Replicate", "Feature index", "Time point"),
                 legendColNums = c(0,5,0),
                 arrangeModes = c(FALSE, TRUE, FALSE),
                 continuousModes = c(FALSE,FALSE,TRUE),
                 overallTitle = "Fujita PARAFAC model")
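
The sign flips in the example above leave the fit unchanged because every component has its sign flipped in exactly two modes, so the two factors of -1 cancel in the product. The following base-R sketch checks this on random loadings with the same flip pattern. It does not call the package; `Fac` here is a stand-in list of random matrices and `reconstruct` is a hypothetical helper.

```r
# Base-R check that the sign-flip pattern used above does not change the
# reconstructed array (illustrative only; random loadings, not a fitted model).
set.seed(42)
nComp <- 3
Fac <- list(matrix(rnorm(4 * nComp), 4, nComp),  # stand-in for mode 1 loadings
            matrix(rnorm(6 * nComp), 6, nComp),  # stand-in for mode 2 loadings
            matrix(rnorm(5 * nComp), 5, nComp))  # stand-in for mode 3 loadings

# Hypothetical helper: sum of triple outer products over all components.
reconstruct <- function(Fac) {
  Xhat <- array(0, dim = sapply(Fac, nrow))
  for (r in seq_len(ncol(Fac[[1]]))) {
    Xhat <- Xhat + outer(outer(Fac[[1]][, r], Fac[[2]][, r]), Fac[[3]][, r])
  }
  Xhat
}

before <- reconstruct(Fac)

# Same flip pattern as in the usage example
flipped <- Fac
flipped[[1]][, 2] <- -1 * flipped[[1]][, 2]  # mode 1, component 2
flipped[[2]][, 1] <- -1 * flipped[[2]][, 1]  # mode 2, component 1
flipped[[2]][, 3] <- -1 * flipped[[2]][, 3]  # mode 2, component 3
flipped[[3]] <- -1 * flipped[[3]]            # all of mode 3

after <- reconstruct(flipped)
isTRUE(all.equal(before, after))  # TRUE: each component was flipped in two modes
```

Flipping a component in only one mode (or in all three) would negate that component's contribution and change the model, so flips should always be applied in pairs.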

Getting help

If you encounter an unexpected error or a clear bug, please file an issue with a minimal reproducible example here on GitHub. For questions or other types of feedback, feel free to send an email.

Version

1.1.2

License

MIT + file LICENSE

Maintainer

Geert Roelof van der Ploeg

Last Published

March 22nd, 2025

Functions in parafac4microbiome (1.1.2)

  • importPhyloseq: Import Phyloseq object for PARAFAC modelling
  • initializePARAFAC: Initialize PARAFAC algorithm input vectors
  • importTreeSummarizedExperiment: Import TreeSummarizedExperiment object for PARAFAC modelling
  • multiwayScale: Scale a multi-way array
  • parafac: Parallel Factor Analysis
  • parafac_gradient: Calculate gradient of PARAFAC model.
  • multiwayCLR: Perform a centered log-ratio transform over a multi-way array
  • plotPARAFACmodel: Plot a PARAFAC model
  • plotModelStability: Plot a summary of the loadings of many initialized parafac models.
  • plotModelMetric: Plot diagnostics of many initialized PARAFAC models.
  • plotModelTCCs: Plots Tucker Congruence Coefficients of randomly initialized models.
  • transformPARAFACloadings: Transform PARAFAC loadings to an orthonormal basis. Note: this function only works for 3-way PARAFAC models.
  • vanderPloeg2024: vanderPloeg2024 longitudinal dataset
  • reinflateTensor: Create a tensor out of a set of matrices similar to a component model.
  • processDataCube: Process a multi-way array of count data.
  • reinflateFac: Calculate Xhat from a model Fac object
  • vect_to_fac: Convert vectorized output of PARAFAC to a Fac list object with all loadings per mode.
  • %>%: Pipe operator
  • sortComponents: Sort PARAFAC components based on variance explained per component.
  • parafac_fun: PARAFAC loss function calculation
  • parafac_core_als: Internal PARAFAC alternating least-squares (ALS) core algorithm
  • assessModelQuality: Create randomly initialized models to determine the correct number of components by assessing model quality metrics.
  • assessModelStability: Bootstrapping procedure to determine PARAFAC model stability for a given number of components.
  • Shao2019: Shao2019 longitudinal microbiome data
  • calculateFMS: Calculate Factor Match Score for all initialized models.
  • Fujita2023: Fujita2023 longitudinal microbiome data
  • fac_to_vect: Vectorize Fac object
  • calcVarExpPerComponent: Calculate the variance explained of a PARAFAC model, per component
  • corcondia: Core Consistency Diagnostic (CORCONDIA) calculation
  • calculateSparsity: Calculate sparsity across the feature mode of a multi-way array.
  • calculateVarExp: Calculate the variation explained by a PARAFAC model.
  • flipLoadings: Sign flip the loadings of many randomly initialized models to make consistent overview plots.
  • importMicrobiotaProcess: Import MicrobiotaProcess object for PARAFAC modelling
  • multiwayCenter: Center a multi-way array
  • parafac4microbiome-package: parafac4microbiome: Parallel Factor Analysis Modelling of Longitudinal Microbiome Data