GCalignR

GCalignR provides simple functions to align peak lists obtained from Gas Chromatography Flame Ionization Detectors (GC-FID) based on retention times, together with plots to evaluate the quality of the alignment. The package also supports any other one-dimensional chromatography technique, provided the user can create a peak list with at least one column specifying retention times.

As with other software, you need to get used to the input format, which is structured as follows:

  • Row 1: Sample names
  • Row 2: Variable names
  • Row 3-N: GC data
    • Each block belongs to one sample (for example, one block for sample A and the next for sample B)
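As a minimal sketch of the same structure in R: the main alignment function also accepts a list of data frames, one per sample, instead of a text file. The sample names, the column names `time` and `area`, and all values below are hypothetical and chosen only for illustration; the check at the end mirrors the single row of variable names shared by all samples in the text format.

```r
# Hypothetical peak lists for two samples. The column names "time" and
# "area" are illustrative choices, not names required by the package.
sample_A <- data.frame(time = c(4.12, 5.87, 9.30), area = c(120, 45, 310))
sample_B <- data.frame(time = c(4.15, 9.28), area = c(98, 275))

# One data frame per sample; the list names act as sample names
peaks <- list(A = sample_A, B = sample_B)

# All samples should share the same variable names, including the
# retention time column
same_vars <- all(vapply(peaks,
                        function(x) identical(names(x), names(peaks[[1]])),
                        logical(1)))

# Samples may differ in their number of peaks
n_peaks <- vapply(peaks, nrow, integer(1))
```

A list built this way would be passed as `data`, with `rt_col_name = "time"` naming the retention time column.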

Installing GCalignR:

The latest release v1.0.6 is on CRAN. An overview of past releases, with a brief description of the changes applied in each, is linked from the package's project page.

install.packages("GCalignR", dependencies = TRUE)

The current development version is identical to the CRAN release and can be installed from GitHub:

# Install devtools if it is missing or older than version 1.6
if (!requireNamespace("devtools", quietly = TRUE) ||
    packageVersion("devtools") < "1.6") {
    install.packages("devtools")
}
devtools::install_github("mottensmann/GCalignR", build_vignettes = TRUE)

Get started with GCalignR

To get started read the vignettes:

browseVignettes("GCalignR")

Basic usage of the main function to align peaks:

  • data: Path to a text file (see input format above), or list of data frames, each corresponding to a sample
  • rt_col_name: column name of retention time values
  • max_linear_shift: Here, no adjustment of systematic linear drift
  • max_diff_peak2mean: Here, sort all peaks strictly by retention time
  • min_diff_peak2peak: Here, try to merge peaks when rt differs by less than 0.08
library(GCalignR)
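aligned <- align_chromatograms(data = peak_data[1:4], # list of data frames, one per sample
                               rt_col_name = "time", # column holding retention times
                               max_linear_shift = 0, # no correction of systematic linear drift
                               max_diff_peak2mean = 0, # sort peaks strictly by retention time
                               min_diff_peak2peak = 0.08) # try to merge peaks closer than 0.08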
#> Run GCalignR
#> Start: 2024-07-03 14:53:38
#> 
#> Data for 4 samples loaded.
#> No reference was specified. Hence, a reference will be selected automatically ...
#>  
#> 'C2' was selected on the basis of highest average similarity to all samples (score = 0.06).
#> Start correcting linear shifts with "C2" as a reference ...
#> 
#> Start aligning peaks ...  this might take a while!
#> 
#> Merge redundant rows ...
#>  
#> Alignment completed!
#> Time: 2024-07-03 14:53:41

The parameter values above differ from the defaults shown in the paper and the package vignette. In a nutshell, we now suggest setting max_diff_peak2mean = 0 in most cases. Peaks are then first simply sorted by their retention time values, and min_diff_peak2peak alone determines which peaks are evaluated for a merge. This setting also allows a considerable speed-up of the first alignment steps (available since version 1.0.5).
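The idea behind this setting can be sketched in base R. This is an illustration only, not the package's implementation: the retention times are hypothetical, and the package applies further criteria before actually merging, which the sketch does not attempt to reproduce. It only shows how sorting by retention time and then comparing neighbours against min_diff_peak2peak yields candidates for a merge.

```r
# Hypothetical retention times (not taken from peak_data)
rt <- c(10.31, 5.02, 5.07, 7.40, 10.45)
min_diff_peak2peak <- 0.08  # same threshold as in the call above

# Step 1: sort strictly by retention time
rt_sorted <- sort(rt)

# Step 2: distance between each peak and its next neighbour
gaps <- diff(rt_sorted)

# Step 3: neighbours closer than the threshold are merge candidates
candidate <- gaps < min_diff_peak2peak
pairs <- data.frame(first  = rt_sorted[-length(rt_sorted)][candidate],
                    second = rt_sorted[-1][candidate])
```

With these values only the peaks at 5.02 and 5.07 are flagged, because every other pair of neighbours is at least 0.08 apart.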

If you encounter bugs or if you have any suggestions for improvement (for instance on how to speed up the algorithm!), just contact meinolf.ottensmann[at]web.de

I'm also happy to help if you can't get it to work. Small problems are usually easy to solve. To simplify the process, please send a short description of the problem along with the code you have been using as a script file (.R) and a minimal example input file (.txt).

Published paper

Ottensmann M, Stoffel MA, Nichols HJ, Hoffman JI (2018) GCalignR: An R package for aligning gas-chromatography data for ecological and evolutionary studies. PLoS ONE 13(6): e0198311. https://doi.org/10.1371/journal.pone.0198311

Package details

  • Version: 1.0.7
  • License: GPL (>= 2) | file LICENSE
  • Maintainer: Meinolf Ottensmann
  • Last published: July 3rd, 2024
  • Monthly downloads: 272

Functions in GCalignR (1.0.7)

  • peak_interspace: Estimate the observed space between peaks within chromatograms
  • remove_blanks: Remove peaks present in negative control samples
  • read_empower2: Import data from single EMPOWER2 HPLC files
  • peak_data: Gas-chromatography data for Antarctic Fur Seals (Arctocephalus gazella)
  • simple_chroma: Simulate simple chromatograms
  • remove_singletons: Remove singletons
  • read_peak_list: Read content of a text file and convert it to a list
  • aligned_peak_data: Aligned Gas-Chromatography data
  • as.data.frame.GCalign: Output aligned data in form of a data frame for each variable
  • check_input: Check input prior to processing in GCalignR
  • GCalignR: GCalignR: A Package to Align Gas Chromatography Peaks Based on Retention Times
  • align_chromatograms: Aligning peaks based on retention times
  • align_peaks: align peaks individually among chromatograms
  • peak_factors: Grouping factors corresponding to gas-chromatography data of Antarctic Fur Seals (Arctocephalus gazella)
  • align_peaks_fast: align peaks individually among chromatograms
  • blank_substraction: Subtraction of blank readings from sample readings
  • choose_optimal_reference: Select the optimal reference for full alignments of peak lists
  • norm_peaks: Normalisation of peak abundancies
  • plot.GCalign: Plot diagnostics for an GCalign Object
  • print.GCalign: Summarising Peak Alignments with GCalignR
  • draw_chromatogram: Visualise peak lists as a pseudo-chromatogram
  • find_peaks: Detect local maxima in time series
  • gc_heatmap: Visualises peak alignments in form of a heatmap
  • linear_transformation: Full Alignment of Peak Lists by linear retention time correction.
  • merge_redundant_rows: Merge redundant rows