PTXQC

This package allows users of MaxQuant (from .txt files) and OpenMS (from mzTab files) to generate quality control reports in Html/PDF format.

Latest changes / ChangeLog

Latest release: v1.1.1 (Mar 2024). Latest release on CRAN: same.

See NEWS file for a version history.

Platform support

  • Windows (recommended for convenience to make use of the drag'n'drop batch file provided)
  • Linux
  • MacOSX

Citation

Please cite PTXQC when using it to check data in your publications:

Proteomics Quality Control: Quality Control Software for MaxQuant Results Chris Bielow, Guido Mastrobuoni, and Stefan Kempa J. Proteome Res., 2016, 15 (3), pp 777-787. DOI: 10.1021/acs.jproteome.5b00780

Features

  • plethora of quality metrics
    • intensity distributions
    • digestion efficiency
    • contaminant visualizations
    • identification performance
    • Match-between-runs performance
  • easy usage ([Windows OS only] drag'n'drop your txt output folder onto a batch file)
  • MaxQuant locale check, i.e. dot as decimal separator (since PTXQC 1.0.10; see https://github.com/cbielow/PTXQC/pull/99 for details)
  • Html/PDF report will be generated within your MaxQuant-txt folder or next to the mzTab file
  • writes a mockup mzQC file (https://github.com/HUPO-PSI/mzQC/) for archiving or downstream processing (actual metrics still require exporting; PRs welcome)
  • optional configuration file in YAML format for generation of shorter/customized reports

Target audience

  • MaxQuant users (no knowledge of R required)
  • OpenMS users (or any other software which can write an mzTab)
  • bioinformaticians (who want to contribute or customize)

Documentation

A short overview video on PTX-QC can be found here.

We use pkgdown to create HTML documentation, which includes the vignettes, our function documentation, etc. You can create the documentation locally with pkgdown::build_site(), or visit the online version on GitHub at ./docs/index.html.

If you do not know where to start, look at the package vignettes first.

Our Vignettes give details on:

  • Full List of Quality Metrics with help text ('List of Metrics')
  • Input and Output
  • Report customization
  • (for MaxQuant/OpenMS users) Usage of Drag'n'drop
  • (for R users) Code examples in R

The 'List of Metrics' vignette contains a full description for each metric (the same as seen in the Help section of each Html report).

Within R, you can browse the vignettes using either of these commands (after the package is installed; see below):

    help(package="PTXQC")
    browseVignettes(package = 'PTXQC')

You can also read the latest vignettes online at CRAN.

Installation

You can use PTX-QC without installing it, by using our webserver: Visit ptxqc.bsc.fu-berlin.de and simply upload your data. This service is not suited for large-scale data analysis, but should be fine for the occasional analysis.

If you want to generate QC reports without actually getting involved in R:

We offer a batch-file-based drag'n'drop mechanism to trigger PTXQC on any MaxQuant output folder. At the moment this only works on Windows (not Linux or MacOS), but you probably have a Windows machine anyway to run MaxQuant. See drag'n'drop for details. It takes 10 minutes and you are done!

If you just want to use PTXQC (and maybe even modify it):

First, install pandoc (see bottom of linked page). Pandoc is required in order to locally build the package vignettes (documentation), but you can also read the vignettes online from the PTXQC GitHub page. More importantly, Pandoc enables PTXQC to write QC reports in HTML format (which come with a help text for each plot and are interactive). PDF reports only contain plots! The reports are printed as PDF by default and additionally as HTML if Pandoc is found. If you install Pandoc later while your R session is already open, you need to close and re-open R in order to make R aware of Pandoc!
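Whether the current R session can see Pandoc can be checked before creating a report. The following is a minimal sketch using only base R; it assumes Pandoc is exposed as an executable named pandoc on the PATH, which may not cover every setup (for example, a Pandoc bundled with an IDE). If the rmarkdown package is installed, rmarkdown::pandoc_available() is an alternative check.

```r
# Look for a 'pandoc' executable on the PATH of the current R session.
# Sys.which() returns an empty string if nothing is found.
pandoc_path <- unname(Sys.which("pandoc"))
has_pandoc <- nzchar(pandoc_path)

if (has_pandoc) {
  message("Pandoc found at: ", pandoc_path)
} else {
  # Without Pandoc, only the PDF report (plots only) is written.
  message("Pandoc not found on PATH; expect PDF-only reports.")
}
```

If the check fails right after installing Pandoc, restart R and run it again.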

You can grab PTXQC from either CRAN or GitHub. GitHub installation will give you the latest package; the CRAN version might be a little older, but is faster to install. Check the NEWS file for CRAN submissions and version.

For the code blocks below: Run each line separately in your R console, i.e. do not copy and paste the whole block. If an error occurs, this makes it easier to track down. See FAQ - Installation for how to resolve installation errors.

    ## CRAN
    install.packages("PTXQC")

or

    ## GitHub
    if (!require(devtools, quietly = TRUE)) install.packages("devtools")
    library("devtools")             ## this might give a warning like 'WARNING: Rtools is required ...'. Ignore it.

    ## use build_vignettes = FALSE if you did not install pandoc or if you encounter errors when building vignettes (e.g. PRIDE ftp unavailable)!
    install_github("cbielow/PTXQC", build_vignettes = TRUE, dependencies = TRUE)

To get started, see the help and/or vignettes:

    help(package="PTXQC")
    browseVignettes(package = 'PTXQC')
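For R users, creating a report after installation might look like the sketch below. It assumes that createReport() takes the MaxQuant txt folder through an argument named txt_folder; the folder path is a placeholder. Please check ?createReport and the 'Code examples in R' vignette for the exact interface of your installed version.

```r
# Placeholder path: point this at your own MaxQuant 'txt' output folder.
txt_folder <- "path/to/your/maxquant/txt"

# Only attempt the report if PTXQC is installed and the folder exists.
if (requireNamespace("PTXQC", quietly = TRUE) && dir.exists(txt_folder)) {
  # Assumption: the folder is passed via 'txt_folder'; see ?createReport.
  report <- PTXQC::createReport(txt_folder = txt_folder)
  str(report, max.level = 1)  # inspect what the call returned
} else {
  message("PTXQC not installed or folder not found: ", txt_folder)
}
```

The report files should then appear inside the txt folder, as described under Features.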

Please feel free to report bugs (see below), or issue pull requests!

Report Examples

An overview chart at the beginning of the report will give you a first impression. Detailed plots can be found in the remainder of each report.

For example input data and full reports, see the 'inst/examples' subfolder.

Bug reporting / Feature requests

If you encounter a bug, e.g. error message, wrong figures, missing axis annotation or anything which looks suspicious, please use the GitHub issue tracker and file a report.

You should include

  • the stage at which you encounter the bug, e.g. during installation, report creation, or after report creation (i.e. a bug in the report itself).
  • PDF/Html report itself (if one was generated).
  • version of PTXQC, e.g. see the report_XXX.pdf/html (where XXX will be the version) or see the DESCRIPTION file of the PTXQC package or call help(package="PTXQC") within R
  • error message (very important!). Either copy it or provide a screenshot.
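The version and session details for a bug report can be collected from within R. This is a small sketch using standard R utilities; the variable name ptxqc_version is only illustrative.

```r
# R version, platform, and attached packages
print(sessionInfo())

# Installed PTXQC version (only if the package is present)
if (requireNamespace("PTXQC", quietly = TRUE)) {
  ptxqc_version <- as.character(utils::packageVersion("PTXQC"))
  message("PTXQC version: ", ptxqc_version)
} else {
  ptxqc_version <- NA_character_
  message("PTXQC is not installed in this R library.")
}
```

Pasting this output into the issue, next to the error message, covers the version information requested above.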

Please be as precise as possible when providing the bug report - just imagine what kind of information you would like to have in order to track down the issue. In certain situations, the whole txt-folder or a single MaxQuant/mzTab file might be helpful to solve the problem.

Contributing - Get Involved!

We welcome input from our user base! PTX-QC has a very permissive BSD-3 clause License (see DESCRIPTION file), so feel free to fork, patch and contribute! There are many ways to get involved, and you do not need to be a developer!

  • suggest a new metric (and why you think it's useful) by opening a new ticket here on GitHub.
  • suggest changes to existing metrics (improvements or bugfixes), see above.
  • suggest improvements to our documentation (e.g. additional vignettes)
  • write code (in R) and submit a Pull Request (PR).

PTX-QC user forum

Come and interact with developers and users at https://discord.gg/MB6PvpctUY

Misc

Use PTXQC v0.69.3 if you want the version which was used in the paper, i.e. use install_github(..., ref="v0.69.3") when following the Installation procedure.

The input data used in the original publication is available in the 'inst/examples' subfolder.

We recommend using the most recent PTXQC for the best user experience.

Package details

  • Version: 1.1.2
  • License: BSD_3_clause + file LICENSE
  • Maintainer: Chris Bielow
  • Last published: January 9th, 2025
  • Monthly downloads: 654

Functions in PTXQC (1.1.2)

brewer.pal.Safe

Return color brew palettes, but fail hard if number of requested colors is larger than the palette is holding.
assignBlocks

Assign set numbers to a vector of values.
appendEnv

Add the value of a variable to an environment (fast append)
del0

Replace 0 with NA in a vector
correctSetSize

Re-estimate a new set size to split a number of items into equally sized sets.
delLCP

Removes the longest common prefix (LCP) from a vector of strings.
createYaml

Creates a yaml file storing the parameters that are used for creating the PTXQC report and returns these parameters as well as a list of available qc-Metrics objects.
computeMatchRTFractions

Combine several data structs into a final picture for segmentation incurred by 'Match-between-runs'.
checkEnglishLocale

When MaxQuant is run with a wrong locale (i.e. the decimal separator is not a '.', but a ','), then MaxQuant results are plainly wrong and broken. This can be detected by, e.g., checking for negative charge annotation.
createReport

Create a quality control report (in PDF format).
darken

Make a color (given as name or in RGB) darker by factor x = [0 = black, 1=unchanged]
delLCS

Removes the longest common suffix (LCS) from a vector of strings.
findAlignReference

Return list of raw file names which were reported by MaxQuant as reference point for alignment.
getMQPARValue

Retrieve a parameter value from a mqpar.xml file
getHTMLTable

Create an HTML table with an extra header row
getFileEncoding

Determine if a file is 'UTF-8' or 'UTF-8-BOM' (as of MQ2.4) or 'UTF-16BE' or 'UTF-16LE'
getAbundanceClass

Assign a relative abundance class to a set of (log10) abundance values
flattenList

Flatten lists of lists with irregular depths to just a list of items, i.e. a list of the leaves (if you consider the input as a tree).
getFragmentErrors

Extract fragment mass deviation errors from a data.frame from msms.txt
fixCalibration

Detect (and fix) MaxQuant mass recalibration columns, since they sometimes report wrong values.
getECDF

Estimate the empirical density and return it
getMaxima

Find the local maxima in a vector of numbers.
ggAxisLabels

Function to thin out the number of labels shown on an axis in ggplot2
ggText

Plot a text as graphic using ggplot2.
getMetaFilenames

Parses the given mqpar.xml file (or, if not found, tries the 'txt_folder' + '/../../' folder (i.e. where the raw data should be)) to extract the full filepaths for all Raw files
getPCA

Create a principal component analysis (PCA) plot for the first two dimensions.
getPeptideCounts

Extract the number of peptides observed per Raw file from an evidence table.
getMetricsObjects

Get all currently available metrics
lcpCount

Count the number of chars of the longest common prefix
lcsCount

Count the number of chars of the longest common suffix
grepv

Grep with values returned instead of indices.
plot_CalibratedMSErr

Plot bargraph of calibrated mass errors for each Raw file.
mosaicize

Prepare a Mosaic plot of two columns in long format.
pasten

paste with newline as separator
%+%

A string concatenation function, more readable than 'paste()'.
getQCHeatMap

Generate a Heatmap from a list of QC measurements.
inMatchWindow

For grouped peaks: separate them into in-width vs. out-width class.
getMetaData

Extract meta information (orderNr, metric name, category) from a list of Qc metric objects
getProteinCounts

Extract the number of protein groups observed per Raw file from an evidence table.
idTransferCheck

Check how close transferred IDs after alignment are to their genuine IDs within one Raw file.
plotTableRaw

Colored table plot.
modsToTable

Convert list of (mixed)modifications to a frequency table
plot_ContEVD

Plot contaminants from evidence.txt, broken down into top5-proteins.
modsToTableByRaw

Convert list of (mixed)modifications to a frequency table
plot_Charge

The plot shows the charge distribution per Raw file. The output of 'mosaicize()' can be used directly.
plot_IonInjectionTimeOverRT

Plot line graph of TopN over Retention time.
plot_RTPeakWidth

Plot RT peak width over time
plot_IDsOverRT

Plot IDs over time for each Raw file.
plot_DataOverRT

Plot some count data over time for each Raw file.
plot_IDRate

Plot percent of identified MS/MS for each Raw file.
plot_RatiosPG

Plot ratios of labeled data (e.g. SILAC) from proteinGroups.txt
plot_TopNoverRT

Plot line graph of TopN over Retention time.
plot_TopN

Plot line graph of TopN over Retention time.
plot_ContUserScore

Plot Andromeda score distribution of contaminant peptide vs. matrix peptides.
pastet

paste with tab as separator
peakSegmentation

Determine fraction of evidence which causes segmentation, i.e. sibling peaks at different RTs confirmed either by genuine or transferred MS/MS.
plot_ContUser

Plot user-defined contaminants from evidence.txt
qcMetric-class

Class which can compute plots and generate mzQC output (usually for a single metric).
printWithFooter

Augment a ggplot with footer text
plot_MBRgain

Plot MaxQuant Match-between-runs id transfer performance as a scatterplot.
plot_peptideMods

Plot peptide modification frequencies
plot_UncalibratedMSErr

A boxplot of uncalibrated mass errors for each Raw file.
getRunQualityTemplate

Get an mzQC runQuality without actual metrics, but with full metadata
getReportFilenames

Assembles a list of output file names, which will be created during reporting.
longestCommonPrefix

Get the longest common prefix from a set of strings.
plot_MBRIDtransfer

Plot MaxQuant Match-between-runs id transfer performance.
plot_MBRAlign

Plot MaxQuant Match-between-runs alignment performance.
qcMetric_MSMSScans_TopNoverRT-class

Metric for msmsscans.txt, showing TopN over RT.
plot_MS2Decal

Plot bargraph of oversampled 3D-peaks.
plot_MS2Oversampling

Plot bargraph of oversampled 3D-peaks.
peakWidthOverTime

Discretize RT peak widths by averaging values per time bin.
longestCommonSuffix

Like longestCommonPrefix(), but on the suffix.
qualMedianDist

Quality metric which measures the absolute distance from median.
qualLinThresh

Quality metric with linear response to input, reaching the maximum score at the given threshold.
plotTable

Plot a table with row names and title
theme_blank

A blank theme (similar to the deprecated theme_blank())
qualBestKS

From a list of vectors, compute all vs. all Kolmogorov-Smirnov distance statistics (D)
thinOut

Thin out a data.frame by removing rows with similar numerical values in a certain column.
plot_ContsPG

Plot contaminants from proteinGroups.txt
plot_ScanIDRate

Plot line graph of TopN over Retention time.
plot_CountData

Plot Protein groups per Raw file
pointsPutX

Distribute a set of points with fixed y-values on a stretch of the x-axis.
qualUniform

Compute deviation from uniform distribution
read.MQ

Convenience wrapper for MQDataReader when only a single MQ file should be read and file mapping need not be stored.
qualCentered

Quality metric for 'centeredness' of a distribution around zero.
plot_TIC

Plot Total Ion Count over time
qualCenteredRef

Quality metric for 'centeredness' of a distribution around zero with a user-supplied range threshold.
simplifyNames

Removes common substrings (infixes) in a set of strings.
renameFile

Given a vector of (short/long) filenames, translate to the (long/short) version
plot_MissedCleavages

Plot bargraph of missed cleavages.
qualHighest

Score an empirical density distribution of values, where the best possible distribution is right-skewed.
qualGaussDev

Compute probability of Gaussian (mu=m, sd=s) at a position 0, with reference to the max obtainable probability of that Gaussian at its center.
repEach

Repeat each element x_i in X, n_i times.
scale01linear

Scales a vector of values linearly to [0, 1]. If all input values are equal, returned values are all 0.
print.PTXQC_table

helper S3 class, enabling print(some-plot_Table-object)
scale_x_discrete_reverse

Reverse the order of items on the x-axis (for discrete scales)
shortenStrings

Shorten a string to a maximum length and indicate shortening by appending '..'
scale_y_discrete_reverse

Reverse the order of items on the y-axis (for discrete scales)
supCount

Compute shortest prefix length which makes all strings in a vector uniquely identifiable.
wait_for_writable

Check if a file is writable and block an interactive session, waiting for user input.
thinOutBatch

Apply 'thinOut' on all subsets of a data.frame, split by a batch column
CV

Coefficient of variation (CV)
LCS

Compute longest common substring of two strings.
MzTabReader-class

Class to read an mzTab file and store the tables internally.
RTalignmentTree

Return a tree plot with a possible alignment tree.
QCMetaFilenames

Define a Singleton class which holds the full raw filenames (+path) and their PSI-MS CV terms for usage in the mzQC metadata
FilenameMapper-class

Make sure to call $readMappingFile(some_file) if you want to support a user-defined file mapping. Otherwise, calls to $getShortNames() will create/augment the mapping for filenames.
PTXQC-package

PTXQC: A package for computing Quality Control (QC) metrics for Proteomics (PTX)
LCSn

Find longest common substring from 'n' strings.
MQDataReader-class

R5 Reference Class (RefClass) to read MaxQuant .txt files
RSD

Relative standard deviation (RSD)
byXflex

Same as byX, but with more flexible group size, to avoid that the last group has only a few entries (<50% of desired size).
boxplotCompare

Boxplots - one for each condition (=column) in a data frame.
alignmentCheck

Verify an alignment by checking the retention time differences of identical peptides across Raw files
byX

Calls FUN on a subset of data in blocks of size 'subset_size' of unique indices.
YAMLClass-class

Query a YAML object for a certain parameter.
assembleMZQC

Collects all 'mzQC' members from each entry in lst_qcMetrics and stores them in an overall mzQC object, which can be written to disk (see writeMZQC()) or augmented otherwise
ScoreInAlignWindow

Compute the fraction of features per Raw file which have an acceptable RT difference after alignment
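
Several of the small helpers in the list above are simple enough to illustrate in base R. The functions below are hypothetical re-implementations written only from the one-line descriptions of del0, scale01linear, CV and longestCommonPrefix; they are not the package's code, and names, argument handling and edge cases in PTXQC may differ.

```r
# Replace 0 with NA in a numeric vector (cf. 'del0').
del0_sketch <- function(x) {
  x[!is.na(x) & x == 0] <- NA
  x
}

# Scale values linearly to [0, 1]; all-equal input yields all 0
# (cf. 'scale01linear').
scale01_sketch <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(rep(0, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Coefficient of variation: standard deviation divided by mean (cf. 'CV').
cv_sketch <- function(x) sd(x) / mean(x)

# Longest common prefix of a set of strings (cf. 'longestCommonPrefix').
lcp_sketch <- function(strings) {
  if (length(strings) == 0) return("")
  n <- min(nchar(strings))
  prefix_len <- 0
  for (i in seq_len(n)) {
    # Stop at the first position where the strings disagree.
    if (length(unique(substr(strings, i, i))) > 1) break
    prefix_len <- i
  }
  substr(strings[1], 1, prefix_len)
}

del0_sketch(c(0, 1, 2, 0))
scale01_sketch(c(2, 4, 6))
cv_sketch(c(1, 2, 3))
lcp_sketch(c("file_A.raw", "file_B.raw"))
```

For real analyses, use the documented PTXQC functions, since they are the ones the reports rely on.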