PTXQC

This package allows users of MaxQuant (from .txt files) and OpenMS (from mzTab files) to generate quality control reports in Html/PDF format.

Latest changes / ChangeLog

  • v1.00.04 - Mar 2020: mzTab support for iTRAQ/TMT; minor fixes
  • v1.00.03 - Mar 2020: mzTab fixes and compatibility with upcoming R 4.0.0
  • v1.00.02 - Feb 2020: minor fixes for CRAN tests and license
  • v1.00.00 - Jan 2020: support for mzTab, more metrics (UpSetR) and fixes
  • v0.92.06 - Apr 2019: Bug Fixes
  • v0.92.05 - Mar 2019: Raw name simplification fix

See NEWS file for a version history.

Platform support

  • Windows (recommended for convenience to make use of the drag'n'drop batch file provided)
  • Linux
  • MacOSX

Citation

Please cite PTXQC when using it to check data in your publications:

Chris Bielow, Guido Mastrobuoni, and Stefan Kempa. Proteomics Quality Control: Quality Control Software for MaxQuant Results. J. Proteome Res., 2016, 15 (3), pp 777-787. DOI: 10.1021/acs.jproteome.5b00780

Features

  • plethora of quality metrics
    • intensity distributions
    • digestion efficiency
    • contaminant visualizations
    • identification performance
    • Match-between-runs performance
  • easy usage ([Windows OS only] drag'n'drop your txt output folder onto a batch file)
  • Html/PDF report will be generated within your MaxQuant-txt folder or next to the mzTab file
  • optional configuration file in YAML format for generation of shorter/customized reports

Target audience

  • MaxQuant users (no knowledge of R required)
  • OpenMS users (or any other software which can write an mzTab)
  • bioinformaticians (who want to contribute or customize)

Documentation

Besides this documentation on GitHub, the package vignettes of PTXQC will give you valuable information. After the package is installed (see below), you can browse the vignettes using either of these commands within R:

help(package="PTXQC")
browseVignettes(package = 'PTXQC')

If you do not want to wait until the package is installed, you can read the latest vignettes online at CRAN.

You will find documentation on

  • Full List of Quality Metrics with help text
  • Input and Output
  • Report customization
  • (for MaxQuant/OpenMS users) Usage of Drag'n'drop
  • (for R users) Code examples in R

The 'List of Metrics' vignette contains a full description for each metric (as seen in the Help section of a Html report).

Installation

If you want to generate QC reports without actually getting involved in R:

We offer a batch-file based drag'n'drop mechanism to trigger PTXQC on any MaxQuant output folder. At the moment this only works on Windows (not Linux or MacOS), but you have a Windows machine anyway to run MaxQuant, right?! See drag'n'drop for details. It takes 10 minutes and you are done!

If you just want the package to use (and maybe even modify) it:

First, install pandoc (see bottom of linked page). Pandoc is required in order to locally build the package vignettes (documentation), but you can also read the vignettes online from the PTXQC GitHub page. More importantly, Pandoc enables PTXQC to write QC reports in HTML format (which come with a help text for each plot and are interactive). PDF reports only contain plots! The reports are printed as PDF by default and additionally as HTML if Pandoc is found. If you install Pandoc later while your R session is already open, you need to close and re-open R in order to make R aware of Pandoc!
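Whether the current R session can see Pandoc can be checked before generating a report. The following is a minimal sketch and not necessarily the check PTXQC performs internally: it uses base R's `Sys.which()` and, if the rmarkdown package happens to be installed, `rmarkdown::pandoc_available()`, which also finds a Pandoc bundled with RStudio.

```r
# Base R: is a 'pandoc' executable on the PATH of this R session?
pandoc_path <- unname(Sys.which("pandoc"))
pandoc_on_path <- nzchar(pandoc_path)

# rmarkdown (only if installed) can also locate a Pandoc bundled with RStudio.
pandoc_for_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (pandoc_on_path || pandoc_for_rmarkdown) {
  message("Pandoc found; HTML reports should be possible.")
} else {
  message("Pandoc not found; expect PDF-only reports. ",
          "Install Pandoc, then restart R.")
}
```

If the check fails right after installing Pandoc, restart R and run it again, since a running session does not pick up the new installation.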

You can grab PTXQC from either CRAN or GitHub. GitHub installation will give you the latest package; the CRAN version might be a little older, but is faster to install. Check the NEWS file for CRAN submissions and version.

For the code blocks below: Run each line separately in your R console, i.e. do not copy and paste the whole block. If an error occurs, this makes it easier to track down. See FAQ - Installation for how to resolve installation errors.

## CRAN
install.packages("PTXQC")

or

## GitHub
if (!require(devtools, quietly = TRUE)) install.packages("devtools")
library("devtools")             ## this might give a warning like 'WARNING: Rtools is required ...'. Ignore it.

## use build_vignettes = FALSE if you did not install pandoc or if you encounter errors when building vignettes (e.g. PRIDE ftp unavailable)!
install_github("cbielow/PTXQC", build_vignettes = TRUE, dependencies = TRUE)

To get started, see the help and/or vignettes:

help(package="PTXQC")
browseVignettes(package = 'PTXQC')
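For R users, a first report can be triggered with `createReport()`, which is listed among the package functions. The sketch below assumes that the MaxQuant txt folder is passed as the first argument; the folder path is a hypothetical placeholder, and the vignettes remain the authoritative reference for the available arguments and the returned value.

```r
# Hypothetical path; point this at your own MaxQuant txt output folder.
txt_folder <- "path/to/combined/txt"

# Only attempt a report if PTXQC is installed and the folder exists.
can_run <- requireNamespace("PTXQC", quietly = TRUE) && dir.exists(txt_folder)

if (can_run) {
  # Assumption: the txt folder is the first argument of createReport().
  report <- PTXQC::createReport(txt_folder)
  str(report)  # inspect what the call returns
} else {
  message("PTXQC is not installed or the folder was not found: ", txt_folder)
}
```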

Please feel free to report bugs (see below), or issue pull requests!

Report Examples

An overview chart at the beginning of the report will give you a first impression. Detailed plots can be found in the remainder of each report.

For example input data and full reports, see the 'inst/examples' subfolder.

Bug reporting / Feature requests

If you encounter a bug, e.g. error message, wrong figures, missing axis annotation or anything which looks suspicious, please use the GitHub issue tracker and file a report.

You should include

  • the stage at which you encounter the bug, e.g. during installation, report creation, or after report creation (i.e. a bug in the report itself).
  • the PDF/Html report itself (if one was generated).
  • the version of PTXQC, e.g. see the report_XXX.pdf/html (where XXX will be the version) or see the DESCRIPTION file of the PTXQC package or call help(package="PTXQC") within R
  • the error message (very important!). Either copy it or provide a screenshot.
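The version information for a bug report can also be collected directly in R. This is a small sketch using `utils::packageVersion()` from base R; it returns `NA` if PTXQC is not installed, which is itself useful information when the bug occurs during installation.

```r
# Installed PTXQC version as a string, or NA if the package is missing.
ptxqc_version <- if (requireNamespace("PTXQC", quietly = TRUE)) {
  as.character(utils::packageVersion("PTXQC"))
} else {
  NA_character_
}

cat("PTXQC version:", ptxqc_version, "\n")
cat("R version:    ", R.version.string, "\n")
```

Pasting the output of `sessionInfo()` into the report additionally documents the operating system and the loaded packages.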

Please be as precise as possible when providing the bug report - just imagine what kind of information you would like to have in order to track down the issue. In certain situations, the whole txt-folder or a single MaxQuant/mzTab file might be helpful to solve the problem.

Contributing - Get Involved!

We welcome input from our user base! There are many ways to get involved, and you do not need to be a developer!

  • suggest a new metric (and why you think it's useful) by opening a new ticket here on GitHub.
  • suggest changes to existing metrics (improvements or bugfixes), see above.
  • suggest improvements to our documentation (e.g. additional vignettes)
  • write code (in R) and submit a Pull Request (PR).

Misc

Use PTXQC v0.69.3 if you want the version which was used in the paper, i.e. use install_github(..., ref="v0.69.3") when following the Installation procedure.
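Spelled out, the call combines the GitHub installation command from the Installation section with the `ref` argument. The wrapper function name below is hypothetical, and `build_vignettes = FALSE` is chosen here only to avoid the Pandoc dependency; wrapping the call in a function keeps it from running until you invoke it.

```r
# Hypothetical helper: installs the PTXQC version used in the paper.
install_paper_version <- function() {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  devtools::install_github("cbielow/PTXQC", ref = "v0.69.3",
                           build_vignettes = FALSE, dependencies = TRUE)
}

# install_paper_version()  # uncomment to run; requires network access
```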

The input data used in the original publication is available in the 'inst/examples' subfolder.

We recommend using the most recent PTXQC for the best user experience.

Install

install.packages('PTXQC')

Monthly Downloads

654

Version

1.0.4

License

BSD_3_clause + file LICENSE

Maintainer

Chris Bielow

Last Published

March 28th, 2020

Functions in PTXQC (1.0.4)

MzTabReader-class

Class to read an mzTab file and store the tables internally.
MQDataReader-class

Reference class (RefClass) to read MaxQuant .txt files
FilenameMapper-class

Make sure to call $readMappingFile(some_file) if you want to support a user-defined file mapping. Otherwise, calls to $getShortNames() will create/augment the mapping for filenames.
alignmentCheck

Verify an alignment by checking the retention time differences of identical peptides across Raw files
appendEnv

Add the value of a variable to an environment (fast append)
RSD

Relative standard deviation (RSD)
boxplotCompare

Boxplots - one for each condition (=column) in a data frame.
assignBlocks

Assign set numbers to a vector of values.
ScoreInAlignWindow

Compute the fraction of features per Raw file which have an acceptable RT difference after alignment
RTalignmentTree

Return a tree plot with a possible alignment tree.
brewer.pal.Safe

Return color brew palettes, but fail hard if number of requested colors is larger than the palette is holding.
delLCP

Removes the longest common prefix (LCP) from a vector of strings.
LCS

Compute longest common substring of two strings.
YAMLClass-class

Query a YAML object for a certain parameter.
LCSn

Find longest common substring from 'n' strings.
addGGtitle

Add title and subtitle to a ggplot
correctSetSize

Re-estimate a new set size to split a number of items into equally sized sets.
byXflex

Same as byX, but with more flexible group size, to avoid that the last group has only a few entries (<50% of desired size).
byX

Calls FUN on a subset of data in blocks of size 'subset_size' of unique indices.
computeMatchRTFractions

Combine several data structs into a final picture for segmentation incurred by 'Match-between-runs'.
findAlignReference

Return list of raw file names which were reported by MaxQuant as reference point for alignment.
fixCalibration

Detect (and fix) MaxQuant mass recalibration columns, since they sometimes report wrong values.
flattenList

Flatten lists of lists with irregular depths to just a list of items, i.e. a list of the leaves (if you consider the input as a tree).
getMetricsObjects

Get all currently available metrics
delLCS

Removes the longest common suffix (LCS) from a vector of strings.
getECDF

Estimate the empirical density and return it
getHTMLTable

Create an HTML table with an extra header row
getReportFilenames

Assembles a list of output file names, which will be created during reporting.
getQCHeatMap

Generate a Heatmap from a list of QC measurements.
getMQPARValue

Retrieve a parameter value from a mqpar.xml file
getAbundanceClass

Assign a relative abundance class to a set of (log10) abundance values
lcpCount

Count the number of chars of the longest common prefix
pastet

paste with tab as separator
getFragmentErrors

Extract fragment mass deviation errors from a data.frame from msms.txt
lcsCount

Count the number of chars of the longest common suffix
darken

Make a color (given as name or in RGB) darker by factor x = [0 = black, 1=unchanged]
longestCommonPrefix

Get the longest common prefix from a set of strings.
del0

Replace 0 with NA in a vector
plot_Charge

The plot shows the charge distribution per Raw file. The output of 'mosaicize()' can be used directly.
peakSegmentation

Determine fraction of evidence which causes segmentation, i.e. sibling peaks at different RTs confirmed either by genuine or transferred MS/MS.
plot_ContEVD

Plot contaminants from evidence.txt, broken down into top5-proteins.
idTransferCheck

Check how close transferred IDs after alignment are to their genuine IDs within one Raw file.
createReport

Create a quality control report (in PDF format).
getMetaData

Extract meta information (orderNr, metric name, category) from a list of QC metric objects
inMatchWindow

For grouped peaks: separate them into in-width vs. out-width class.
getMaxima

Find the local maxima in a vector of numbers.
longestCommonSuffix

Like longestCommonPrefix(), but on the suffix.
mosaicize

Prepare a Mosaic plot of two columns in long format.
getPCA

Create a principal component analysis (PCA) plot for the first two dimensions.
getPeptideCounts

Extract the number of peptides observed per Raw file from an evidence table.
plot_MissedCleavages

Plot bargraph of missed cleavages.
plot_RTPeakWidth

Plot RT peak width over time
grepv

Grep with values returned instead of indices.
pointsPutX

Distribute a set of points with fixed y-values on a stretch of the x-axis.
%+%

A string concatenation function, more readable than 'paste()'.
print.PTXQC_table

helper S3 class, enabling print(some-plot_Table-object)
plot_MBRAlign

Plot MaxQuant Match-between-runs alignment performance.
plot_IonInjectionTimeOverRT

Plot line graph of TopN over Retention time.
pasten

paste with newline as separator
repEach

Repeat each element x_i in X, n_i times.
renameFile

Given a vector of (short/long) filenames, translate to the (long/short) version
plot_MBRIDtransfer

Plot MaxQuant Match-between-runs id transfer performance.
plot_MBRgain

Plot MaxQuant Match-between-runs id transfer performance.
plotTableRaw

Colored table plot.
plot_IDRate

Plot percent of identified MS/MS for each Raw file.
qualGaussDev

Compute probability of Gaussian (mu=m, sd=s) at a position 0, with reference to the max obtainable probability of that Gaussian at its center.
plot_MS2Decal

Plot bargraph of oversampled 3D-peaks.
printWithFooter

Augment a ggplot with footer text
plot_MS2Oversampling

Plot bargraph of oversampled 3D-peaks.
plot_IDsOverRT

Plot IDs over time for each Raw file.
qualHighest

Score an empirical density distribution of values, where the best possible distribution is right-skewed.
qcMetric-class

Class which can compute plots (usually for a single metric).
plot_TopNoverRT

Plot line graph of TopN over Retention time.
scale_y_discrete_reverse

Reverse the order of items on the y-axis (for discrete scales)
plot_UncalibratedMSErr

A boxplot of uncalibrated mass errors for each Raw file.
shortenStrings

Shorten a string to a maximum length and indicate shortening by appending '..'
simplifyNames

Removes common substrings (infixes) in a set of strings.
supCount

Compute shortest prefix length which makes all strings in a vector uniquely identifiable.
theme_blank

A blank theme (similar to the deprecated theme_blank())
qualUniform

Compute deviation from uniform distribution
plot_ContUser

Plot user-defined contaminants from evidence.txt
plot_CalibratedMSErr

Plot bargraph of calibrated mass errors for each Raw file.
thinOut

Thin out a data.frame by removing rows with similar numerical values in a certain column.
read.MQ

Convenience wrapper for MQDataReader when only a single MQ file should be read and file mapping need not be stored.
plot_ContUserScore

Plot Andromeda score distribution of contaminant peptide vs. matrix peptides.
plot_TIC

Plot Total Ion Count over time
plot_TopN

Plot line graph of TopN over Retention time.
qcMetric_MSMSScans_TopNoverRT-class

Metric for msmsscans.txt, showing TopN over RT.
getProteinCounts

Extract the number of protein groups observed per Raw file from an evidence table.
qualBestKS

From a list of vectors, compute all vs. all Kolmogorov-Smirnov distance statistics (D)
scale_x_discrete_reverse

Reverse the order of items on the x-axis (for discrete scales)
scale01linear

Scales a vector of values linearly to [0, 1]. If all input values are equal, returned values are all 0.
peakWidthOverTime

Discretize RT peak widths by averaging values per time bin.
plot_ContsPG

Plot contaminants from proteinGroups.txt
ggAxisLabels

Function to thin out the number of labels shown on an axis in ggplot
ggText

Plot a text as graphic using ggplot2.
plotTable

Plot a table with row names and title
plot_CountData

Plot Protein groups per Raw file
plot_RatiosPG

Plot ratios of labeled data (e.g. SILAC) from proteinGroups.txt
plot_ScanIDRate

Plot line graph of TopN over Retention time.
qualCentered

Quality metric for 'centeredness' of a distribution around zero.
thinOutBatch

Apply 'thinOut' on all subsets of a data.frame, split by a batch column
qualLinThresh

Quality metric with linear response to input, reaching the maximum score at the given threshold.
qualCenteredRef

Quality metric for 'centeredness' of a distribution around zero with a user-supplied range threshold.
qualMedianDist

Quality metric which measures the absolute distance from median.
wait_for_writable

Check if a file is writable and blocks an interactive session, waiting for user input.
CV

Coefficient of variation (CV)
PTXQC

PTXQC: A package for computing Quality Control (QC) metrics for Proteomics (PTX)
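Several of the helper functions listed above have one-line descriptions that are easier to grasp with a concrete example. The following base-R sketch re-implements three of the described behaviours (longest common prefix removal, linear scaling to [0, 1], and CV/RSD) under hypothetical `*_sketch` names. These are illustrations of the descriptions, not the package's own code, and the CV/RSD formulas are the standard textbook definitions, assumed rather than taken from the package source.

```r
# Longest common prefix of a character vector (compare 'longestCommonPrefix').
lcp_sketch <- function(strings) {
  if (length(strings) == 0) return("")
  min_len <- min(nchar(strings))
  if (min_len == 0) return("")
  for (i in seq_len(min_len)) {
    # Stop at the first position where the strings disagree.
    if (length(unique(substr(strings, i, i))) > 1) {
      return(substr(strings[1], 1, i - 1))
    }
  }
  substr(strings[1], 1, min_len)
}

# Remove that prefix from every string (compare 'delLCP').
del_lcp_sketch <- function(strings) {
  substring(strings, nchar(lcp_sketch(strings)) + 1)
}

# Linear scaling to [0, 1]; all-equal input maps to 0 (compare 'scale01linear').
scale01_sketch <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(rep(0, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Standard definitions (assumed): CV = sd / mean, RSD = CV in percent.
cv_sketch  <- function(x) sd(x) / mean(x)
rsd_sketch <- function(x) 100 * cv_sketch(x)

raw_files <- c("QC_run_A.raw", "QC_run_B.raw", "QC_run_C.raw")
lcp_sketch(raw_files)        # "QC_run_"
del_lcp_sketch(raw_files)    # "A.raw" "B.raw" "C.raw"
scale01_sketch(c(2, 4, 6))   # 0.0 0.5 1.0
rsd_sketch(c(9, 10, 11))     # 10
```

Stripping shared prefixes in this way is what makes long Raw file names short enough to fit on plot axes.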