mzagglom: Agglomerative partitioning of raw LC-HRMS measurements

Description

Agglomerative partitioning of LC-HRMS measurements. Preparatory step for mzclust and mzpick. Requires an MSlist initialized by readMSdata as input.

Usage

mzagglom(MSlist, dmzgap = 10, ppm = TRUE, drtgap = 500, minpeak = 4,
	maxint = 1E7, progbar = FALSE)

Arguments

MSlist

MSlist generated by readMSdata

dmzgap

m/z gap width for partitioning

ppm

dmzgap given in ppm (TRUE) or as absolute value (FALSE)?

drtgap

RT gap width for partitioning

minpeak

Minimum number of measurements in a partition

maxint

Measurements with an intensity equal to or above this value are retained even if their partition contains fewer than minpeak measurements

progbar

For debugging only; can be ignored
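
The interplay of minpeak and maxint described above can be illustrated with a minimal sketch in base R. The function keep_sketch and its inputs are hypothetical and not part of the package; the sketch also assumes that only the individual high-intensity measurement is retained, which the text above does not state explicitly.

```r
# Hypothetical helper, NOT part of enviPick: decide which measurements to keep.
# partition: integer partition label per measurement (assumed input)
# intensity: intensity per measurement (assumed input)
keep_sketch <- function(partition, intensity, minpeak = 4, maxint = 1e7) {
  # number of measurements in the partition each measurement belongs to
  size <- ave(seq_along(partition), partition, FUN = length)
  # keep if the partition is large enough OR the measurement is intense enough
  size >= minpeak | intensity >= maxint
}

partition <- c(1, 1, 1, 1, 2, 2, 3)
intensity <- c(1e3, 2e3, 1e3, 5e2, 1e3, 2e7, 3e3)
keep <- keep_sketch(partition, intensity)
```

Here the four measurements of partition 1 pass the minpeak criterion, while in partition 2 only the measurement with intensity 2e7 passes, via maxint.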

Value

Parameters: MSlist[[2]]: saves the parameter settings.
Scans: MSlist[[4]]: matrix with raw measurements and tags, re-sorted by partition.
Partition_Index: MSlist[[5]]: index assigning partitions to sections of the raw measurements in MSlist[[4]]; required for fast (random) access.

Note

Do not set minpeak larger than its counterpart in mzclust or mzpick. To adjust all function arguments consistently in one place, use enviPickwrap instead.

Warning

Despite optimized code, this function may run for an intolerably long time or run out of memory if (a) the parameters are set wrongly, (b) the .mzML/.mzXML file was not centroided or (c) the underlying data is inadequate for this peak picker. With regard to (a), do not set dmzgap or drtgap larger than the gaps actually present in the data. Instead, use plotMSlist to inspect the data contained in MSlist after upload with readMSdata.

Details

Partitioning the full set of measurements into subsets is necessary to speed up the clustering procedure of mzclust. To this end, an agglomerative partitioning approach is used: measurements that lie within both drtgap and dmzgap of each other are linked, and all measurements connected through such links, directly or via a chain of intermediate measurements, are combined into a single subset. Consequently, no two measurements from different subsets can be closer to each other than both drtgap and dmzgap at once.
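
The linking rule can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's optimized implementation: partition_sketch is a hypothetical name, the loop is quadratic in the number of measurements, and a ppm gap is assumed to mean dmzgap parts per million of the m/z value of the measurement at hand.

```r
# Hypothetical helper, NOT part of enviPick: label groups of measurements
# that are connected by links smaller than both the m/z and the RT gap.
partition_sketch <- function(mz, rt, dmzgap = 10, ppm = TRUE, drtgap = 500) {
  n <- length(mz)
  label <- seq_len(n)  # every measurement starts in its own subset
  for (i in seq_len(n)) {
    # m/z tolerance: relative to mz[i] (assumed ppm definition) or absolute
    tol <- if (ppm) mz[i] * dmzgap / 1e6 else dmzgap
    linked <- which(abs(mz - mz[i]) < tol & abs(rt - rt[i]) < drtgap)
    # merge every subset touched by a link into the one with the smallest label
    old <- unique(label[linked])
    label[label %in% old] <- min(old)
  }
  match(label, unique(label))  # relabel subsets as 1, 2, 3, ...
}

mz <- c(100.0000, 100.0008, 100.0016, 200.0000, 100.0008)
rt <- c(10, 20, 30, 20, 2000)
parts <- partition_sketch(mz, rt)
```

In this toy input, the first and third measurements differ by more than 10 ppm but end up in the same subset because both are linked to the second one. The fifth measurement matches in m/z but is separated by more than drtgap in RT, so it forms its own subset.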