Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets

Hoffmann N, Keck M, Neuweger H, Wilhelm M, Högy P, Niehaus K, Stoye J (2012)
BMC Bioinformatics 13(1): 21.

Journal article | Published | English
 
Abstract / Note
Background Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features.
Results In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CEMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CEMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum).
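The bidirectional best hits idea named in the abstract can be illustrated with a minimal Python sketch for two peak lists. This is not the Maltcms implementation: the peak representation (a retention time plus an intensity vector), the similarity function (cosine similarity of the spectra damped by a Gaussian of the retention time difference), and the names `similarity`, `best_hits`, `bidirectional_best_hits`, and `rt_tolerance` are all assumptions made for illustration. The cluster extension step across more than two chromatograms is not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equally long intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity(p, q, rt_tolerance=5.0):
    """Hypothetical peak similarity: spectral cosine times a Gaussian
    penalty on the retention time difference (an assumption, not
    necessarily the function used in the paper)."""
    (rt_p, spec_p), (rt_q, spec_q) = p, q
    rt_weight = math.exp(-((rt_p - rt_q) ** 2) / (2.0 * rt_tolerance ** 2))
    return rt_weight * cosine(spec_p, spec_q)


def best_hits(src, dst, rt_tolerance=5.0):
    """For every peak in src, the index of its most similar peak in dst."""
    return [
        max(range(len(dst)), key=lambda j: similarity(p, dst[j], rt_tolerance))
        for p in src
    ]


def bidirectional_best_hits(a, b, rt_tolerance=5.0):
    """Keep only pairs (i, j) where b[j] is the best hit of a[i]
    and a[i] is also the best hit of b[j]."""
    if not a or not b:
        return []
    a_to_b = best_hits(a, b, rt_tolerance)
    b_to_a = best_hits(b, a, rt_tolerance)
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy data: (retention time in seconds, intensity vector over three mass bins).
chrom_a = [(100.0, [1, 0, 0]), (200.0, [0, 1, 0]), (300.0, [0, 0, 1])]
chrom_b = [(102.0, [0.9, 0.1, 0]), (203.0, [0, 1, 0.1]), (400.0, [1, 1, 1])]

print(bidirectional_best_hits(chrom_a, chrom_b))  # prints [(0, 0), (1, 1)]
```

In this toy example the third peak of each chromatogram stays unassigned, because the best hit of the third peak in `chrom_a` is a peak whose own best hit lies elsewhere. Requiring agreement in both directions is what keeps such one-sided matches out of the result.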
Conclusions We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CEMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the open-source software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source.
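How matched peaks can constrain the search space of a dynamic time warping (DTW) alignment is sketched below in Python. It is a simplified illustration under stated assumptions, not the CEMAPP-DTW code from Maltcms: the chromatograms are reduced to one-dimensional intensity traces, the local cost is the absolute intensity difference, and the function names `dtw_path`, `partitioned_dtw`, and `path_cost` are hypothetical. The anchors play the role of peak pairs that a peak matching step has already assigned; the warping path is forced through them, so DTW only has to be solved inside the rectangles between consecutive anchors.

```python
def dtw_path(x, y):
    """Classic DTW for two non-empty 1D sequences; returns (cost, path)."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(x[i] - y[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            best = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = d + best
    # Backtrack from the last cell to the first along cheapest predecessors.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path


def partitioned_dtw(x, y, anchors):
    """Warping path forced through the given (i, j) anchor pairs.
    DTW is solved separately in each rectangle between anchors."""
    corners = [(0, 0)] + sorted(tuple(a) for a in anchors)
    corners.append((len(x) - 1, len(y) - 1))
    path = [corners[0]]
    for (i0, j0), (i1, j1) in zip(corners, corners[1:]):
        if i1 < i0 or j1 < j0:
            raise ValueError("anchors must increase in both indices")
        if (i0, j0) == (i1, j1):
            continue
        _, segment = dtw_path(x[i0:i1 + 1], y[j0:j1 + 1])
        # The first cell of each segment repeats the previous corner; skip it.
        path.extend((i0 + i, j0 + j) for i, j in segment[1:])
    return path


def path_cost(x, y, path):
    """Sum of local costs along a warping path."""
    return sum(abs(x[i] - y[j]) for i, j in path)


# Toy traces with two peaks each; anchors pair up the peak apices.
trace_a = [0, 1, 5, 1, 0, 0, 3, 0]
trace_b = [0, 0, 1, 5, 1, 0, 3, 0]
anchors = [(2, 3), (6, 6)]

constrained = partitioned_dtw(trace_a, trace_b, anchors)
print(constrained, path_cost(trace_a, trace_b, constrained))
```

The unconstrained `dtw_path` fills the full n × m matrix, while the partitioned variant only fills the sub-matrices between anchors, which is where the reduction in search space comes from. The constrained path can never be cheaper than the unconstrained optimum, so the quality of the result depends on the anchors being correct.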
Publication year
2012
Journal title
BMC Bioinformatics
Volume
13
Issue
1
Article number
21
ISSN
1471-2105
Funding information
Open access publication costs were funded by the Deutsche Forschungsgemeinschaft and Bielefeld University.
Page URI
https://pub.uni-bielefeld.de/record/2517239

Cite

Hoffmann N, Keck M, Neuweger H, et al. Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets. BMC Bioinformatics. 2012;13(1): 21.
Hoffmann, N., Keck, M., Neuweger, H., Wilhelm, M., Högy, P., Niehaus, K., & Stoye, J. (2012). Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets. BMC Bioinformatics, 13(1), 21. doi:10.1186/1471-2105-13-21
Hoffmann, Nils, Keck, Matthias, Neuweger, Heiko, Wilhelm, Mathias, Högy, Petra, Niehaus, Karsten, and Stoye, Jens. 2012. “Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets”. BMC Bioinformatics 13 (1): 21.
Hoffmann, N., Keck, M., Neuweger, H., Wilhelm, M., Högy, P., Niehaus, K., and Stoye, J. (2012). Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets. BMC Bioinformatics 13:21.
Hoffmann, N., et al., 2012. Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets. BMC Bioinformatics, 13(1): 21.
N. Hoffmann, et al., “Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets”, BMC Bioinformatics, vol. 13, 2012, 21.
Hoffmann, N., Keck, M., Neuweger, H., Wilhelm, M., Högy, P., Niehaus, K., Stoye, J.: Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets. BMC Bioinformatics. 13, 21 (2012).
Hoffmann, Nils, Keck, Matthias, Neuweger, Heiko, Wilhelm, Mathias, Högy, Petra, Niehaus, Karsten, and Stoye, Jens. “Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets”. BMC Bioinformatics 13.1 (2012): 21.
All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access
Last uploaded
2019-09-06T09:18:04Z
MD5 checksum
2fc4d55b4cbdd3a9e8f0a99e8a330c60


11 citations in Europe PMC

Data provided by Europe PubMed Central.

Elucidation of chromatographic peak shifts in complex samples using a chemometrical approach.
Sousa PFM, de Waard A, Åberg KM., Anal Bioanal Chem 410(21), 2018
PMID: 29947907
Supporting metabolomics with adaptable software: design architectures for the end-user.
Sarpe V, Schriemer DC., Curr Opin Biotechnol 43(), 2017
PMID: 27870998
Joint Bounding of Peaks Across Samples Improves Differential Analysis in Mass Spectrometry-Based Metabolomics.
Myint L, Kleensang A, Zhao L, Hartung T, Hansen KD., Anal Chem 89(6), 2017
PMID: 28221771
Further development of biomarkers in amyotrophic lateral sclerosis.
Blasco H, Vourc'h P, Pradat PF, Gordon PH, Andres CR, Corcia P., Expert Rev Mol Diagn 16(8), 2016
PMID: 27275785
A hybrid retention time alignment algorithm for SWATH-MS data.
Wu L, Amon S, Lam H., Proteomics 16(15-16), 2016
PMID: 27302277
Comparative analysis of targeted metabolomics: dominance-based rough set approach versus orthogonal partial least square-discriminant analysis.
Blasco H, Błaszczyński J, Billaut JC, Nadal-Desbarats L, Pradat PF, Devos D, Moreau C, Andres CR, Emond P, Corcia P, Słowiński R., J Biomed Inform 53(), 2015
PMID: 25499899
BiPACE 2D--graph-based multiple alignment for comprehensive 2D gas chromatography-mass spectrometry.
Hoffmann N, Wilhelm M, Doebbe A, Niehaus K, Stoye J., Bioinformatics 30(7), 2014
PMID: 24363380
Nonlinear alignment of chromatograms by means of moving window fast Fourier transform cross-correlation.
Li Z, Wang JJ, Huang J, Zhang ZM, Lu HM, Zheng YB, Zhan DJ, Liang YZ., J Sep Sci 36(9-10), 2013
PMID: 23436496

Sources

PMID: 22920415