Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics

Timm W, Scherbart A, Boecker S, Kohlbacher O, Nattkemper TW (2008)
BMC Bioinformatics 9(1).

Journal Article | Published | English
Abstract
Background: Mass spectrometry is a key technique in proteomics and can be used to analyze complex samples quickly. A key problem with the mass spectrometric analysis of peptides and proteins, however, is that absolute quantification is severely hampered by the unclear relationship between the observed peak intensity and the peptide concentration in the sample. While there are numerous approaches to circumvent this problem experimentally (e.g. labeling techniques), reliable prediction of peak intensities from peptide sequences could provide a peptide-specific correction factor and would thus be a valuable tool for label-free absolute quantification.

Results: In this work we present machine learning techniques for peak intensity prediction for MALDI mass spectra. Features encoding the peptides' physico-chemical properties as well as string-based features were extracted. A feature subset was obtained from multiple forward feature selections on the extracted features. Based on these features, two advanced machine learning methods (support vector regression and local linear maps) are shown to yield good results for this problem (Pearson correlation of 0.68 in a ten-fold cross-validation).

Conclusion: The techniques presented here are a useful first step beyond the binary prediction of proteotypic peptides towards a more quantitative prediction of peak intensities. These predictions, in turn, will prove beneficial for mass spectrometry-based quantitative proteomics.
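The evaluation protocol named in the abstract (ten-fold cross-validation scored by Pearson correlation) can be illustrated with a minimal sketch. This is not the authors' pipeline. The feature `basic_fraction`, the one-feature least-squares model standing in for support vector regression and local linear maps, the pooling of held-out predictions across folds, and the synthetic peptides and intensities are all assumptions made for illustration only.

```python
import math
import random


def basic_fraction(peptide):
    """Hypothetical string-based feature: share of basic residues (K, R, H)."""
    return sum(peptide.count(a) for a in "KRH") / len(peptide)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    denom = math.sqrt(vx * vy)
    return cov / denom if denom > 0 else 0.0


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (stand-in for SVR / LLM)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def cross_validated_pearson(xs, ys, k=10, seed=0):
    """k-fold CV: predict each sample from a model that never saw it,
    then correlate the pooled held-out predictions with the targets."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a * xs[i] + b
    return pearson(preds, ys)


# Synthetic demo data (NOT from the paper): random peptides whose
# "intensity" depends linearly on the feature plus Gaussian noise.
rng = random.Random(42)
peptides = [
    "".join(rng.choice("ACDEFGHIKLMNPQRSTVWY") for _ in range(rng.randint(6, 20)))
    for _ in range(200)
]
features = [basic_fraction(p) for p in peptides]
intensities = [2.0 * f + rng.gauss(0.0, 0.1) for f in features]
print(round(cross_validated_pearson(features, intensities), 2))
```

Pooling the held-out predictions before computing one correlation is only one possible convention; averaging per-fold correlations is another, and the abstract does not say which was used.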

Cite this

Timm W, Scherbart A, Boecker S, Kohlbacher O, Nattkemper TW. Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics. 2008;9(1).
Timm, W., Scherbart, A., Boecker, S., Kohlbacher, O., & Nattkemper, T. W. (2008). Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics, 9(1).
Timm, W., Scherbart, A., Boecker, S., Kohlbacher, O., and Nattkemper, T. W. (2008). Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics 9.
Timm, W., et al., 2008. Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics, 9(1).
W. Timm, et al., “Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics”, BMC Bioinformatics, vol. 9, 2008.
Timm, W., Scherbart, A., Boecker, S., Kohlbacher, O., Nattkemper, T.W.: Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics. 9, (2008).
Timm, Wiebke, Scherbart, Alexandra, Boecker, Sebastian, Kohlbacher, Oliver, and Nattkemper, Tim Wilhelm. “Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics”. BMC Bioinformatics 9.1 (2008).
Access Level
OA Open Access


9 Citations in Europe PMC

Data provided by Europe PubMed Central.

Review of software tools for design and analysis of large scale MRM proteomic datasets.
Colangelo CM, Chung L, Bruce C, Cheung KH., Methods 61(3), 2013
PMID: 23702368
Tools for label-free peptide quantification.
Nahnsen S, Bielow C, Reinert K, Kohlbacher O., Mol. Cell Proteomics 12(3), 2013
PMID: 23250051
A systematic model of the LC-MS proteomics pipeline.
Sun Y, Braga-Neto U, Dougherty ER., BMC Genomics 13 Suppl 6, 2012
PMID: 23134670
Feature-matching pattern-based support vector machines for robust peptide mass fingerprinting.
Li Y, Hao P, Zhang S, Li Y., Mol. Cell Proteomics 10(12), 2011
PMID: 21775775
MALDI immunoscreening (MiSCREEN): a method for selection of anti-peptide monoclonal antibodies for use in immunoproteomics.
Razavi M, Pope ME, Soste MV, Eyford BA, Jackson AM, Anderson NL, Pearson TW., J. Immunol. Methods 364(1-2), 2011
PMID: 21078325
Image analysis tools and emerging algorithms for expression proteomics.
Dowsey AW, English JA, Lisacek F, Morris JS, Yang GZ, Dunn MJ., Proteomics 10(23), 2010
PMID: 21046614
Oncoproteomic profiling with antibody microarrays.
Alhamdani MS, Schroder C, Hoheisel JD., Genome Med 1(7), 2009
PMID: 19591665



Sources

PMID: 18937839
PubMed | Europe PMC
