Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics

Timm W, Scherbart A, Boecker S, Kohlbacher O, Nattkemper TW (2008)
BMC Bioinformatics 9(1): 443.

Journal article | Published | English
Full text available for this record
Authors
Timm W; Scherbart A; Boecker S; Kohlbacher O; Nattkemper TW
Abstract / Note
Background: Mass spectrometry is a key technique in proteomics and can be used to analyze complex samples quickly. One key problem with the mass spectrometric analysis of peptides and proteins, however, is that absolute quantification is severely hampered by the unclear relationship between the observed peak intensity and the peptide concentration in the sample. While there are numerous approaches to circumvent this problem experimentally (e.g. labeling techniques), reliable prediction of the peak intensities from peptide sequences could provide a peptide-specific correction factor. It would thus be a valuable tool towards label-free absolute quantification.

Results: In this work we present machine learning techniques for peak intensity prediction for MALDI mass spectra. Features encoding the peptides' physico-chemical properties as well as string-based features were extracted. A feature subset was obtained from multiple forward feature selections on the extracted features. Based on these features, two advanced machine learning methods (support vector regression and local linear maps) are shown to yield good results for this problem (Pearson correlation of 0.68 in a ten-fold cross validation).

Conclusion: The techniques presented here are a useful first step going beyond the binary prediction of proteotypic peptides towards a more quantitative prediction of peak intensities. These predictions should in turn prove beneficial for mass spectrometry-based quantitative proteomics.
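The evaluation protocol named in the abstract (Pearson correlation in a ten-fold cross validation) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the data are synthetic, the single-feature least-squares line is only a stand-in for the support vector regression and local linear map models, and pooling the held-out predictions before computing one correlation is one plausible reading of the protocol, not necessarily the authors' procedure. The helper names `pearson`, `fit_line` and `cross_validated_pearson` are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Least-squares line y = a*x + b (stand-in model, not SVR)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b


def cross_validated_pearson(xs, ys, k=10, seed=0):
    """Predict every sample from a model trained without its fold,
    then correlate the pooled held-out predictions with the targets."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = model(xs[i])
    return pearson(preds, ys)


# Synthetic example: one hypothetical peptide feature and a noisy "intensity".
rng = random.Random(42)
feature = [rng.uniform(0, 1) for _ in range(200)]
intensity = [2.0 * x + rng.gauss(0, 0.5) for x in feature]
r = cross_validated_pearson(feature, intensity)
print(f"pooled 10-fold Pearson r = {r:.2f}")
```

Because each prediction comes from a model that never saw that sample, the resulting correlation estimates performance on unseen peptides rather than fit to the training data.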
Year of publication
2008
Journal title
BMC Bioinformatics
Volume
9
Issue
1
Page(s)
443
ISSN
PUB-ID

Cite

Timm W, Scherbart A, Boecker S, Kohlbacher O, Nattkemper TW. Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics. 2008;9(1):443.
Timm, W., Scherbart, A., Boecker, S., Kohlbacher, O., & Nattkemper, T. W. (2008). Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics, 9(1), 443. doi:10.1186/1471-2105-9-443
Timm, W., Scherbart, A., Boecker, S., Kohlbacher, O., and Nattkemper, T. W. (2008). Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics 9, 443.
Timm, W., et al., 2008. Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics, 9(1), p 443.
W. Timm, et al., “Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics”, BMC Bioinformatics, vol. 9, 2008, pp. 443.
Timm, W., Scherbart, A., Boecker, S., Kohlbacher, O., Nattkemper, T.W.: Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics. BMC Bioinformatics. 9, 443 (2008).
Timm, Wiebke, Scherbart, Alexandra, Boecker, Sebastian, Kohlbacher, Oliver, and Nattkemper, Tim Wilhelm. “Peak intensity prediction in MALDI-TOF mass spectrometry: A machine learning study to support quantitative proteomics”. BMC Bioinformatics 9.1 (2008): 443.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access

11 citations in Europe PMC

Data provided by Europe PubMed Central.

Modeling and systematic analysis of biomarker validation using selected reaction monitoring.
Atashpaz-Gargari E, Braga-Neto UM, Dougherty ER., EURASIP J Bioinform Syst Biol 2014, 2014
PMID: 28194167
Tools for label-free peptide quantification.
Nahnsen S, Bielow C, Reinert K, Kohlbacher O., Mol Cell Proteomics 12(3), 2013
PMID: 23250051
Review of software tools for design and analysis of large scale MRM proteomic datasets.
Colangelo CM, Chung L, Bruce C, Cheung KH., Methods 61(3), 2013
PMID: 23702368
A systematic model of the LC-MS proteomics pipeline.
Sun Y, Braga-Neto U, Dougherty ER., BMC Genomics 13 Suppl 6, 2012
PMID: 23134670
MALDI immunoscreening (MiSCREEN): a method for selection of anti-peptide monoclonal antibodies for use in immunoproteomics.
Razavi M, Pope ME, Soste MV, Eyford BA, Jackson AM, Anderson NL, Pearson TW., J Immunol Methods 364(1-2), 2011
PMID: 21078325
Feature-matching pattern-based support vector machines for robust peptide mass fingerprinting.
Li Y, Hao P, Zhang S, Li Y., Mol Cell Proteomics 10(12), 2011
PMID: 21775775
Image analysis tools and emerging algorithms for expression proteomics.
Dowsey AW, English JA, Lisacek F, Morris JS, Yang GZ, Dunn MJ., Proteomics 10(23), 2010
PMID: 21046614
Oncoproteomic profiling with antibody microarrays.
Alhamdani MS, Schröder C, Hoheisel JD., Genome Med 1(7), 2009
PMID: 19591665

45 References

Data provided by Europe PubMed Central.

Quantitative mass spectrometry in proteomics: a critical review.
Bantscheff M, Schirle M, Sweetman G, Rick J, Kuster B., Anal Bioanal Chem 389(4), 2007
PMID: 17668192
Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics.
Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M., Mol. Cell Proteomics 1(5), 2002
PMID: 12118079
Quantitative analysis of complex protein mixtures using isotope-coded affinity tags.
Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R., Nat. Biotechnol. 17(10), 1999
PMID: 10504701
Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents.
Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, Khainovski N, Pillai S, Dey S, Daniels S, Purkayastha S, Juhasz P, Martin S, Bartlet-Jones M, He F, Jacobson A, Pappin DJ., Mol. Cell Proteomics 3(12), 2004
PMID: 15385600
Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus.
Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C., Anal. Chem. 73(13), 2001
PMID: 11467524
Comparative LC-MS: a landscape of peaks and valleys.
America AH, Cordewener JH., Proteomics 8(4), 2008
PMID: 18297651
Absolute myoglobin quantitation in serum by combining two-dimensional liquid chromatography-electrospray ionization mass spectrometry and novel data analysis algorithms.
Mayr BM, Kohlbacher O, Reinert K, Sturm M, Gropl C, Lange E, Klein C, Huber CG., J. Proteome Res. 5(2), 2006
PMID: 16457608
Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS.
Gerber SA, Rush J, Stemman O, Kirschner MW, Gygi SP., Proc. Natl. Acad. Sci. U.S.A. 100(12), 2003
PMID: 12771378
Label-free detection of differential protein expression by LC/MALDI mass spectrometry.
Neubert H, Bonnert TP, Rumpel K, Hunt BT, Henle ES, James IT., J. Proteome Res. 7(6), 2008
PMID: 18412385
A computational approach toward label-free protein quantification using predicted peptide detectability.
Tang H, Arnold RJ, Alves P, Xun Z, Clemmer DE, Novotny MV, Reilly JP, Radivojac P., Bioinformatics 22(14), 2006
PMID: 16873510
Computational prediction of proteotypic peptides for quantitative proteomics.
Mallick P, Schirle M, Chen SS, Flory MR, Lee H, Martin D, Ranish J, Raught B, Schmitt R, Werner T, Kuster B, Aebersold R., Nat. Biotechnol. 25(1), 2006
PMID: 17195840
Peptide mass fingerprinting peak intensity prediction: extracting knowledge from spectra.
Gay S, Binz PA, Hochstrasser DF, Appel RD., Proteomics 2(10), 2002
PMID: 12422355
Rapid identification of proteins by peptide-mass fingerprinting.
Pappin DJ, Hojrup P, Bleasby AJ., Curr. Biol. 3(6), 1993
PMID: 15335725
Smoothing and Differentiation of Data by Simplified Least Squares Procedures
Savitzky A, Golay JEM., 1964
Informatics platform for global proteomic profiling and biomarker discovery using liquid chromatography-tandem mass spectrometry.
Radulovic D, Jelveh S, Ryu S, Hamilton TG, Foss E, Mao Y, Emili A., Mol. Cell Proteomics 3(10), 2004
PMID: 15269249
Quantitation of SR 27417 in Human Plasma Using Electrospray Liquid Chromatography-Tandem Mass Spectrometry: A Study of Ion Suppression
Buhrman D, Price P, Rudewicz P., 1996
Shrinking the Tube: A New Support Vector Regression Algorithm
Schölkopf B, Bartlett P, Smola A, Williamson R., 1999
A Tutorial on Support Vector Machines for Pattern Recognition
Burges CJ., 1998
Learning with the Self-Organizing Map
Ritter H., 1991

Chambers JM, Hastie TJ., 1992

Vapnik VN., 1995

AUTHOR UNKNOWN, 2006

Dimitriadou E, Hornik K, Leisch F, Meyer D, Weingessel A., 2006
Self-Organized Formation of Topologically Correct Feature Maps
Kohonen T., 1982
SOM-based Peptide Prototyping for Mass Spectrometry Peak Intensity Prediction
Scherbart A, Timm W, Böcker S, Nattkemper TW., 2007
Locally-Weighted Regression: An Approach to Regression Analysis by Local Fitting
Cleveland WS, Devlin SJ., 1988
Associative Reinforcement Learning for Optimal Control
Millington PJ, Baker WL., 1990
Local regression: Automatic kernel carpentry
Hastie T, Loader C., 1993
AAindex: Amino Acid Index Database.
Kawashima S, Ogata H, Kanehisa M., Nucleic Acids Res. 27(1), 1999
PMID: 9847231

Hastie T, Tibshirani R, Friedman J., 2001
Computed Conformational States of the 20 Naturally Occurring Amino Acid Residues and of the Prototype Residue α-Aminobutyric Acid
Vásquez M, Némethy G, Scheraga HA., 2001
Prediction of protein surface accessibility with information theory.
Naderi-Manesh H, Sadeghi M, Arab S, Moosavi Movahedi AA., Proteins 42(4), 2001
PMID: 11170200
Physicochemical Basis of Amino Acid Hydrophobicity Scales: Evaluation of Four New Scales of Amino Acid Hydrophobicity Coefficients Derived from RP-HPLC of Peptides
Wilce MCJ, Aguilar MI, Hearn MTW., 1995
Amino acid side chain parameters for correlation studies in biology and pharmacology.
Fauchere JL, Charton M, Kier LB, Verloop A, Pliska V., Int. J. Pept. Protein Res. 32(4), 1988
PMID: 3209351
Random Forests
Breiman L., 2001

Breiman L., 2002
The Kerr Effect of Amino Acids in Water
Khanarian G, Moore WJ., 1980
GenDB--an open source genome annotation system for prokaryote genomes.
Meyer F, Goesmann A, McHardy AC, Bartels D, Bekel T, Clausen J, Kalinowski J, Linke B, Rupp O, Giegerich R, Puhler A., Nucleic Acids Res. 31(8), 2003
PMID: 12682369


Sources

PMID: 18937839