Statistics and algorithms for peptide mass fingerprinting

Kaltenbach H-M (2007)
Bielefeld (Germany): Bielefeld University.

Bielefelder E-Dissertation | Englisch
 
Download
OA
Autor*in
Kaltenbach, Hans-Michael
Gutachter*in / Betreuer*in
Böcker, Sebastian
Abstract / Bemerkung
We investigate several mathematical and algorithmical aspects of peptide mass fingerprinting (PMF). In a PMF experiment, a purified protein sample is digested by a protease using an enzymatic cleavage reaction, the masses of the resulting peptides are measured by mass spectrometry, yielding the mass fingerprint, and compared to predicted mass fingerprints of reference protein sequences. In the first part, we examine several statistics of PMFs. We introduce random weighted strings over probabilistically weighted alphabets and cleavage schemes as mathematical models for random protein sequences, their molecular mass, and their mass fingerprints. We examine weighted hidden Markov models and Markov additive chains as a general computational framework for the stochastics of protein fragments and their masses. In parallel, we present recurrence equations for the description of these fragments. Using these models, the distribution of fragment lengths, the distribution of fragment masses and the distribution of the number of fragments in a random protein are examined under a random string model of independent, identically distributed characters. We derive the occurrence probability of a certain fragment mass in a random protein sequence of either fixed length or fixed mass. We present efficient dynamic programming algorithms and their time and space complexity for most of the statistics and compare all statistics with their empirical counterparts estimated from an in-silico tryptic digest of the Swiss-Prot protein sequence database. In the second part, we develop a general algorithmic framework for identification of PMFs with a protein sequence database search. We formalize the identification of a mass spectrum as an alignment problem and modify well-known methods developed for sequence analysis to an efficient algorithm for computing an optimal alignment of a measured spectrum and a predicted spectrum. The alignment is based on scoring schemes that allow flexible and consistent incorporation of a multitude of experimental parameters such as accuracy of the measured masses, mass error distribution, sample contamination, and ionization efficiency into the identification procedure. Using the fragment statistics of the first part, we estimate the significance of a protein identification under a well-defined statistical null-model. Finally, we present a family of scoring schemes and show how additional information such as intensity values of measured peaks can be incorporated into an alignment scoring. We demonstrate the applicability of the alignment framework on a real proteomics dataset and compare our results with the standard software MASCOT.
Jahr
2007
Page URI
https://pub.uni-bielefeld.de/record/2305520

Zitieren

Kaltenbach H-M. Statistics and algorithms for peptide mass fingerprinting. Bielefeld (Germany): Bielefeld University; 2007.
Kaltenbach, H. - M. (2007). Statistics and algorithms for peptide mass fingerprinting. Bielefeld (Germany): Bielefeld University.
Kaltenbach, Hans-Michael. 2007. Statistics and algorithms for peptide mass fingerprinting. Bielefeld (Germany): Bielefeld University.
Kaltenbach, H. - M. (2007). Statistics and algorithms for peptide mass fingerprinting. Bielefeld (Germany): Bielefeld University.
Kaltenbach, H.-M., 2007. Statistics and algorithms for peptide mass fingerprinting, Bielefeld (Germany): Bielefeld University.
H.-M. Kaltenbach, Statistics and algorithms for peptide mass fingerprinting, Bielefeld (Germany): Bielefeld University, 2007.
Kaltenbach, H.-M.: Statistics and algorithms for peptide mass fingerprinting. Bielefeld University, Bielefeld (Germany) (2007).
Kaltenbach, Hans-Michael. Statistics and algorithms for peptide mass fingerprinting. Bielefeld (Germany): Bielefeld University, 2007.
Alle Dateien verfügbar unter der/den folgenden Lizenz(en):
Copyright Statement:
Dieses Objekt ist durch das Urheberrecht und/oder verwandte Schutzrechte geschützt. [...]
Volltext(e)
Access Level
OA Open Access
Zuletzt Hochgeladen
2019-09-25T06:27:05Z
MD5 Prüfsumme
dfac86db0a177adf0fbf28fc43559718


Export

Markieren/ Markierung löschen
Markierte Publikationen

Open Data PUB

Suchen in

Google Scholar