Semantics and Ambiguity of Stochastic RNA Family Models

Giegerich R, Hoener Zu Siederdissen C (2011)
IEEE/ACM Transactions on Computational Biology and Bioinformatics 8(2): 499-516.

Journal article | Published | English
 
Authors
Giegerich, Robert (UniBi); Hoener Zu Siederdissen, Christian
Abstract / Note
Stochastic models, such as hidden Markov models or stochastic context-free grammars (SCFGs), can fail to return the correct, maximum-likelihood solution in the case of semantic ambiguity. This problem arises when the algorithm implementing the model inspects the same solution in different guises. It is a difficult problem in the sense that proving semantic nonambiguity has been shown to be algorithmically undecidable, while compensating for it (by coalescing scores of equivalent solutions) has been shown to be NP-hard. For stochastic context-free grammars modeling RNA secondary structure, it has been shown that the distortion of results can be quite severe. Much less is known about the case when stochastic context-free grammars model the matching of a query sequence to an implicit consensus structure for an RNA family. We find that three different, meaningful semantics can be associated with the matching of a query against the model: a structural, an alignment, and a trace semantics. Rfam models correctly implement the alignment semantics, and are ambiguous with respect to the other two semantics, which are more abstract. We show how provably correct models can be generated for the trace semantics. For approaches where such a proof is not possible, we present an automated pipeline to check post factum for ambiguity of the generated models. We propose that both the structure and the trace semantics are worthwhile concepts for further study, possibly better suited to capture remotely related family members.
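
The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or pipeline; the derivation labels, solution names, and probabilities below are hypothetical toy values chosen only to show how picking the single best derivation can disagree with picking the best solution once the scores of equivalent derivations are coalesced.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper):
# (derivation label, solution it denotes, probability).
# "d2" and "d3" are two guises of the same solution "B".
DERIVATIONS = [
    ("d1", "A", 0.4),
    ("d2", "B", 0.3),
    ("d3", "B", 0.3),
]


def best_derivation(derivations):
    """Viterbi-style choice: return the solution of the single most probable derivation."""
    _label, solution, prob = max(derivations, key=lambda d: d[2])
    return solution, prob


def best_solution(derivations):
    """Sum the probabilities of derivations that denote the same solution, then pick the best."""
    totals = defaultdict(float)
    for _label, solution, prob in derivations:
        totals[solution] += prob
    solution = max(totals, key=totals.get)
    return solution, totals[solution]


if __name__ == "__main__":
    print(best_derivation(DERIVATIONS))  # chooses "A" on the strength of d1 alone
    print(best_solution(DERIVATIONS))    # chooses "B" after coalescing d2 and d3
```

The sketch enumerates derivations explicitly, which is only feasible for toy inputs; as the abstract notes, coalescing the scores of equivalent solutions is NP-hard in general.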
Keywords
RNA secondary structure; covariance models; ambiguity; semantic; RNA family models
Publication year
2011
Journal title
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume
8
Issue
2
Page(s)
499-516
ISSN
1545-5963
Page URI
https://pub.uni-bielefeld.de/record/2003355

Cite as

Giegerich, R., & Hoener Zu Siederdissen, C. (2011). Semantics and Ambiguity of Stochastic RNA Family Models. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8(2), 499-516. https://doi.org/10.1109/TCBB.2010.12

3 citations in Europe PMC

Data provided by Europe PubMed Central.

Product Grammars for Alignment and Folding.
Höner Zu Siederdissen C, Hofacker IL, Stadler PF., IEEE/ACM Trans Comput Biol Bioinform 12(3), 2015
PMID: 26357262
Ambivalent covariance models.
Janssen S, Giegerich R., BMC Bioinformatics 16, 2015
PMID: 26017195

Sources

PMID: 21233528