Quality control for terms and definitions in ontologies and taxonomies

Köhler J, Munn K, Rüegg A, Skusa A, Smith B (2006)
BMC Bioinformatics 7(1): 212.

Journal Article | Published | English
Background: Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics. A series of recent papers has shown that the Gene Ontology (GO), the most prominent taxonomic resource in these fields, is marked by flaws of certain characteristic types, which flow from a failure to address basic ontological principles. As yet, no methods have been proposed which would allow ontology curators to pinpoint flawed terms or definitions in ontologies in a systematic way.

Results: We present computational methods that automatically identify terms and definitions which are defined in a circular or unintelligible way. We further demonstrate the potential of these methods by applying them to isolate a subset of 6001 problematic GO terms. By automatically aligning GO with other ontologies and taxonomies we were able to propose alternative synonyms and definitions for some of these problematic terms. This allows us to demonstrate that these other resources do not contain definitions superior to those supplied by GO.

Conclusion: Our methods provide reliable indications of the quality of terms and definitions in ontologies and taxonomies. Further, they are well suited to assist ontology curators in drawing their attention to those terms that are ill-defined. We have further shown the limitations of ontology mapping and alignment in assisting ontology curators in rectifying problems, thus pointing to the need for manual curation.
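
As a rough illustration of what an automatic check for circular definitions could look like, the sketch below scores how many of a term's content words reappear in its own definition. This is an assumption made for illustration, not the procedure used in the paper: the stop-word list, the 0.5 threshold, the function names and the two sample entries are all hypothetical, and the matching is exact (no stemming, so "membranes" does not match "membrane").

```python
import re

# Hypothetical stop-word list; a real check would need a fuller one.
STOP_WORDS = {"a", "an", "and", "by", "in", "of", "or", "the", "to"}


def content_words(text):
    """Lower-case the text and return its non-stop-word tokens as a set."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def circularity(term, definition):
    """Fraction of the term's content words that reappear in its definition."""
    term_words = content_words(term)
    if not term_words:
        return 0.0
    shared = term_words & content_words(definition)
    return len(shared) / len(term_words)


# Made-up entries for illustration only; they are not taken from GO.
entries = {
    "membrane fusion": "The fusion of two membranes.",
    "glycolysis": "The breakdown of glucose into pyruvate.",
}

# Flag entries whose definitions reuse at least half of the term's words
# (the 0.5 threshold is an arbitrary assumption).
flagged = [t for t, d in entries.items() if circularity(t, d) >= 0.5]
```

A score like this can only rank candidates for a curator to review; a definition may legitimately reuse a word from the term it defines.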
Cite this

Köhler, J., Munn, K., Rüegg, A., Skusa, A., & Smith, B. (2006). Quality control for terms and definitions in ontologies and taxonomies. BMC Bioinformatics, 7(1), 212. doi:10.1186/1471-2105-7-212
Access Level: Open Access (OA)

15 Citations in Europe PMC

Data provided by Europe PubMed Central.

Relating Complexity and Error Rates of Ontology Concepts. More Complex NCIt Concepts Have More Errors.
Min H, Zheng L, Perl Y, Halper M, De Coronado S, Ochs C., Methods Inf Med 56(3), 2017
PMID: 28244549
Easy Extraction of Terms and Definitions with OWL2TL.
Judkins J, Utecht J, Brochhausen M., CEUR Workshop Proc 1747, 2016
PMID: 28035214
Towards natural language question generation for the validation of ontologies and mappings.
Ben Abacha A, Dos Reis JC, Mrabet Y, Pruski C, Da Silveira M., J Biomed Semantics 7(1), 2016
PMID: 27502477
Measuring the evolution of ontology complexity: the gene ontology case study.
Dameron O, Bettembourg C, Le Meur N., PLoS ONE 8(10), 2013
PMID: 24146805
A UML profile for the OBO relation ontology.
Guardia GD, Vencio RZ, de Farias CR., BMC Genomics 13 Suppl 5, 2012
PMID: 23095840
Saliva Ontology: an ontology-based framework for a Salivaomics Knowledge Base.
Ai J, Smith B, Wong DT., BMC Bioinformatics 11, 2010
PMID: 20525291
Two Is Not Always Better Than One: A Critical Evaluation of Two-System Theories.
Keren G, Schul Y., Perspect Psychol Sci 4(6), 2009
PMID: 26161732
Data integration for plant genomics--exemplars from the integration of Arabidopsis thaliana databases.
Lysenko A, Hindle MM, Taubert J, Saqi M, Rawlings CJ., Brief. Bioinformatics 10(6), 2009
PMID: 19933213
Structural group-based auditing of missing hierarchical relationships in UMLS.
Chen Y, Gu HH, Perl Y, Geller J., J Biomed Inform 42(3), 2009
PMID: 18824248
Structural group auditing of a UMLS semantic type's extent.
Chen Y, Gu HH, Perl Y, Geller J, Halper M., J Biomed Inform 42(1), 2009
PMID: 18619563
Experiences mapping a legacy interface terminology to SNOMED CT.
Wade G, Rosenbloom ST., BMC Med Inform Decis Mak 8 Suppl 1, 2008
PMID: 19007440
Automated comparative auditing of NCIT genomic roles using NCBI.
Cohen B, Oren M, Min H, Perl Y, Halper M., J Biomed Inform 41(6), 2008
PMID: 18486558
Protein microarray platforms for clinical proteomics.
Pollard HB, Srivastava M, Eidelman O, Jozwik C, Rothwell SW, Mueller GP, Jacobowitz DM, Darling T, Guggino WB, Wright J, Zeitlin PL, Paweletz CP., 2007
PMID: c1879
Topological analysis of large-scale biomedical terminology structures.
Bales ME, Lussier YA, Johnson SB., J Am Med Inform Assoc 14(6), 2007
PMID: 17712094
Bio-ontologies: current trends and future directions.
Bodenreider O, Stevens R., Brief. Bioinformatics 7(3), 2006
PMID: 16899495

PMID: 16623942