Benchmarking tools for the alignment of functional noncoding DNA

Pollard DA, Bergman CM, Stoye J, Celniker SE, Eisen MB (2004)
BMC Bioinformatics 5(1): 6.

Journal article | Published | English
 
Authors
Pollard, Daniel A.; Bergman, Casey M.; Stoye, Jens; Celniker, Susan E.; Eisen, Michael B.
Abstract / Notes
Background: Numerous tools have been developed to align genomic sequences. However, their relative performance in specific applications remains poorly characterized. Alignments of protein-coding sequences typically have been benchmarked against "correct" alignments inferred from structural data. For noncoding sequences, where such independent validation is lacking, simulation provides an effective means to generate "correct" alignments with which to benchmark alignment tools.
Results: Using rates of noncoding sequence evolution estimated from the genus Drosophila, we simulated alignments over a range of divergence times under varying models incorporating point substitution, insertion/deletion events, and short blocks of constrained sequences such as those found in cis-regulatory regions. We then compared "correct" alignments generated by a modified version of the ROSE simulation platform to alignments of the simulated derived sequences produced by eight pairwise alignment tools (Avid, BlastZ, Chaos, ClustalW, DiAlign, Lagan, Needle, and WABA) to determine the off-the-shelf performance of each tool. As expected, the ability to align noncoding sequences accurately decreases with increasing divergence for all tools, and declines faster in the presence of insertion/deletion evolution. Global alignment tools (Avid, ClustalW, Lagan, and Needle) typically have higher sensitivity over entire noncoding sequences as well as in constrained sequences. Local tools (BlastZ, Chaos, and WABA) have lower overall sensitivity as a consequence of incomplete coverage, but have high specificity to detect constrained sequences as well as high sensitivity within the subset of sequences they align. Tools such as DiAlign, which generate both local and global outputs, produce alignments of constrained sequences with both high sensitivity and specificity for divergence distances in the range of 1.25 - 3.0 substitutions per site.
Conclusion: For species with genomic properties similar to Drosophila, we conclude that a single pair of optimally diverged species analyzed with a high performance alignment tool can yield accurate and specific alignments of functionally constrained noncoding sequences. Further algorithm development, optimization of alignment parameters, and benchmarking studies will be necessary to extract the maximal biological information from alignments of functional noncoding DNA.
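The comparison of a tool's alignment against a simulated "correct" alignment can be illustrated with a minimal sketch. It assumes that sensitivity is scored as the fraction of residue pairs aligned in the reference that are also aligned in the tool output, which is one common convention and not necessarily the exact definition used in the paper. The function names, the `sites` parameter, and the toy sequences are hypothetical and chosen only for illustration.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) residue index pairs that share a column.

    i and j are 0-based positions in the ungapped sequences A and B.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        # Advance each sequence index only when a residue was consumed.
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def sensitivity(true_rows, test_rows, sites=None):
    """Fraction of reference-aligned pairs recovered by the test alignment.

    sites (optional, an assumption of this sketch) restricts scoring to
    positions of sequence A, e.g. a simulated constrained block.
    """
    true_pairs = aligned_pairs(*true_rows)
    if sites is not None:
        true_pairs = {p for p in true_pairs if p[0] in sites}
    if not true_pairs:
        return 0.0
    test_pairs = aligned_pairs(*test_rows)
    return len(true_pairs & test_pairs) / len(true_pairs)


# Toy data: a reference alignment with two indels, and a gap-free guess.
TRUE_ROWS = ("ACGT-ACGT", "ACGTTAC-T")
TEST_ROWS = ("ACGTACGT", "ACGTTACT")

overall = sensitivity(TRUE_ROWS, TEST_ROWS)
in_block = sensitivity(TRUE_ROWS, TEST_ROWS, sites={4, 5, 6})
print(f"overall sensitivity: {overall:.3f}")
print(f"sensitivity in sites 4-6: {in_block:.3f}")
```

In this toy case the gap-free guess recovers the pairs before the first indel but misplaces those after it, so sensitivity restricted to a block downstream of the indel drops even though the overall score stays moderate.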
Publication year
2004
Journal title
BMC Bioinformatics
Volume
5
Issue
1
Page(s)
6
ISSN
1471-2105
Page URI
https://pub.uni-bielefeld.de/record/1773312

Cite

Pollard DA, Bergman CM, Stoye J, Celniker SE, Eisen MB. Benchmarking tools for the alignment of functional noncoding DNA. BMC Bioinformatics. 2004;5(1):6.
Pollard, D. A., Bergman, C. M., Stoye, J., Celniker, S. E., & Eisen, M. B. (2004). Benchmarking tools for the alignment of functional noncoding DNA. BMC Bioinformatics, 5(1), 6. doi:10.1186/1471-2105-5-6
Pollard, D. A., Bergman, C. M., Stoye, J., Celniker, S. E., and Eisen, M. B. (2004). Benchmarking tools for the alignment of functional noncoding DNA. BMC Bioinformatics 5, 6.
Pollard, D.A., et al., 2004. Benchmarking tools for the alignment of functional noncoding DNA. BMC Bioinformatics, 5(1), p 6.
D.A. Pollard, et al., “Benchmarking tools for the alignment of functional noncoding DNA”, BMC Bioinformatics, vol. 5, 2004, pp. 6.
Pollard, D.A., Bergman, C.M., Stoye, J., Celniker, S.E., Eisen, M.B.: Benchmarking tools for the alignment of functional noncoding DNA. BMC Bioinformatics. 5, 6 (2004).
Pollard, Daniel A., Bergman, Casey M., Stoye, Jens, Celniker, Susan E., and Eisen, Michael B. “Benchmarking tools for the alignment of functional noncoding DNA”. BMC Bioinformatics 5.1 (2004): 6.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access
Last uploaded
2019-09-06T08:48:07Z
MD5 checksum
1b4340c1a2b57cc6231ff754938245a4

Sources

PMID: 14736341
PubMed | Europe PMC
