Conveyor: a workflow engine for bioinformatic analyses

Linke B, Giegerich R, Goesmann A (2011)
Bioinformatics 27(7): 903-911.

Journal Article | Published | English


Abstract
Motivation: The rapidly increasing amounts of data available from new high-throughput methods have made data processing without automated pipelines infeasible. As was pointed out in several publications, integration of data and analytic resources into workflow systems provides a solution to this problem, simplifying the task of data analysis. Various applications for defining and running workflows in the field of bioinformatics have been proposed and published, e.g. Galaxy, Mobyle, Taverna, Pegasus or Kepler. One of the main aims of such workflow systems is to enable scientists to focus on analysing their datasets instead of taking care of data management, job management or monitoring the execution of computational tasks. The currently available workflow systems achieve this goal, but differ fundamentally in the way they execute workflows.

Results: We have developed the Conveyor software library, a multitiered generic workflow engine for composition, execution and monitoring of complex workflows. It features an open, extensible system architecture and concurrent program execution to exploit resources available on modern multicore CPU hardware. It offers the ability to build complex workflows with branches, loops and other control structures. Two example use cases illustrate the application of the versatile Conveyor engine to common bioinformatics problems.

Cite this

Linke B, Giegerich R, Goesmann A. Conveyor: a workflow engine for bioinformatic analyses. Bioinformatics. 2011;27(7):903-911.
Linke, Burkhard, Giegerich, Robert, and Goesmann, Alexander. “Conveyor: a workflow engine for bioinformatic analyses”. Bioinformatics 27.7 (2011): 903-911.

9 Citations in Europe PMC

Data provided by Europe PubMed Central.

Experiences with workflows for automating data-intensive bioinformatics.
Spjuth O, Bongcam-Rudloff E, Hernandez GC, Forer L, Giovacchini M, Guimera RV, Kallio A, Korpelainen E, Kandula MM, Krachunov M, Kreil DP, Kulev O, Labaj PP, Lampa S, Pireddu L, Schonherr S, Siretskiy A, Vassilev D., Biol. Direct 10, 2015
PMID: 26282399
Bioinformatic pipelines in Python with Leaf.
Napolitano F, Mariani-Costantini R, Tagliaferri R., BMC Bioinformatics 14, 2013
PMID: 23786315
Streaming support for data intensive cloud-based sequence analysis.
Issa SA, Kienzler R, El-Kalioby M, Tonellato PJ, Wall D, Bruggmann R, Abouelhoda M., Biomed Res Int 2013, 2013
PMID: 23710461
Personalized cloud-based bioinformatics services for research and education: use cases and the elasticHPC package.
El-Kalioby M, Abouelhoda M, Kruger J, Giegerich R, Sczyrba A, Wall DP, Tonellato P., BMC Bioinformatics 13 Suppl 17, 2012
PMID: 23281941
The Wasp System: an open source environment for managing and analyzing genomic data.
McLellan AS, Dubin RA, Jing Q, Broin PO, Moskowitz D, Suzuki M, Calder RB, Hargitai J, Golden A, Greally JM., Genomics 100(6), 2012
PMID: 22944616
Tavaxy: integrating Taverna and Galaxy workflows with cloud computing support.
Abouelhoda M, Issa SA, Ghanem M., BMC Bioinformatics 13, 2012
PMID: 22559942
Extending KNIME for next-generation sequencing data analysis.
Jagla B, Wiswedel B, Coppee JY., Bioinformatics 27(20), 2011
PMID: 21873641

17 References

AUTHOR UNKNOWN, scientific program j 13(), 2005
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ., Nucleic Acids Res. 25(17), 1997
PMID: 9254694
Improved microbial gene identification with GLIMMER.
Delcher AL, Harmon D, Kasif S, White O, Salzberg SL., Nucleic Acids Res. 27(23), 1999
PMID: 10556321
XML schemas for common bioinformatic data types and their application in workflow systems.
Seibel PN, Kruger J, Hartmeier S, Schwarzer K, Lowenthal K, Mersch H, Dandekar T, Giegerich R., BMC Bioinformatics 7, 2006
PMID: 17087823
Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences.
Goecks J, Nekrutenko A, Taylor J; Galaxy Team, Afgan E, Ananda G, Baker D, Blankenberg D, Chakrabarty R, Coraor N, Goecks J, Von Kuster G, Lazarus R, Li K, Nekrutenko A, Taylor J, Vincent K., Genome Biol. 11(8), 2010
PMID: 20738864
BioXSD: the common data-exchange format for everyday bioinformatics web services.
Kalas M, Puntervoll P, Joseph A, Bartaseviciute E, Topfer A, Venkataraman P, Pettifer S, Bryne JC, Ison J, Blanchet C, Rapacki K, Jonassen I., Bioinformatics 26(18), 2010
PMID: 20823319
Ruffus: a lightweight Python library for computational pipelines.
Goodstadt L., Bioinformatics 26(21), 2010
PMID: 20847218
Taverna: a tool for building and running workflows of services.
Hull D, Wolstencroft K, Stevens R, Goble C, Pocock MR, Li P, Oinn T., Nucleic Acids Res. 34(Web Server issue), 2006
PMID: 16845108
Interoperability with Moby 1.0--it's better than sharing your toothbrush!
BioMoby Consortium, Wilkinson MD, Senger M, Kawas E, Bruskiewich R, Gouzy J, Noirot C, Bardou P, Ng A, Haase D, Saiz Ede A, Wang D, Gibbons F, Gordon PM, Sensen CW, Carrasco JM, Fernandez JM, Shen L, Links M, Ng M, Opushneva N, Neerincx PB, Leunissen JA, Ernst R, Twigger S, Usadel B, Good B, Wong Y, Stein L, Crosby W, Karlsson J, Royo R, Parraga I, Ramirez S, Gelpi JL, Trelles O, Pisano DG, Jimenez N, Kerhornou A, Rosset R, Zamacola L, Tarraga J, Huerta-Cepas J, Carazo JM, Dopazo J, Guigo R, Navarro A, Orozco M, Valencia A, Claros MG, Perez AJ, Aldana J, Rojano M, Fernandez-Santa Cruz R, Navas I, Schiltz G, Farmer A, Gessler D, Schoof H, Groscurth A., Brief. Bioinformatics 9(3), 2008
PMID: 18238804
Solutions for data integration in functional genomics: a critical assessment and case study.
Smedley D, Swertz MA, Wolstencroft K, Proctor G, Zouberakis M, Bard J, Hancock JM, Schofield P., Brief. Bioinformatics 9(6), 2008
PMID: 19112082
The Sequence Alignment/Map format and SAMtools.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup., Bioinformatics 25(16), 2009
PMID: 19505943
Mobyle: a new full web bioinformatics framework.
Neron B, Menager H, Maufrais C, Joly N, Maupetit J, Letort S, Carrere S, Tuffery P, Letondal C., Bioinformatics 25(22), 2009
PMID: 19689959
The Universal Protein Resource (UniProt) in 2010
The UniProt Consortium, Nucleic Acids Res. 38(Database issue), 2010
GenBank.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW., Nucleic Acids Res. 38(Database issue), 2010
PMID: 19910366

Sources

PMID: 21278189