An overview of the wcd EST clustering tool

Hazelhurst, Scott; Hide, Winston; Lipták, Zsuzsanna; Nogueira, Ramon; Starfield, Richard

Hazelhurst S, Hide W, Lipták Z, Nogueira R, Starfield R (2008)
BIOINFORMATICS 24(13): 1542-1546.

Journal article | Published | English

Download

No files have been uploaded. Publication record only.

DOI

https://doi.org/10.1093/bioinformatics/btn203

Author

Hazelhurst, Scott; Hide, Winston; Lipták, Zsuzsanna (UniBi); Nogueira, Ramon; Starfield, Richard

Department

Technische Fakultät > AG Genominformatik
Centrum für Biotechnologie > Institut für Bioinformatik

Abstract / Note

The wcd system is an open source tool for clustering expressed sequence tags (ESTs) and other DNA and RNA sequences. wcd allows efficient all-versus-all comparison of ESTs using either the d2 distance function or edit distance, improving on existing implementations of d2. It supports merging, refinement and reclustering of clusters. It is drop-in compatible with the StackPack clustering package. wcd supports parallelization under both shared memory and cluster architectures. It is distributed with an EMBOSS wrapper, allowing wcd to be installed as part of an EMBOSS installation (and so provided by a web server).
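
For illustration, the d2 (d-squared) distance named in the abstract can be sketched as follows. This is a minimal sketch and not wcd's implementation. It assumes the common definition of d2 as the sum of squared differences in word counts between two windows, minimised over all window pairs, and it assumes single-linkage clustering, where two sequences are linked when their score is at or below a threshold. The function names, word length, window size and threshold are illustrative choices, and reverse complements are ignored.

```python
from collections import Counter
from itertools import combinations


def word_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_window(a, b, k):
    """Sum of squared differences in word counts between two windows."""
    ca, cb = word_counts(a, k), word_counts(b, k)
    return sum((ca[w] - cb[w]) ** 2 for w in ca.keys() | cb.keys())


def windows(seq, size):
    """Yield every window of the given size (the whole sequence if shorter)."""
    if len(seq) <= size:
        yield seq
        return
    for i in range(len(seq) - size + 1):
        yield seq[i:i + size]


def d2(x, y, k=3, window=8):
    """Minimum windowed score over all pairs of windows of x and y."""
    return min(
        d2_window(a, b, k)
        for a in windows(x, window)
        for b in windows(y, window)
    )


def cluster(seqs, threshold, k=3, window=8):
    """Single-linkage clustering (union-find) over an all-versus-all comparison."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if d2(seqs[i], seqs[j], k, window) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy input: the first two sequences share the 8-base window "GTACGTAC".
seqs = ["ACGTACGTAC", "GTACGTACGG", "TTTTTTTTTT"]
print(cluster(seqs, threshold=5))  # [[0, 1], [2]]
```

The quadratic all-versus-all loop is kept for clarity only; it does not reflect the efficiency techniques or the parallelization the abstract describes.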

Publication year

2008

Journal title

BIOINFORMATICS

Volume

24

Issue

13

Page(s)

1542-1546

ISSN

1367-4803

eISSN

1460-2059

Page URI

https://pub.uni-bielefeld.de/record/1587304

Data provided by the European Bioinformatics Institute (EBI)

19 citations in Europe PMC

Data provided by Europe PubMed Central.

MeShClust: an intelligent tool for clustering DNA sequences.
James BT, Luczak BB, Girgis HZ., Nucleic Acids Res 46(14), 2018
PMID: 29718317

Inferring bona fide transfrags in RNA-Seq derived-transcriptome assemblies of non-model organisms.
Mbandi SK, Hesse U, van Heusden P, Christoffels A., BMC Bioinformatics 16, 2015
PMID: 25880035

Construction of a public CHO cell line transcript database using versatile bioinformatics analysis pipelines.
Rupp O, Becker J, Brinkrolf K, Timmermann C, Borth N, Pühler A, Noll T, Goesmann A., PLoS One 9(1), 2014
PMID: 24427317

Development of EST-based SNP and InDel markers and their utilization in tetraploid cotton genetic mapping.
Li X, Gao W, Guo H, Zhang X, Fang DD, Lin Z., BMC Genomics 15, 2014
PMID: 25442170

EasyCluster2: an improved tool for clustering and assembling long transcriptome reads.
Bevilacqua V, Pietroleonardo N, Giannino E, Stroppa F, Simone D, Pesole G, Picardi E., BMC Bioinformatics 15 Suppl 15, 2014
PMID: 25474441

Evolution of saxitoxin synthesis in cyanobacteria and dinoflagellates.
Hackett JD, Wisecaver JH, Brosnahan ML, Kulis DM, Anderson DM, Bhattacharya D, Plumley FG, Erdner DL., Mol Biol Evol 30(1), 2013
PMID: 22628533

Analysis of the leaf transcriptome of Musa acuminata during interaction with Mycosphaerella musicola: gene assembly, annotation and marker development.
Passos MA, de Cruz VO, Emediato FL, de Teixeira CC, Azevedo VC, Brasileiro AC, Amorim EP, Ferreira CF, Martins NF, Togawa RC, Júnior GJ, da Silva OB, Miller RN., BMC Genomics 14, 2013
PMID: 23379821

A de novo assembly of the newt transcriptome combined with proteomic validation identifies new protein families expressed during tissue regeneration.
Looso M, Preussner J, Sousounis K, Bruckskotten M, Michel CS, Lignelli E, Reinhardt R, Höffner S, Krüger M, Tsonis PA, Borchardt T, Braun T., Genome Biol 14(2), 2013
PMID: 23425577

A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.
Ng KH, Ho CK, Phon-Amnuaisuk S., PLoS One 7(10), 2012
PMID: 23071763

Bio-crude transcriptomics: gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa).
Molnár I, Lopez D, Wisecaver JH, Devarenne TP, Weiss TL, Pellegrini M, Hackett JD., BMC Genomics 13, 2012
PMID: 23110428

High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species.
Grattapaglia D, Silva-Junior OB, Kirst M, de Lima BM, Faria DA, Pappas GJ., BMC Plant Biol 11, 2011
PMID: 21492434

Revealing impaired pathways in the an11 mutant by high-throughput characterization of Petunia axillaris and Petunia inflata transcriptomes.
Zenoni S, D'Agostino N, Tornielli GB, Quattrocchio F, Chiusano ML, Koes R, Zethof J, Guzzo F, Delledonne M, Frusciante L, Gerats T, Pezzotti M., Plant J 68(1), 2011
PMID: 21623977

SEED: efficient clustering of next-generation sequences.
Bao E, Jiang T, Kaloshian I, Girke T., Bioinformatics 27(18), 2011
PMID: 21810899

KABOOM! A new suffix array based algorithm for clustering expression data.
Hazelhurst S, Lipták Z., Bioinformatics 27(24), 2011
PMID: 21984769

Clustering algorithms in biomedical research: a review.
Xu R, Wunsch DC., IEEE Rev Biomed Eng 3, 2010
PMID: 22275205

PEACE: Parallel Environment for Assembly and Clustering of Gene Expression.
Rao DM, Moler JC, Ozden M, Zhang Y, Liang C, Karro JE., Nucleic Acids Res 38(web server issue), 2010
PMID: 20522511

EasyCluster: a fast and efficient gene-oriented clustering tool for large-scale transcriptome data.
Picardi E, Mignone F, Pesole G., BMC Bioinformatics 10 Suppl 6, 2009
PMID: 19534735

k-link EST clustering: evaluating error introduced by chimeric sequences under different degrees of linkage.
Bragg LM, Stone G., Bioinformatics 25(18), 2009
PMID: 19570806

SolEST database: a "one-stop shop" approach to the study of Solanaceae transcriptomes.
D'Agostino N, Traini A, Frusciante L, Chiusano ML., BMC Plant Biol 9, 2009
PMID: 19948013

11 References

Data provided by Europe PubMed Central.

Algorithms for clustering expressed sequence tags: the tool
Hazelhurst S., 2008

ESTsim: a tool for creating benchmarks for EST clustering algorithms
Hazelhurst S, Bergheim A., 2003

Biological evaluation of d2, an algorithm for high-performance sequence comparison.
Hide W, Burke J, Davison DB., J. Comput. Biol. 1(3), 1994
PMID: 8790465

CAP3: A DNA sequence assembly program.
Huang X, Madan A., Genome Res. 9(9), 1999
PMID: 10508846

Integrative annotation of 21,037 human genes validated by full-length cDNA clones.
Imanishi T, Itoh T, Suzuki Y, O'Donovan C, Fukuchi S, Koyanagi KO, Barrero RA, Tamura T, Yamaguchi-Kabata Y, Tanino M, Yura K, Miyazaki S, Ikeo K, Homma K, Kasprzyk A, Nishikawa T, Hirakawa M, Thierry-Mieg J, Thierry-Mieg D, Ashurst J, Jia L, Nakao M, Thomas MA, Mulder N, Karavidopoulou Y, Jin L, Kim S, Yasuda T, Lenhard B, Eveno E, Suzuki Y, Yamasaki C, Takeda J, Gough C, Hilton P, Fujii Y, Sakai H, Tanaka S, Amid C, Bellgard M, Bonaldo Mde F, Bono H, Bromberg SK, Brookes AJ, Bruford E, Carninci P, Chelala C, Couillault C, de Souza SJ, Debily MA, Devignes MD, Dubchak I, Endo T, Estreicher A, Eyras E, Fukami-Kobayashi K, Gopinath GR, Graudens E, Hahn Y, Han M, Han ZG, Hanada K, Hanaoka H, Harada E, Hashimoto K, Hinz U, Hirai M, Hishiki T, Hopkinson I, Imbeaud S, Inoko H, Kanapin A, Kaneko Y, Kasukawa T, Kelso J, Kersey P, Kikuno R, Kimura K, Korn B, Kuryshev V, Makalowska I, Makino T, Mano S, Mariage-Samson R, Mashima J, Matsuda H, Mewes HW, Minoshima S, Nagai K, Nagasaki H, Nagata N, Nigam R, Ogasawara O, Ohara O, Ohtsubo M, Okada N, Okido T, Oota S, Ota M, Ota T, Otsuki T, Piatier-Tonneau D, Poustka A, Ren SX, Saitou N, Sakai K, Sakamoto S, Sakate R, Schupp I, Servant F, Sherry S, Shiba R, Shimizu N, Shimoyama M, Simpson AJ, Soares B, Steward C, Suwa M, Suzuki M, Takahashi A, Tamiya G, Tanaka H, Taylor T, Terwilliger JD, Unneberg P, Veeramachaneni V, Watanabe S, Wilming L, Yasuda N, Yoo HS, Stodolsky M, Makalowski W, Go M, Nakai K, Takagi T, Kanehisa M, Sakaki Y, Quackenbush J, Okazaki Y, Hayashizaki Y, Hide W, Chakraborty R, Nishikawa K, Sugawara H, Tateno Y, Chen Z, Oishi M, Tonellato P, Apweiler R, Okubo K, Wagner L, Wiemann S, Strausberg RL, Isogai T, Auffray C, Nomura N, Gojobori T, Sugano S., PLoS Biol. 2(6), 2004
PMID: 15103394

Space and time efficient parallel algorithms and software for EST clustering
Kalyanaraman A., 2003

Fast sequence clustering using a suffix array algorithm.
Malde K, Coward E, Jonassen I., Bioinformatics 19(10), 2003
PMID: 12835265

A comprehensive approach to clustering of expressed human gene sequence: the sequence tag alignment and consensus knowledge base.
Miller RT, Christoffels AG, Gopalakrishnan C, Burke J, Ptitsyn AA, Broveak TR, Hide WA., Genome Res. 9(11), 1999
PMID: 10568754

A hitchhiker's guide to expressed sequence tag (EST) analysis.
Nagaraj SH, Gasser RB, Ranganathan S., Brief. Bioinformatics 8(1), 2006
PMID: 16772268

StackPACK clustering system
Reed G., 2001

Algorithms for the Analysis of Expressed Sequence Tags
Slater G., 2000
