The SYSTERS protein sequence cluster set

Krause A, Stoye J, Vingron M (2000)
Nucleic Acids Research 28(1): 270-272.

Journal article | Published | English
Krause, Antje; Stoye, Jens (UniBi); Vingron, Martin
Abstract / Note
The SYSTERS (short for SYSTEmatic Re-Searching) protein sequence cluster set consists of the classification of all sequences from SWISS-PROT and PIR into disjoint protein family clusters and hierarchically into superfamily and subfamily clusters. The cluster set can be searched with a sequence using the SSMAL search tool or a traditional database search tool like BLAST or FASTA. Additionally a multiple alignment is generated for each cluster and annotated with domain information from the Pfam database of protein domain families. A taxonomic overview of the organisms covered by a cluster is given based on the NCBI taxonomy. The cluster set is available for querying and browsing at


All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Access Level
OA Open Access

40 citations in Europe PMC

Data provided by Europe PubMed Central.

Chitinase from Thermomyces lanuginosus SSBP and its biotechnological applications.
Khan FI, Bisetty K, Singh S, Permaul K, Hassan MI., Extremophiles 19(6), 2015
PMID: 26462798
Optimizing high performance computing workflow for protein functional annotation.
Stanberry L, Rekepalli B, Liu Y, Giblock P, Higdon R, Montague E, Broomall W, Kolker N, Kolker E., Concurr Comput 26(13), 2014
PMID: 25313296
Unraveling the Complexities of Life Sciences Data.
Higdon R, Haynes W, Stanberry L, Stewart E, Yandl G, Howard C, Broomall W, Kolker N, Kolker E., Big Data 1(1), 2013
PMID: 27447037
kClust: fast and sensitive clustering of large protein sequence databases.
Hauser M, Mayer CE, Söding J., BMC Bioinformatics 14(), 2013
PMID: 23945046
Structural SCOP superfamily level classification using unsupervised machine learning.
Angadi UB, Venkatesulu M., IEEE/ACM Trans Comput Biol Bioinform 9(2), 2012
PMID: 21844638
BAR-PLUS: the Bologna Annotation Resource Plus for functional and structural annotation of protein sequences.
Piovesan D, Martelli PL, Fariselli P, Zauli A, Rossi I, Casadio R., Nucleic Acids Res 39(web server issue), 2011
PMID: 21622657
CLUSS: clustering of protein sequences based on a new similarity measure.
Kelil A, Wang S, Brzezinski R, Fleury A., BMC Bioinformatics 8(), 2007
PMID: 17683581
Exploiting protein structure data to explore the evolution of protein function and biological complexity.
Marsden RL, Ranea JA, Sillero A, Redfern O, Yeats C, Maibaum M, Lee D, Addou S, Reeves GA, Dallman TJ, Orengo CA., Philos Trans R Soc Lond B Biol Sci 361(1467), 2006
PMID: 16524831
Graph theoretical insights into evolution of multidomain proteins.
Przytycka T, Davis G, Song N, Durand D., J Comput Biol 13(2), 2006
PMID: 16597245
Spectral clustering of protein sequences.
Paccanaro A, Casbon JA, Saqi MA., Nucleic Acids Res 34(5), 2006
PMID: 16547200
The Escherichia coli proteome: past, present, and future prospects.
Han MJ, Lee SY., Microbiol Mol Biol Rev 70(2), 2006
PMID: 16760308
LPC cepstral distortion measure for protein sequence comparison.
Pham TD., IEEE Trans Nanobioscience 5(2), 2006
PMID: 16805103
DWARF--a data warehouse system for analyzing protein families.
Fischer M, Thai QK, Grieb M, Pleiss J., BMC Bioinformatics 7(), 2006
PMID: 17094801
SEQOPTICS: a protein sequence clustering system.
Chen Y, Reilly KD, Sprague AP, Guan Z., BMC Bioinformatics 7 Suppl 4(), 2006
PMID: 17217502
eBLOCKs: enumerating conserved protein blocks to achieve maximal sensitivity and specificity.
Su QJ, Lu L, Saxonov S, Brutlag DL., Nucleic Acids Res 33(database issue), 2005
PMID: 15608172
On the quality of tree-based protein classification.
Lazareva-Ulitsky B, Diemer K, Thomas PD., Bioinformatics 21(9), 2005
PMID: 15647305
Identification and distribution of protein families in 120 completed genomes using Gene3D.
Lee D, Grant A, Marsden RL, Orengo C., Proteins 59(3), 2005
PMID: 15768405
Exploration of phylogenetic data using a global sequence analysis method.
Chapus C, Dufraigne C, Edwards S, Giron A, Fertil B, Deschavanne P., BMC Evol Biol 5(), 2005
PMID: 16280081
A hybrid clustering approach to recognition of protein families in 114 microbial genomes.
Harlow TJ, Gogarten JP, Ragan MA., BMC Bioinformatics 5(), 2004
PMID: 15115543
Recent developments in computational approaches for uncovering genomic homology.
Simillion C, Vandepoele K, Van de Peer Y., Bioessays 26(11), 2004
PMID: 15499578
Sequence-related human proteins cluster by degree of evolutionary conservation.
Mrowka R, Patzak A, Herzel H, Holste D., Phys Rev E Stat Nonlin Soft Matter Phys 70(5 pt 1), 2004
PMID: 15600657
A functional hierarchical organization of the protein sequence space.
Kaplan N, Friedlich M, Fromer M, Linial M., BMC Bioinformatics 5(), 2004
PMID: 15596019
ProtoNet: hierarchical classification of the protein space.
Sasson O, Vaaknin A, Fleischer H, Portugaly E, Bilu Y, Linial N, Linial M., Nucleic Acids Res 31(1), 2003
PMID: 12520020
Introduction to inferring evolutionary relationships.
Page RD., Curr Protoc Bioinformatics Chapter 6(), 2003
PMID: 18428703
Target selection and determination of function in structural genomics.
Watson JD, Todd AE, Bray J, Laskowski RA, Edwards A, Joachimiak A, Orengo CA, Thornton JM., IUBMB Life 55(4-5), 2003
PMID: 12880206
A structural perspective on genome evolution.
Lee D, Grant A, Buchan D, Orengo C., Curr Opin Struct Biol 13(3), 2003
PMID: 12831888
Diversity, taxonomy and evolution of medium-chain dehydrogenase/reductase superfamily.
Riveros-Rosas H, Julián-Sánchez A, Villalobos-Molina R, Pardo JP, Piña E., Eur J Biochem 270(16), 2003
PMID: 12899689
PANDORA: keyword-based analysis of protein sets by integration of annotation sources.
Kaplan N, Vaaknin A, Linial M., Nucleic Acids Res 31(19), 2003
PMID: 14500825
An integrated gene annotation and transcriptional profiling approach towards the full gene content of the Drosophila genome.
Hild M, Beckmann B, Haas SA, Koch B, Solovyev V, Busold C, Fellenberg K, Boutros M, Vingron M, Sauer F, Hoheisel JD, Paro R., Genome Biol 5(1), 2003
PMID: 14709175
Tools and resources for identifying protein families, domains and motifs.
Mulder NJ, Apweiler R., Genome Biol 3(1), 2002
PMID: 11806833
PASS2: a semi-automated database of protein alignments organised as structural superfamilies.
Mallika V, Bhaduri A, Sowdhamini R., Nucleic Acids Res 30(1), 2002
PMID: 11752316
Comparison of sequence and structure alignments for protein domains.
Marchler-Bauer A, Panchenko AR, Ariel N, Bryant SH., Proteins 48(3), 2002
PMID: 12112669
Clustering and analysis of protein families.
Kriventseva EV, Biswas M, Apweiler R., Curr Opin Struct Biol 11(3), 2001
PMID: 11406384
Secator: a program for inferring protein subfamilies from phylogenetic trees.
Wicker N, Perrin GR, Thierry JC, Poch O., Mol Biol Evol 18(8), 2001
PMID: 11470834
GeneNest: automated generation and visualization of gene indices.
Haas SA, Beissbarth T, Rivals E, Krause A, Vingron M., Trends Genet 16(11), 2000
PMID: 12199289

24 References

Data provided by Europe PubMed Central.

Basic local alignment search tool.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ., J. Mol. Biol. 215(3), 1990
PMID: 2231712
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ., Nucleic Acids Res. 25(17), 1997
PMID: 9254694
Improved tools for biological sequence comparison.
Pearson WR, Lipman DJ., Proc. Natl. Acad. Sci. U.S.A. 85(8), 1988
PMID: 3162770
A set-theoretic approach to database searching and clustering.
Krause A, Vingron M., Bioinformatics 14(5), 1998
PMID: 9682056
WWW access to the SYSTERS protein sequence cluster set.
Krause A, Nicodeme P, Bornberg-Bauer E, Rehmsmeier M, Vingron M., Bioinformatics 15(3), 1999
PMID: 10222416

Local alignment statistics.
Altschul SF, Gish W., Meth. Enzymol. 266(), 1996
PMID: 8743700




The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999.
Bairoch A, Apweiler R., Nucleic Acids Res. 27(1), 1999
PMID: 9847139
The PIR-International Protein Sequence Database.
Barker WC, Garavelli JS, McGarvey PB, Marzec CR, Orcutt BC, Srinivasarao GY, Yeh LS, Ledley RS, Mewes HW, Pfeiffer F, Tsugita A, Wu C., Nucleic Acids Res. 27(1), 1999
PMID: 9847137
The Protein Data Bank: a computer-based archival file for macromolecular structures.
Bernstein FC, Koetzle TF, Williams GJ, Meyer EF Jr, Brice MD, Rodgers JR, Kennard O, Shimanouchi T, Tasumi M., J. Mol. Biol. 112(3), 1977
PMID: 875032
The ENZYME data bank in 1999.
Bairoch A., Nucleic Acids Res. 27(1), 1999
PMID: 9847212
The PROSITE database, its status in 1999.
Hofmann K, Bucher P, Falquet L, Bairoch A., Nucleic Acids Res. 27(1), 1999
PMID: 9847184
The EMBL Nucleotide Sequence Database.
Stoesser G, Tuli MA, Lopez R, Sterk P., Nucleic Acids Res. 27(1), 1999
PMID: 9847133
SSMAL: similarity searching with alignment graphs.
Nicodeme P., Bioinformatics 14(6), 1998
PMID: 9694989
Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins.
Bateman A, Birney E, Durbin R, Eddy SR, Finn RD, Sonnhammer EL., Nucleic Acids Res. 27(1), 1999
PMID: 9847196
GenBank.
Benson DA, Boguski MS, Lipman DJ, Ostell J, Ouellette BF, Rapp BA, Wheeler DL., Nucleic Acids Res. 27(1), 1999
PMID: 9847132
EUCLID: automatic classification of proteins in functional classes by their database annotations.
Tamames J, Ouzounis C, Casari G, Sander C, Valencia A., Bioinformatics 14(6), 1998
PMID: 9694995
MView: a web-compatible database search or multiple alignment viewer.
Brown NP, Leroy C, Sander C., Bioinformatics 14(4), 1998
PMID: 9632837


PMID: 10592244