Fast index based algorithms and software for matching position specific scoring matrices

Beckstette, Michael; Homann, Robert; Giegerich, Robert; Kurtz, Stefan

Fast index based algorithms and software for matching position specific scoring matrices

Beckstette M, Homann R, Giegerich R, Kurtz S (2006)
BMC Bioinformatics 7(1): 389.

Zeitschriftenaufsatz | Veröffentlicht | Englisch

Download

1471_2105_7_389.pdf

DOI

https://doi.org/10.1186/1471-2105-7-389

URN

urn:nbn:de:0070-pub-17738644

Autor*in

Beckstette, Michael^UniBi; Homann, Robert^UniBi; Giegerich, Robert^UniBi; Kurtz, Stefan

Einrichtung

Centrum für Biotechnologie > Arbeitsgruppe R. Giegerich
Technische Fakultät > AG Praktische Informatik
Technische Fakultät > BIBI
Technische Fakultät > AG Genominformatik
Centrum für Biotechnologie > Institut für Bioinformatik

Abstract / Bemerkung

Background: In biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs in nucleotide as well as amino acid sequences. Searching with PSSMs in complete genomes or large sequence databases is a common, but computationally expensive task. Results: We present a new non-heuristic algorithm, called ESAsearch, to efficiently find matches of PSSMs in large databases. Our approach preprocesses the search space, e.g., a complete genome or a set of protein sequences, and builds an enhanced suffix array that is stored on file. This allows the searching of a database with a PSSM in sublinear expected time. Since ESAsearch benefits from small alphabets, we present a variant operating on sequences recoded according to a reduced alphabet. We also address the problem of non-comparable PSSM-scores by developing a method which allows the efficient computation of a matrix similarity threshold for a PSSM, given an E-value or a p-value. Our method is based on dynamic programming and, in contrast to other methods, it employs lazy evaluation of the dynamic programming matrix. We evaluated algorithm ESAsearch with nucleotide PSSMs and with amino acid PSSMs. Compared to the best previous methods, ESAsearch shows speedups of a factor between 17 and 275 for nucleotide PSSMs, and speedups up to factor 1.8 for amino acid PSSMs. Comparisons with the most widely used programs even show speedups by a factor of at least 3.8. Alphabet reduction yields an additional speedup factor of 2 on amino acid sequences compared to results achieved with the 20 symbol standard alphabet. The lazy evaluation method is also much faster than previous methods, with speedups of a factor between 3 and 330. Conclusion: Our analysis of ESAsearch reveals sublinear runtime in the expected case, and linear runtime in the worst case for sequences not shorter than |A|^m + m - 1, where m is the length of the PSSM and A a finite alphabet. In practice, ESAsearch shows superior performance over the most widely used programs, especially for DNA sequences. The new algorithm for accurate on-the-fly calculations of thresholds has the potential to replace formerly used approximation approaches. Beyond the algorithmic contributions, we provide a robust, well documented, and easy to use software package, implementing the ideas and algorithms presented in this manuscript.

Erscheinungsjahr

2006

Zeitschriftentitel

BMC Bioinformatics

Band

Ausgabe

Art.-Nr.

389

eISSN

1471-2105

Page URI

https://pub.uni-bielefeld.de/record/1773864

Zitieren

Beckstette M, Homann R, Giegerich R, Kurtz S. Fast index based algorithms and software for matching position specific scoring matrices. BMC Bioinformatics. 2006;7(1): 389.

Beckstette, M., Homann, R., Giegerich, R., & Kurtz, S. (2006). Fast index based algorithms and software for matching position specific scoring matrices. BMC Bioinformatics, 7(1), 389. https://doi.org/10.1186/1471-2105-7-389

Beckstette, Michael, Homann, Robert, Giegerich, Robert, and Kurtz, Stefan. 2006. “Fast index based algorithms and software for matching position specific scoring matrices”. BMC Bioinformatics 7 (1): 389.

Beckstette, M., Homann, R., Giegerich, R., and Kurtz, S. (2006). Fast index based algorithms and software for matching position specific scoring matrices. BMC Bioinformatics 7:389.

Beckstette, M., et al., 2006. Fast index based algorithms and software for matching position specific scoring matrices. BMC Bioinformatics, 7(1): 389.

M. Beckstette, et al., “Fast index based algorithms and software for matching position specific scoring matrices”, BMC Bioinformatics, vol. 7, 2006, : 389.

Beckstette, M., Homann, R., Giegerich, R., Kurtz, S.: Fast index based algorithms and software for matching position specific scoring matrices. BMC Bioinformatics. 7, : 389 (2006).

Beckstette, Michael, Homann, Robert, Giegerich, Robert, and Kurtz, Stefan. “Fast index based algorithms and software for matching position specific scoring matrices”. BMC Bioinformatics 7.1 (2006): 389.

Alle Dateien verfügbar unter der/den folgenden Lizenz(en):

Copyright Statement:

Dieses Objekt ist durch das Urheberrecht und/oder verwandte Schutzrechte geschützt. [...]

Volltext(e)

Name

1471_2105_7_389.pdf

Access Level

Open Access

Zuletzt Hochgeladen

2019-09-06T08:48:09Z

MD5 Prüfsumme

1467f0fda53a2d02ba2198ce31c557e0

Daten bereitgestellt von European Bioinformatics Institute (EBI)

66 Zitationen in Europe PMC

Daten bereitgestellt von Europe PubMed Central.

Upstream analysis of alternative splicing: a review of computational approaches to predict context-dependent splicing factors.
Carazo F, Romero JP, Rubio A., Brief Bioinform (), 2018
PMID: 29390045

Fast sequence analysis based on diamond sampling.
Gao L, Bao W, Zhang H, Yuan CA, Huang DS., PLoS One 13(6), 2018
PMID: 29953448

PSSMSearch: a server for modeling, visualization, proteome-wide discovery and annotation of protein motif specificity determinants.
Krystkowiak I, Manguy J, Davey NE., Nucleic Acids Res 46(w1), 2018
PMID: 29873773

Dempster-Shafer Theory for the Prediction of Auxin-Response Elements (AuxREs) in Plant Genomes.
Sghaier N, Ben Ayed R, Ben Marzoug R, Rebai A., Biomed Res Int 2018(), 2018
PMID: 30515394

Improved genome assembly of American alligator genome reveals conserved architecture of estrogen signaling.
Rice ES, Kohno S, John JS, Pham S, Howard J, Lareau LF, O'Connell BL, Hickey G, Armstrong J, Deran A, Fiddes I, Platt RN, Gresham C, McCarthy F, Kern C, Haan D, Phan T, Schmidt C, Sanford JR, Ray DA, Paten B, Guillette LJ, Green RE., Genome Res 27(5), 2017
PMID: 28137821

Improving Gene Regulatory Network Inference by Incorporating Rates of Transcriptional Changes.
Desai JS, Sartor RC, Lawas LM, Jagadish SVK, Doherty CJ., Sci Rep 7(1), 2017
PMID: 29222512

Function and evolution of local repeats in the Firre locus.
Hacisuleyman E, Shukla CJ, Weiner CL, Rinn JL., Nat Commun 7(), 2016
PMID: 27009974

Parametric bootstrapping for biological sequence motifs.
O'Neill PK, Erill I., BMC Bioinformatics 17(1), 2016
PMID: 27716039

Identification and characterization of VpsR and VpsT binding sites in Vibrio cholerae.
Zamorano-Sánchez D, Fong JC, Kilic S, Erill I, Yildiz FH., J Bacteriol 197(7), 2015
PMID: 25622616

CMRegNet-An interspecies reference database for corynebacterial and mycobacterial regulatory networks.
Abreu VA, Almeida S, Tiwari S, Hassan SS, Mariano D, Silva A, Baumbach J, Azevedo V, Röttger R., BMC Genomics 16(), 2015
PMID: 26062809

Zinc regulates a key transcriptional pathway for epileptogenesis via metal-regulatory transcription factor 1.
van Loo KM, Schaub C, Pitsch J, Kulbida R, Opitz T, Ekstein D, Dalal A, Urbach H, Beck H, Yaari Y, Schoch S, Becker AJ., Nat Commun 6(), 2015
PMID: 26498180

Adaptable probabilistic mapping of short reads using position specific scoring matrices.
Kerpedjiev P, Frellsen J, Lindgreen S, Krogh A., BMC Bioinformatics 15(), 2014
PMID: 24717095

Genes and signaling networks regulated during zebrafish optic vesicle morphogenesis.
Yin J, Morrissey ME, Shine L, Kennedy C, Higgins DG, Kennedy BN., BMC Genomics 15(), 2014
PMID: 25266257

Transposable elements modulate human RNA abundance and splicing via specific RNA-protein interactions.
Kelley DR, Hendrickson DG, Tenen D, Rinn JL., Genome Biol 15(12), 2014
PMID: 25572935

High-resolution detection of DNA binding sites of the global transcriptional regulator GlxR in Corynebacterium glutamicum.
Jungwirth B, Sala C, Kohl TA, Uplekar S, Baumbach J, Cole ST, Pühler A, Tauch A., Microbiology 159(pt 1), 2013
PMID: 23103979

Global mapping of transcription start sites and promoter motifs in the symbiotic α-proteobacterium Sinorhizobium meliloti 1021.
Schlüter JP, Reinkensmeier J, Barnett MJ, Lang C, Krol E, Giegerich R, Long SR, Becker A., BMC Genomics 14(), 2013
PMID: 23497287

Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns.
Meyer F, Kurtz S, Beckstette M., BMC Bioinformatics 14(), 2013
PMID: 23865810

Combining docking site and phosphosite predictions to find new substrates: identification of smoothelin-like-2 (SMTNL2) as a c-Jun N-terminal kinase (JNK) substrate.
Gordon EA, Whisenant TC, Zeller M, Kaake RM, Gordon WM, Krotee P, Patel V, Huang L, Baldi P, Bardwell L., Cell Signal 25(12), 2013
PMID: 23981301

Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers.
Varshney RK, Chen W, Li Y, Bharti AK, Saxena RK, Schlueter JA, Donoghue MT, Azam S, Fan G, Whaley AM, Farmer AD, Sheridan J, Iwata A, Tuteja R, Penmetsa RV, Wu W, Upadhyaya HD, Yang SP, Shah T, Saxena KB, Michael T, McCombie WR, Yang B, Zhang G, Yang H, Wang J, Wang J, Spillane C, Cook DR, May GD, Xu X, Jackson SA., Nat Biotechnol 30(1), 2012
PMID: 22057054

CoryneRegNet 6.0--Updated database content, new analysis methods and novel features focusing on community demands.
Pauling J, Röttger R, Tauch A, Azevedo V, Baumbach J., Nucleic Acids Res 40(database issue), 2012
PMID: 22080556

On the trail of EHEC/EAEC--unraveling the gene regulatory networks of human pathogenic Escherichia coli bacteria.
Pauling J, Röttger R, Neuner A, Salgado H, Collado-Vides J, Kalaghatgi P, Azevedo V, Tauch A, Pühler A, Baumbach J., Integr Biol (Camb) 4(7), 2012
PMID: 22318347

Designing filters for fast-known NcRNA identification.
Sun Y, Buhler J, Yuan C., IEEE/ACM Trans Comput Biol Bioinform 9(3), 2012
PMID: 22084145

Regulatory Snapshots: integrative mining of regulatory modules from expression time series and regulatory networks.
Gonçalves JP, Aires RS, Francisco AP, Madeira SC., PLoS One 7(5), 2012
PMID: 22563474

Phenylacetic acid catabolism and its transcriptional regulation in Corynebacterium glutamicum.
Chen X, Kohl TA, Rückert C, Rodionov DA, Li LH, Ding JY, Kalinowski J, Liu SJ., Appl Environ Microbiol 78(16), 2012
PMID: 22685150

Large-scale development of cost-effective single-nucleotide polymorphism marker assays for genetic mapping in pigeonpea and comparative mapping in legumes.
Saxena RK, Penmetsa RV, Upadhyaya HD, Kumar A, Carrasquilla-Garcia N, Schlueter JA, Farmer A, Whaley AM, Sarma BK, May GD, Cook DR, Varshney RK., DNA Res 19(6), 2012
PMID: 23103470

Finding significant matches of position weight matrices in linear time.
Pizzi C, Rastas P, Ukkonen E., IEEE/ACM Trans Comput Biol Bioinform 8(1), 2011
PMID: 21071798

Prediction and validation of promoters involved in the abscisic acid response in Physcomitrella patens.
Timmerhaus G, Hanke ST, Buchta K, Rensing SA., Mol Plant 4(4), 2011
PMID: 21398384

Defining the transcriptome assembly and its use for genome dynamics and transcriptome profiling studies in pigeonpea (Cajanus cajan L.).
Dubey A, Farmer A, Schlueter J, Cannon SB, Abernathy B, Tuteja R, Woodward J, Shah T, Mulasmanovic B, Kudapa H, Raju NL, Gothalwal R, Pande S, Xiao Y, Town CD, Singh NK, May GD, Jackson S, Varshney RK., DNA Res 18(3), 2011
PMID: 21565938

Structator: fast index-based search for RNA sequence-structure patterns.
Meyer F, Kurtz S, Backofen R, Will S, Beckstette M., BMC Bioinformatics 12(), 2011
PMID: 21619640

STEME: efficient EM to find motifs in large data sets.
Reid JE, Wernisch L., Nucleic Acids Res 39(18), 2011
PMID: 21785132

PATRIC: the comprehensive bacterial bioinformatics resource with a focus on human pathogenic species.
Gillespie JJ, Wattam AR, Cammer SA, Gabbard JL, Shukla MP, Dalay O, Driscoll T, Hix D, Mane SP, Mao C, Nordberg EK, Scott M, Schulman JR, Snyder EE, Sullivan DE, Wang C, Warren A, Williams KP, Xue T, Yoo HS, Zhang C, Zhang Y, Will R, Kenyon RW, Sobral BW., Infect Immun 79(11), 2011
PMID: 21896772

TFRank: network-based prioritization of regulatory associations underlying transcriptional responses.
Gonçalves JP, Francisco AP, Mira NP, Teixeira MC, Sá-Correia I, Oliveira AL, Madeira SC., Bioinformatics 27(22), 2011
PMID: 21965816

Genome sequence of the palaeopolyploid soybean.
Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA., Nature 463(7278), 2010
PMID: 20075913

A genome-wide survey of sRNAs in the symbiotic nitrogen-fixing alpha-proteobacterium Sinorhizobium meliloti.
Schlüter JP, Reinkensmeier J, Daschkey S, Evguenieva-Hackenberg E, Janssen S, Jänicke S, Becker JD, Giegerich R, Becker A., BMC Genomics 11(), 2010
PMID: 20398411

ChIP-seq and functional analysis of the SOX2 gene in colorectal cancers.
Fang X, Yu W, Li L, Shao J, Zhao N, Chen Q, Ye Z, Lin SC, Zheng S, Lin B., OMICS 14(4), 2010
PMID: 20726797

On the power and limits of evolutionary conservation--unraveling bacterial gene regulatory networks.
Baumbach J., Nucleic Acids Res 38(22), 2010
PMID: 20699275

Regulation of the 18 kDa heat shock protein in Mycobacterium ulcerans: an alpha-crystallin orthologue that promotes biofilm formation.
Pidot SJ, Porter JL, Tobias NJ, Anderson J, Catmull D, Seemann T, Kidd S, Davies JK, Reynolds E, Dashper S, Stinear TP., Mol Microbiol 78(5), 2010
PMID: 21091506

Evolution of gene regulation of pluripotency--the case for wiki tracks at genome browsers.
Fuellen G, Struckmann S., Biol Direct 5(), 2010
PMID: 21190561

The complete genome sequence of Corynebacterium pseudotuberculosis FRC41 isolated from a 12-year-old girl with necrotizing lymphadenitis reveals insights into gene-regulatory networks contributing to virulence.
Trost E, Ott L, Schneider J, Schröder J, Jaenicke S, Goesmann A, Husemann P, Stoye J, Dorella FA, Rocha FS, Soares Sde C, D'Afonseca V, Miyoshi A, Ruiz J, Silva A, Azevedo V, Burkovski A, Guiso N, Join-Lambert OF, Kayal S, Tauch A., BMC Genomics 11(), 2010
PMID: 21192786

Towards the integrated analysis, visualization and reconstruction of microbial gene regulatory networks.
Baumbach J, Tauch A, Rahmann S., Brief Bioinform 10(1), 2009
PMID: 19074493

Integrated analysis and reconstruction of microbial transcriptional gene regulatory networks using CoryneRegNet.
Baumbach J, Wittkop T, Kleindt CK, Tauch A., Nat Protoc 4(6), 2009
PMID: 19498379

Reliable transfer of transcriptional gene regulatory networks between taxonomically related organisms.
Baumbach J, Rahmann S, Tauch A., BMC Syst Biol 3(), 2009
PMID: 19146695

Genetic makeup of the Corynebacterium glutamicum LexA regulon deduced from comparative transcriptomics and in vitro DNA band shift assays.
Jochmann N, Kurze AK, Czaja LF, Brinkrolf K, Brune I, Hüser AT, Hansmeier N, Pühler A, Borovok I, Tauch A., Microbiology 155(pt 5), 2009
PMID: 19372162

MotifAdjuster: a tool for computational reassessment of transcription factor binding site annotations.
Keilwagen J, Baumbach J, Kohl TA, Grosse I., Genome Biol 10(5), 2009
PMID: 19409082

From Corynebacterium glutamicum to Mycobacterium tuberculosis--towards transfers of gene regulatory networks and integrated data analyses with MycoRegNet.
Krawczyk J, Kohl TA, Goesmann A, Kalinowski J, Baumbach J., Nucleic Acids Res 37(14), 2009
PMID: 19494184

D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.
Sen N, Mishra M, Khan F, Meena A, Sharma A., Bioinformation 3(10), 2009
PMID: 19759861

The GlxR regulon of the amino acid producer Corynebacterium glutamicum: Detection of the corynebacterial core regulon and integration into the transcriptional regulatory network model.
Kohl TA, Tauch A., J Biotechnol 143(4), 2009
PMID: 19665500

MOODS: fast search for position weight matrix matches in DNA sequences.
Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E., Bioinformatics 25(23), 2009
PMID: 19773334

Discovery of regulatory elements is improved by a discriminatory approach.
Valen E, Sandelin A, Winther O, Krogh A., PLoS Comput Biol 5(11), 2009
PMID: 19911049

Mycolactone gene expression is controlled by strong SigA-like promoters with utility in studies of Mycobacterium ulcerans and buruli ulcer.
Tobias NJ, Seemann T, Pidot SJ, Porter JL, Marsollier L, Marion E, Letournel F, Zakir T, Azuolas J, Wallace JR, Hong H, Davies JK, Howden BP, Johnson PD, Jenkin GA, Stinear TP., PLoS Negl Trop Dis 3(11), 2009
PMID: 19936295

A probabilistic method for small RNA flowgram matching.
Vacic V, Jin H, Zhu JK, Lonardi S., Pac Symp Biocomput (), 2008
PMID: 18229677

Natural similarity measures between position frequency matrices with an application to clustering.
Pape UJ, Rahmann S, Vingron M., Bioinformatics 24(3), 2008
PMID: 18174183

ReXSpecies--a tool for the analysis of the evolution of gene regulation across species.
Struckmann S, Araúzo-Bravo MJ, Schöler HR, Reinbold RA, Fuellen G., BMC Evol Biol 8(), 2008
PMID: 18410675

The GlxR regulon of the amino acid producer Corynebacterium glutamicum: in silico and in vitro detection of DNA binding sites of a global transcription regulator.
Kohl TA, Baumbach J, Jungwirth B, Pühler A, Tauch A., J Biotechnol 135(4), 2008
PMID: 18573287

Efficient representation and P-value computation for high-order Markov motifs.
da Fonseca PG, Guimarães KS, Sagot MF., Bioinformatics 24(16), 2008
PMID: 18689819

Determining significance of pairwise co-occurrences of events in bursty sequences.
Haiminen N, Mannila H, Terzi E., BMC Bioinformatics 9(), 2008
PMID: 18691400

Computational detection of significant variation in binding affinity across two sets of sequences with application to the analysis of replication origins in yeast.
Keich U, Gao H, Garretson JS, Bhaskar A, Liachko I, Donato J, Tye BK., BMC Bioinformatics 9(), 2008
PMID: 18786274

Performance evaluation of DNA motif discovery programs.
Singh CP, Khan F, Mishra BN, Chauhan DS., Bioinformation 3(5), 2008
PMID: 19255635

CoryneRegNet 3.0--an interactive systems biology platform for the analysis of gene regulatory networks in corynebacteria and Escherichia coli.
Baumbach J, Wittkop T, Rademacher K, Rahmann S, Brinkrolf K, Tauch A., J Biotechnol 129(2), 2007
PMID: 17229482

Computing exact P-values for DNA motifs.
Zhang J, Jiang B, Li M, Tromp J, Zhang X, Zhang MQ., Bioinformatics 23(5), 2007
PMID: 17237046

Statistical significance of cis-regulatory modules.
Schones DE, Smith AD, Zhang MQ., BMC Bioinformatics 8(), 2007
PMID: 17241466

Large scale clustering of protein sequences with FORCE -A layout based heuristic for weighted cluster editing.
Wittkop T, Baumbach J, Lobo FP, Rahmann S., BMC Bioinformatics 8(), 2007
PMID: 17941985

CoryneRegNet 4.0 - A reference database for corynebacterial gene regulatory networks.
Baumbach J., BMC Bioinformatics 8(), 2007
PMID: 17986320

Reduced alphabet motif methodology for GPCR annotation.
Gangal R, Kumar KK., J Biomol Struct Dyn 25(3), 2007
PMID: 17937491

Efficient and accurate P-value computation for Position Weight Matrices.
Touzet H, Varré JS., Algorithms Mol Biol 2(), 2007
PMID: 18072973

Fast index based algorithms and software for matching position specific scoring matrices.
Beckstette M, Homann R, Giegerich R, Kurtz S., BMC Bioinformatics 7(), 2006
PMID: 16930469

36 References

Daten bereitgestellt von Europe PubMed Central.

Profile analysis: detection of distantly related proteins.
Gribskov M, McLachlan AD, Eisenberg D., Proc. Natl. Acad. Sci. U.S.A. 84(13), 1987
PMID: 3474607

Recent improvements to the PROSITE database.
Hulo N, Sigrist CJ, Le Saux V, Langendijk-Genevaux PS, Bordoli L, Gattiker A, De Castro E, Bucher P, Bairoch A., Nucleic Acids Res. 32(Database issue), 2004
PMID: 14681377

PRINTS and its automatic supplement, prePRINTS.
Attwood TK, Bradley P, Flower DR, Gaulton A, Maudling N, Mitchell AL, Moulton G, Nordle A, Paine K, Taylor P, Uddin A, Zygouri C., Nucleic Acids Res. 31(1), 2003
PMID: 12520033

Increased coverage of protein families with the blocks database servers.
Henikoff JG, Greene EA, Pietrokovski S, Henikoff S., Nucleic Acids Res. 28(1), 2000
PMID: 10592233

Minimal-risk scoring matrices for sequence analysis
Wu T, Nevill-Manning C, Brutlag D., 1999

JASPAR: an open-access database for eukaryotic transcription factor binding profiles.
Sandelin A, Alkema W, Engstrom P, Wasserman WW, Lenhard B., Nucleic Acids Res. 32(Database issue), 2004
PMID: 14681366

TRANSFAC: transcriptional regulation, from patterns to profiles.
Matys V, Fricke E, Geffers R, Gossling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV, Kloos DU, Land S, Lewicki-Potapov B, Michael H, Munch R, Reuter I, Rotert S, Saxel H, Scheer M, Thiele S, Wingender E., Nucleic Acids Res. 31(1), 2003
PMID: 12520026

FingerPRINTScan: intelligent searching of the PRINTS motif database.
Scordis P, Flower DR, Attwood TK., Bioinformatics 15(10), 1999
PMID: 10705433

MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data.
Quandt K, Frech K, Karas H, Wingender E, Werner T., Nucleic Acids Res. 23(23), 1995
PMID: 8532532

The efficient computation of position-specific match scores with the fast fourier transform.
Rajasekaran S, Jin X, Spouge JL., J. Comput. Biol. 9(1), 2002
PMID: 11911793

Using sequence compression to speedup probabilistic profile matching.
Freschi V, Bogliolo A., Bioinformatics 21(10), 2005
PMID: 15713733

Fast probabilistic analysis of sequence function using scoring matrices.
Wu TD, Nevill-Manning CG, Brutlag DL., Bioinformatics 16(3), 2000
PMID: 10869016

Accelerating Protein Classification Using Suffix Trees
Dorohonceanu B, Nevill-Manning C., 2000

Some string matching problems from Bioinformatics which still need better solutions
Gonnet H., 2004

Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks.
Tatusov RL, Altschul SF, Koonin EV., Proc. Natl. Acad. Sci. U.S.A. 91(25), 1994
PMID: 7991589

Using substitution probabilities to improve position-specific scoring matrices
Henikoff J, Henikoff S., 1996

MATCH: A tool for searching transcription factor binding sites in DNA sequences.
Kel AE, Gossling E, Reuter I, Cheremushkin E, Kel-Margoulis OV, Wingender E., Nucleic Acids Res. 31(13), 2003
PMID: 12824369

Replacing Suffix Trees with Enhanced Suffix Arrays
Abouelhoda M, Kurtz S, Ohlebusch E., 2004

Reducing the Space Requirement of Suffix Trees
Kurtz S., 1999

A Comparison of Imperative and Purely Functional Suffix Tree Constructions
Giegerich R, Kurtz S., 1995

Methods for calculating the probabilities of finding patterns in sequences.
Staden R., Comput. Appl. Biosci. 5(2), 1989
PMID: 2720468

Dynamic programming algorithms for two statistical problems in computational biology
Rahmann S., 2003

On the Power of Profiles for Transcription Factor Binding Site Detection
Rahmann S, Müller T, Vingron M., 2003

Fast index based algorithms and software for matching position specific scoring matrices.
Beckstette M, Homann R, Giegerich R, Kurtz S., BMC Bioinformatics 7(), 2006
PMID: 16930469

PoSSuMsearch: Fast and Sensitive Matching of Position Specific Scoring Matrices using Enhanced Suffix Arrays
Beckstette M, Strothmann D, Homann R, Giegerich R, Kurtz S., 2004

Simple Linear Work Suffix Array Construction
Kärkkäinen J, Sanders P., 2003

Linear-time Longest-Common-Prefix Computation in Suffix Arrays and its Applications
Kasai T, Lee G, Arimura H, Arikawa S, Park K., 2001

A Combinatorial Problem
de N., 1946

Reduction of protein sequence complexity by residue grouping.
Li T, Fan K, Wang J, Wang W., Protein Eng. 16(5), 2003
PMID: 12826723

Simplified amino acid alphabets for protein fold recognition and implications for folding.
Murphy LR, Wallqvist A, Levy RM., Protein Eng. 13(3), 2000
PMID: 10775656

Castillo G., 1988

Embrechts P, Klüppelberg C, Mikosch T., 1997

Approximations to profile score distributions
Goldstein L, Waterman M., 1994

The Vmatch large scale sequence analysis software
Kurtz S., 2005

CisML: an XML-based format for sequence motif detection software.
Haverty PM, Weng Z., Bioinformatics 20(11), 2004
PMID: 15001475

A H+-gated urea channel: the link between Helicobacter pylori urease and gastric colonization.
Weeks DL, Eskandari S, Scott DR, Sachs G., Science 287(5452), 2000
PMID: 10642549

Export

Markieren/ Markierung löschen
Markierte Publikationen

Open Data PUB

Web of Science

Dieser Datensatz im Web of Science®

Quellen

PMID: 16930469
PubMed | Europe PMC

Suchen in

Google Scholar

PUB - Publikationen an der Universität Bielefeld