Subsequence combinatorics and applications to microarray production, DNA sequencing and chaining algorithms

Rahmann, Sven; Lewenstein, Moshe (ed.); Valiente, Gabriel (ed.)

We investigate combinatorial enumeration problems related to subsequences of strings; in contrast to substrings, subsequences need not be contiguous. For a finite alphabet Sigma, the following three problems are solved.

(1) Number of distinct subsequences: Given a sequence s in Sigma^n and a nonnegative integer k <= n, how many distinct subsequences of length k does s contain? A previous result by Chase states that this number is maximized by choosing s as a repeated permutation of the alphabet. This has applications in DNA microarray production.

(2) Number of rho-restricted rho-generated sequences: Given s in Sigma^n and integers k >= 1 and rho >= 1, how many distinct sequences in Sigma^k contain no single nucleotide repeat longer than rho and can be written as s_1^{r_1} ... s_n^{r_n} with 0 <= r_i <= rho for all i? For rho = infinity, the question becomes how many length-k sequences match the regular expression s_1* s_2* ... s_n*. These considerations allow a detailed analysis of a new DNA sequencing technology ("454 sequencing").

(3) Exact length distribution of the longest increasing subsequence: Given Sigma = {1, ..., K} and an integer n >= 1, determine the number of sequences in Sigma^n whose longest strictly increasing subsequence has length k, where 0 <= k <= K. This has applications to significance computations for chaining algorithms.

Springer
2006
info:eu-repo/semantics/conferenceObject
doc-type:conferenceObject
text
https://pub.uni-bielefeld.de/record/1598249
Rahmann S. Subsequence combinatorics and applications to microarray production, DNA sequencing and chaining algorithms. In: Lewenstein M, Valiente G, eds. Combinatorial Pattern Matching. 17th Annual Symposium, CPM 2006, Barcelona, Spain, July 5-7, 2006. Proceedings. Lecture Notes in Computer Science. Vol 4009. Berlin: Springer; 2006: 153-164.
eng
info:eu-repo/semantics/altIdentifier/doi/10.1007/11780441_15
info:eu-repo/semantics/altIdentifier/isbn/978-3-540-35455-0
info:eu-repo/semantics/altIdentifier/wos/000239421700015
info:eu-repo/semantics/closedAccess
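
Problem (1) of the abstract, counting the distinct length-k subsequences of a given string, can be illustrated with a small dynamic program. The sketch below is not taken from the paper; it uses a standard textbook recurrence (add the new character, then subtract what was already counted at its previous occurrence). The function names `count_distinct_subsequences` and `brute_force_count` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def count_distinct_subsequences(s, k):
    """Count distinct subsequences of length k in s (illustrative sketch).

    dp[i][j] is the number of distinct subsequences of length j in the
    prefix s[:i]. Extending by s[i-1] adds dp[i-1][j-1] candidates, but
    those already produced at the previous occurrence of the same
    character (dp[p-1][j-1]) would be counted twice and are subtracted.
    """
    n = len(s)
    if k < 0 or k > n:
        return 0
    dp = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = 1  # the empty subsequence
    last = {}  # character -> 1-based position of its latest occurrence
    for i in range(1, n + 1):
        c = s[i - 1]
        p = last.get(c)
        for j in range(1, k + 1):
            dp[i][j] = dp[i - 1][j] + dp[i - 1][j - 1]
            if p is not None:
                dp[i][j] -= dp[p - 1][j - 1]
        last[c] = i
    return dp[n][k]


def brute_force_count(s, k):
    """Reference count by explicit enumeration; only usable for short s."""
    return len(set(combinations(s, k)))


if __name__ == "__main__":
    # Compare a repeated permutation of the DNA alphabet with a string
    # of the same length and composition that groups equal letters.
    for s in ("ACGTACGT", "AACCGGTT"):
        print(s, [count_distinct_subsequences(s, k) for k in range(len(s) + 1)])
```

Running the script prints the counts for every k for both strings, which gives a quick way to compare a repeated permutation of the alphabet against another arrangement, in the spirit of the result by Chase cited in the abstract. The brute-force helper is included only as a cross-check on short inputs.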