Statistical geometry in sequence space: a method of quantitative comparative sequence analysis
A statistical method of comparative sequence analysis that combines horizontal and vertical correlations among aligned sequences is introduced. It is based mainly on the analysis of quartet combinations of sequences, considered as geometric configurations in sequence space. Numerical invariants related to relative internal segment lengths are assigned to each such configuration, and statistical averages of these invariants are established. The averages are used for internal calibration of the topology of divergence and for quantitative determination of the noise level. Comparison of computer simulations with experimental data shows that basic topologies can be assigned with high sensitivity even when the sequences are heavily randomized. In addition, these procedures are checked by vertical analysis of the aligned sequences, which allows the study of divergences with positionally varying substitution probabilities.
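The quartet idea can be illustrated with a minimal sketch. It is not the paper's own procedure: it assumes plain Hamming distances and the standard four-point comparison of the three pairwise distance sums, and the names `hamming`, `quartet_box`, `long_side` and `short_side` are hypothetical. For a perfectly tree-like quartet the two larger sums coincide, so the short side vanishes and the long side equals the internal segment length; a nonzero short side indicates noise or a non-tree-like configuration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def quartet_box(seqs):
    """Summarise one quartet of aligned sequences.

    seqs: dict mapping four names to aligned sequences.
    Returns the best-supported pairing and two internal lengths
    derived from the three pairwise distance sums (an assumed,
    distance-based stand-in for the invariants in the abstract).
    """
    names = sorted(seqs)
    if len(names) != 4:
        raise ValueError("exactly four sequences are required")
    a, b, c, d = names

    # All six pairwise Hamming distances.
    dist = {
        frozenset(pair): hamming(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(names, 2)
    }

    # The three ways of splitting four sequences into two pairs.
    sums = {
        f"{a}{b}|{c}{d}": dist[frozenset((a, b))] + dist[frozenset((c, d))],
        f"{a}{c}|{b}{d}": dist[frozenset((a, c))] + dist[frozenset((b, d))],
        f"{a}{d}|{b}{c}": dist[frozenset((a, d))] + dist[frozenset((b, c))],
    }

    # Smallest sum marks the best-supported pairing.
    (split, low), (_, mid), (_, high) = sorted(sums.items(), key=lambda kv: kv[1])
    return {
        "split": split,
        "long_side": (high - low) / 2,   # internal segment of the dominant split
        "short_side": (high - mid) / 2,  # zero for an ideal tree
    }


# Toy quartet: two positions separate {A, B} from {C, D},
# and each sequence carries one private substitution.
example = {
    "A": "AATAAA",
    "B": "AAATAA",
    "C": "TTAATA",
    "D": "TTAAAT",
}
result = quartet_box(example)
```

Averaging such quantities over many quartets drawn from an alignment would give the kind of statistical summary the abstract refers to, although the paper's actual invariants may be defined differently.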
85
16
5913-5917
National Academy of Sciences
1