Article information
2000, Volume 5, Special issue, p. 5-15
Orlov Y.L., Potapov V.N.
Estimation of stochastic complexity of genetical texts
Analysis of nucleotide sequence structure is of great importance for the recognition and annotation of functional genomic sequences. A method was developed for constructing a source-tree model (a variant of the hidden Markov model) of DNA and protein symbol sequences. The estimated stochastic complexity of the data within a model serves as the criterion for choosing the model. A software implementation of this algorithm reveals statistical properties of sequences that are informative for functional site recognition. Data extracted from the Samples database (http://wwwmgs.bionet.nsc.ru/cgi-bin/mgs/nsamples/) and the Protein Data Bank (PDB) were analyzed with the proposed algorithm. The program is available via the Internet (http://wwwmgs.bionet.nsc.ru/mgs/programs/hmm/).
Author(s): Orlov Yu.L. Office: Institute of Cytology and Genetics SB RAS. Address: Russia, Novosibirsk. E-mail: orlov@bionet.nsc.ru
Potapov V.N. Office: Institute of Mathematics SB RAS. Address: Russia, Novosibirsk. E-mail: potapov@math.nsc.ru
Bibliography link: Orlov Y.L., Potapov V.N. Estimation of stochastic complexity of genetical texts // Computational technologies. 2000. V. 5. The special issue is devoted to the 10th anniversary of the Laboratory of Theoretical Genetics of the Institute of Cytology and Genetics SB RAS. P. 5-15