Article information

2000, Volume 5, Special issue, p. 67-74

Valuev V.P., Kuropatov D.A.

Automatic generation of recognition programs for amino acid sequences

The rapidly increasing number of sequenced genes and proteins creates an urgent need for inexpensive and fast tools for data annotation. A large number of elaborate, specialized tools and databases exist, aimed at precise recognition of certain motifs, domains, or other structural and functional elements. At the same time, they all rely on a relatively limited set of recognition methods: consensus, hidden Markov models, profiles, perceptrons, and neural networks (Gelfand, 1995; Baldi and Brunak, 1998). There is, however, no precise knowledge of which of these methods will best suit a given purpose. We propose applying several simple, independent methods simultaneously instead of working out more sophisticated implementations of a single algorithm. This paper develops a new approach to amino acid sequence classification that allows several pattern recognition methods to be used at once. An Internet-available system generates classification programs, based on different pattern recognition methods, for any protein structural or functional motif. The current release comprises 6 generation modules using the consensus, linear Fisher discriminant, perceptron, and simple profile methods. These modules were tested on major eukaryotic DNA-binding domains, such as the homeobox, zinc finger, nuclear receptor, and leucine zipper. Applied one at a time, the modules gave from 0 to 10 percent false positives and from 0 to 8 percent false negatives. Applying all of them simultaneously lowered both the false positive and the false negative rates to 1 percent.
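As an illustration of the idea of applying several simple recognizers at once, the following is a minimal sketch, not the authors' system. It builds a consensus recognizer and a simple profile recognizer from aligned training sequences and combines their decisions by voting. The abstract does not say how the generated modules are combined, so the voting rule, the thresholds, the uniform background, and all names here (`build_consensus`, `build_profile`, `combined_vote`, `TRAINING`) are assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Toy aligned training set (hypothetical data, equal-length sequences).
TRAINING = ["ACDK", "ACEK", "ACDR", "GCDK"]


def build_consensus(training):
    """Return the most frequent residue in each alignment column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*training))


def consensus_vote(seq, consensus, max_mismatches=1):
    """Accept if seq differs from the consensus in at most max_mismatches positions."""
    if len(seq) != len(consensus):
        return False
    return sum(a != b for a, b in zip(seq, consensus)) <= max_mismatches


def build_profile(training, pseudocount=1.0):
    """Per-column log-odds scores against a uniform background (an assumption)."""
    background = 1.0 / len(AMINO_ACIDS)
    denom = len(training) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in zip(*training):
        counts = Counter(col)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / denom) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def profile_vote(seq, profile, threshold=0.0):
    """Accept if the summed log-odds score exceeds the threshold."""
    if len(seq) != len(profile):
        return False
    # Residues outside the 20-letter alphabet contribute a neutral score of 0.
    score = sum(col.get(aa, 0.0) for aa, col in zip(seq, profile))
    return score > threshold


def combined_vote(seq, consensus, profile, min_votes=2):
    """Accept if at least min_votes of the individual recognizers accept."""
    votes = [consensus_vote(seq, consensus), profile_vote(seq, profile)]
    return sum(votes) >= min_votes


consensus = build_consensus(TRAINING)
profile = build_profile(TRAINING)
```

Calling `combined_vote("ACDK", consensus, profile)` accepts the sequence only when both recognizers agree. Lowering `min_votes` to 1 would trade fewer false negatives for more false positives, which is the kind of balance a multi-method scheme has to tune.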

Author(s):
Valuev V.P.
Office: Institute of Cytology and Genetics SB RAS
Address: Russia, Novosibirsk
E-mail: valuev@bionet.nsc.ru

Kuropatov D.A.
Office: Institute of Cytology and Genetics SB RAS
Address: Russia, Novosibirsk
E-mail: valuev@bionet.nsc.ru


Bibliography link:
Valuev V.P., Kuropatov D.A. Automatic generation of recognition programs for amino acid sequences // Computational technologies. 2000. V. 5. Special issue devoted to the 10th anniversary of the Laboratory of Theoretical Genetics of the Institute of Cytology and Genetics SB RAS. P. 67-74
ISSN 1560-7534
© 2024 FRC ICT