Chinese Physics B ›› 2010, Vol. 19 ›› Issue (6): 068701-068701. doi: 10.1088/1674-1056/19/6/068701

Chaos game representation of functional protein sequences, and simulation and multifractal analysis of induced measures

Yu Zu-Guo (喻祖国)a)b)†, Xiao Qian-Jun (肖前军)a), Shi Long (石龙)a), Yu Jun-Wu (余君武)c), and Vo Anh b)

  a) School of Mathematics and Computational Science, Xiangtan University, Xiangtan 411105, China; b) School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane, Q 4001, Australia; c) Department of Mathematics and Computational Science, Hunan University of Science and Technology, Xiangtan 411201, China
  • Received: 2009-09-30; Online: 2010-06-15; Published: 2010-06-15
  • Supported by:
    Project partially supported by the National Natural Science Foundation of China (Grant No. 30570426), the Chinese Program for New Century Excellent Talents in University (Grant No. NCET-08-06867), Fok Ying Tung Education Foundation (Grant No. 101004), and

Abstract: Investigating the biological function of proteins is a key aspect of protein studies, and bioinformatic methods have become important for this purpose. In this paper, we first give the chaos game representation (CGR) of randomly-linked functional protein sequences, then propose the use of recurrent iterated function systems (RIFS) from fractal theory to simulate the measure based on their CGRs. This method helps to extract some features of functional protein sequences and, furthermore, of the biological functions of these proteins. Multifractal analysis of the measures based on the CGRs of randomly-linked functional protein sequences is then performed. We find that the CGRs have clear fractal patterns. The numerical results show that the RIFS can simulate the measure based on the CGR very well. The relative standard error and the estimated probability matrix in the RIFS do not depend on the order in which the functional protein sequences are linked. The estimated probability matrices in the RIFS for proteins with different biological functions are evidently different. Hence the estimated probability matrices in the RIFS can be used to characterise the difference among linked functional protein sequences with different biological functions. From the values of the $D_q$ curves, one sees that these functional protein sequences are not completely random. The $D_q$ curves of all linked functional proteins studied are multifractal-like and sufficiently smooth for the $C_q$ (analogous to specific heat) curves to be meaningful. Furthermore, the $D_q$ curves of the measure $\mu$ based on their CGRs are almost identical for different linking orders if $q\geq 0$. Finally, the $C_q$ curves of all linked functional proteins resemble a classical phase transition at a critical point.

Key words: chaos game representation, recurrent iterated function systems, functional proteins, multifractal analysis

PACS numbers: 87.14.E- (Proteins); 87.15.Cc (Folding: thermodynamics, statistical mechanics, models, and pathways); 87.15.H- (Dynamics of biomolecules); 87.15.B- (Structure of biomolecules); 02.50.Le (Decision theory and game theory)
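
The quantities named in the abstract (a CGR, the measure $\mu$ it induces, and the $D_q$ and $C_q$ curves) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four-letter alphabet `ABCD` and its corner assignment (the abstract does not say how amino acids are mapped to CGR corners), the function names, the grid depths `ks`, and the box-counting estimates. In particular, the discrete form $C_q \approx 2\tau(q) - \tau(q+1) - \tau(q-1)$ is one common definition of the analogous specific heat, assumed here. The RIFS fitting step is not shown.

```python
import numpy as np

# Hypothetical corner assignment for a 4-letter alphabet (an assumption,
# not the mapping used in the paper).
CORNERS = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (1.0, 1.0), "D": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos game: move halfway from the current point toward each symbol's corner."""
    pts = np.empty((len(seq), 2))
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for i, s in enumerate(seq):
        cx, cy = CORNERS[s]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        pts[i] = (x, y)
    return pts


def box_measure(pts, k):
    """Fraction of points in each non-empty cell of a 2^k x 2^k grid."""
    n = 2 ** k
    idx = np.minimum((pts * n).astype(int), n - 1)
    counts = np.zeros((n, n))
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1)
    return counts[counts > 0] / len(pts)


def tau(pts, q, ks=(2, 3, 4, 5)):
    """Slope of ln Z_eps(q) against ln eps, where Z_eps(q) = sum of mu^q, eps = 2^-k."""
    log_eps = np.array([-k * np.log(2.0) for k in ks])
    log_z = np.array([np.log(np.sum(box_measure(pts, k) ** q)) for k in ks])
    return np.polyfit(log_eps, log_z, 1)[0]


def d_q(pts, q, ks=(2, 3, 4, 5)):
    """Generalized dimension: tau(q)/(q-1), with the entropy form at q = 1."""
    if abs(q - 1.0) < 1e-12:
        log_eps = np.array([-k * np.log(2.0) for k in ks])
        z1 = []
        for k in ks:
            mu = box_measure(pts, k)
            z1.append(np.sum(mu * np.log(mu)))
        return np.polyfit(log_eps, np.array(z1), 1)[0]
    return tau(pts, q, ks) / (q - 1.0)


def c_q(pts, q, ks=(2, 3, 4, 5)):
    """Discrete second difference of tau(q) (assumed definition of C_q)."""
    return 2.0 * tau(pts, q, ks) - tau(pts, q + 1.0, ks) - tau(pts, q - 1.0, ks)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    random_seq = rng.choice(list("ABCD"), size=50000)  # synthetic, not protein data
    points = cgr_points(random_seq)
    for q in (0.0, 1.0, 2.0, 5.0):
        print(q, d_q(points, q), c_q(points, q))
```

For a completely random sequence over the four symbols, the CGR fills the unit square roughly uniformly, so the estimated $D_q$ should stay close to 2 for all $q$ and $C_q$ close to 0. Departures from that flat profile are what would indicate a sequence that is not completely random.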