Chinese Physics B ›› 2009, Vol. 18 ›› Issue (10): 4571-4579. doi: 10.1088/1674-1056/18/10/079

Chaos game representation walk model for the protein sequences

Gao Jie(高洁)a)b)†, Jiang Li-Li(蒋丽丽)a), and Xu Zhen-Yuan(徐振源)a)   

  1. a) School of Science, Jiangnan University, Wuxi 214122, China; b) School of Information Technology, Jiangnan University, Wuxi 214122, China
  • Received: 2008-12-09; Revised: 2009-04-09; Online: 2009-10-20; Published: 2009-10-20
  • Supported by:
    Project supported by the National Natural Science Foundation of China (Grant No 60575038), the Natural Science Foundation of Jiangnan University, China (Grant No 20070365) and the Program for Innovative Research Team of Jiangnan University, China.

Abstract: A new chaos game representation (CGR) of protein sequences based on the detailed hydrophobic-hydrophilic (HP) model has been proposed by Yu et al. (Physica A 337 (2004) 171). In the present paper, a CGR-walk model is proposed based on these new CGR coordinates for the protein sequences from complete genomes. The new CGR coordinates are converted into a time series, and a long-memory ARFIMA(p, d, q) model is introduced into the protein sequence analysis. This model is applied to simulating real CGR-walk sequence data of twelve protein sequences. Remarkably long-range correlations are uncovered in the data, and the results obtained from these models are reasonably consistent with those available from the ARFIMA(p, d, q) model.
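
The pipeline summarized in the abstract (residues mapped to CGR coordinates, coordinates turned into a time series, then a long-memory model with fractional differencing parameter d) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the four-class grouping of amino acids, the corner given to each class, the starting point, and the use of the x coordinates as the time series are all assumptions, since the abstract does not specify them. The function names are hypothetical. Only the fractional differencing weights of (1 - B)^d follow the standard recursion π_0 = 1, π_k = π_{k-1}(k - 1 - d)/k.

```python
from typing import Dict, List, Sequence, Tuple

# ASSUMPTION: a four-class grouping of the 20 amino acids and an arbitrary
# corner of the unit square per class; the abstract does not list either.
CLASS_CORNERS: Dict[str, Tuple[float, float]] = {
    "AILMFPWV": (0.0, 0.0),  # non-polar
    "DE": (0.0, 1.0),        # negative polar
    "NCQGSTY": (1.0, 1.0),   # uncharged polar
    "RHK": (1.0, 0.0),       # positive polar
}
CORNER_OF = {aa: corner for letters, corner in CLASS_CORNERS.items() for aa in letters}


def cgr_points(sequence: str,
               start: Tuple[float, float] = (0.5, 0.5)) -> List[Tuple[float, float]]:
    """Chaos game: each new point is the midpoint between the previous
    point and the corner assigned to the current residue's class."""
    x, y = start
    points = []
    for aa in sequence:
        cx, cy = CORNER_OF[aa]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def frac_diff_weights(d: float, n: int) -> List[float]:
    """First n coefficients of the expansion of (1 - B)^d (n >= 1)."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series: Sequence[float], d: float) -> List[float]:
    """Apply (1 - B)^d to a series, truncating the filter at the series start."""
    w = frac_diff_weights(d, max(len(series), 1))
    return [sum(w[k] * series[t - k] for k in range(t + 1))
            for t in range(len(series))]


if __name__ == "__main__":
    pts = cgr_points("MKDELAGR")   # toy sequence, not real data
    xs = [x for x, _ in pts]       # ASSUMPTION: x coordinates as the time series
    print(frac_diff(xs, 0.3))      # d = 0.3 is an arbitrary illustrative value
```

In an ARFIMA(p, d, q) fit, d would be estimated from the data rather than fixed by hand, and the fractionally differenced series would then be modeled with ordinary ARMA(p, q) terms.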

Key words: chaos game representation, CGR-walk model, protein sequence, long-memory ARFIMA(p, d, q) model, autocorrelation function

Classification (PACS):

  • 87.14.E- (Proteins)
  • 05.40.Fb (Random walks and Levy flights)
  • 87.15.A- (Theory, modeling, and computer simulation)
  • 87.15.Cc (Folding: thermodynamics, statistical mechanics, models, and pathways)