Chinese Physics B ›› 2010, Vol. 19 ›› Issue (1): 010205. doi: 10.1088/1674-1056/19/1/010205

• GENERAL •

Wavelet-based multifractal analysis of DNA sequences by using chaos-game representation

Han Jia-Jing (韩佳静) and Fu Wei-Juan (符维娟)

  1. Surface Physics Laboratory (National Key Laboratory) and Physics Department, Fudan University, Shanghai 200433, China
  • Received: 2009-03-24; Revised: 2009-06-09; Online: 2010-01-15; Published: 2010-01-15
  • Supported by:
    Project supported by the Science and Technology Commission of Shanghai Municipality (Grant No. 05DZ19747) and the National Basic Research Program of China (Grant No. 2006CB504509).

Abstract: Chaos game representation (CGR) has been proposed as a scale-independent representation of DNA sequences that provides information about the statistical distribution of oligonucleotides in a sequence. CGR images of DNA sequences exhibit fractal patterns, but conventional multifractal analysis based on the box-counting method does not handle CGR images well. Here, the wavelet transform modulus maxima (WTMM) method is applied to the multifractal analysis of CGR images. The results show that the scale-invariance range of CGR edge images can be extended to three orders of magnitude, and that complete singularity spectra can be calculated. Spectrum parameters such as the singularity spectrum span are extracted to describe the statistical character of DNA sequences. A comparison of singularity spectrum spans shows that exon sequences, which have the smallest span, have the most uniform fractal structure. The singularity spectrum parameters are also related to oligonucleotide length, sequence composition and species, thereby providing a method for studying the length polymorphism of repeat oligonucleotides.
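
As an illustration of the CGR construction mentioned in the abstract, the following is a minimal sketch in Python, not the authors' implementation. It uses the standard chaos-game rule (start at the centre of the unit square and move halfway towards the corner assigned to each nucleotide). The corner assignment, the function names `cgr_points` and `cgr_counts`, and the choice to skip ambiguous symbols are assumptions made for this example. The WTMM analysis itself is not reproduced here.

```python
# Assumed corner convention; the abstract does not specify one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the list of CGR coordinates (x, y) for a DNA string."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        corner = CORNERS.get(base)
        if corner is None:
            continue  # assumption: skip ambiguous symbols such as N
        # Move halfway from the current point towards the base's corner.
        x = (x + corner[0]) / 2.0
        y = (y + corner[1]) / 2.0
        points.append((x, y))
    return points


def cgr_counts(seq, k):
    """Count CGR points on a 2**k x 2**k grid (k >= 1).

    Once at least k bases have been read, the cell a point falls in is
    fixed by the last k bases, so each cell counts one oligonucleotide
    of length k.
    """
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    # Drop the first k-1 points: they are preceded by fewer than k bases.
    for x, y in cgr_points(seq)[k - 1:]:
        col = min(int(x * n), n - 1)  # clip guards against rounding to 1.0
        row = min(int(y * n), n - 1)
        grid[row][col] += 1
    return grid
```

Calling `cgr_counts(sequence, k)` with a larger `k` gives a finer image of the same point set, which is the sense in which the representation is scale independent: the grid at resolution `k` summarises the frequencies of all oligonucleotides of length `k`.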

Key words: chaos game representation (CGR), multifractal, wavelet transform modulus maxima (WTMM), singularity spectrum

Classification codes (PACS):

  • 87.14.G- (Nucleic acids)
  • 05.45.Df (Fractals)
  • 87.15.Cc (Folding: thermodynamics, statistical mechanics, models, and pathways)