Chin. Phys. B, 2008, Vol. 17(12): 4396-4400    DOI: 10.1088/1674-1056/17/12/011
GENERAL

A combined statistical model for multiple motifs search

Gao Li-Feng (高丽锋)a, Liu Xin (刘鑫)b, Guan Shan (官山)c
a Chinese Academy of Agriculture Science, Beijing 100081, China; b Institute of Theoretical Physics, Beijing 100080, China; c Physics Science and Technology Department, Yangzhou University, Yangzhou 225009, China
Abstract  Transcription factor binding sites (TFBS) play key roles in gene expression and regulation. They are short sequence segments with a definite structure that are correctly recognized by the corresponding transcription factors. From a statistical viewpoint, TFBS candidates should differ markedly from segments assembled from randomly combined nucleotides. This paper proposes a combined statistical model for finding over-represented short sequence segments in different kinds of data sets. When an over-represented short sequence segment is described by a position weight matrix, the nucleotide distribution at most sites of the segment should be far from the background nucleotide distribution. The central idea of this approach is to search for such signals. The algorithm is tested on three data sets: the binding sites of the cyclic AMP receptor protein in E. coli; PlantProm DB, a non-redundant collection of proximal promoter sequences from different species; and the intergenic sequences of the whole E. coli genome. Although the complexity of these three data sets differs considerably, the results show that the model is rather general and sensible.
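The abstract does not give the model's formulas, so the following is only a minimal Python sketch of the general idea it describes: build a position frequency matrix from aligned sites and flag the positions whose nucleotide distribution lies far from the background. The use of relative entropy (Kullback-Leibler divergence) as the distance, the pseudocount, the threshold value, the function names, and the toy sequences are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def position_frequencies(sites, pseudocount=1.0):
    """Column-wise nucleotide frequencies of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    matrix = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        # The pseudocount keeps every frequency strictly positive.
        matrix.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return matrix

def relative_entropy(column, background):
    """Kullback-Leibler divergence, in bits, of one column from the background."""
    return sum(p * math.log2(p / background[b])
               for b, p in column.items() if p > 0)

def informative_positions(sites, background, threshold=0.3):
    """Indices of columns whose divergence from the background exceeds threshold."""
    matrix = position_frequencies(sites)
    return [i for i, col in enumerate(matrix)
            if relative_entropy(col, background) > threshold]

# Hypothetical inputs: a uniform background and four made-up aligned sites.
uniform = {b: 0.25 for b in ALPHABET}
toy_sites = ["TGTGA", "TGTGA", "TGTGC", "TGAGA"]

if __name__ == "__main__":
    print(informative_positions(toy_sites, uniform))
```

In this toy alignment the fully conserved columns score higher than the columns containing a mismatch, which is the kind of signal the abstract refers to. A real search would also need a background estimated from the data set and a way to enumerate candidate segments, neither of which is sketched here.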
Keywords: transcription factor binding sites, motif, position weight matrix
Received: 03 January 2008      Revised: 13 February 2008
PACS:  87.16.Yc (Regulatory genetic and chemical networks)  
  87.14.E- (Proteins)  
  87.15.A- (Theory, modeling, and computer simulation)  
  87.15.B- (Structure of biomolecules)  
  87.15.Cc (Folding: thermodynamics, statistical mechanics, models, and pathways)  
Fund: Project supported by the National Science Foundation of China (Grant No 70671089), and the Key Important Project (No 10635040).

Cite this article: 

Gao Li-Feng (高丽锋), Liu Xin (刘鑫), Guan Shan (官山) A combined statistical model for multiple motifs search 2008 Chin. Phys. B 17 4396
