Chin. Phys. B, 2010, Vol. 19(11): 110502    DOI: 10.1088/1674-1056/19/11/110502
RAPID COMMUNICATION

Prediction of protein binding sites using physical and chemical descriptors and the support vector machine regression method

Sun Zhong-Hua(孙重华)a)b) and Jiang Fan(江凡)a)†
a Beijing National Laboratory for Condensed Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China; b Graduate School of the Chinese Academy of Sciences, Beijing 100049, China
Abstract  In this paper, a new continuous variable called the core-ratio is defined to describe the probability that a residue lies in a binding site, replacing the previous binary (0 or 1) description of interface residues. This allows the support vector machine regression method to be used to fit the core-ratio value and predict protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites; the new descriptors become more effective when an averaging procedure is applied. Our tests show that the support vector regression (SVR) method yields much better prediction results than the support vector classification method.
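As a rough illustration of the two ideas in the abstract (averaging a per-residue descriptor over neighbouring residues, and turning a continuous predicted score back into a binary site label), the following Python sketch may help. It is not the authors' implementation: the abstract does not give the core-ratio formula, the descriptor set, or the neighbour definition, so the function names, the distance cutoff, the 0.5 threshold, and the toy data below are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def neighbour_average(coords, values, cutoff=10.0):
    """Average a per-residue descriptor over all residues within `cutoff`.

    `coords` holds one (x, y, z) point per residue and `values` one
    descriptor value per residue. The residue itself always counts as its
    own neighbour. The cutoff is a hypothetical choice, not from the paper.
    """
    averaged = []
    for centre in coords:
        nearby = [v for xyz, v in zip(coords, values) if dist(centre, xyz) <= cutoff]
        averaged.append(sum(nearby) / len(nearby))
    return averaged


def to_binary(scores, threshold=0.5):
    """Convert continuous scores (e.g. regression output) to 0/1 labels.

    The threshold is an assumed value chosen only for this example.
    """
    return [1 if s >= threshold else 0 for s in scores]


# Toy data: four residues on a line, with a made-up descriptor value each.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
descriptor = [1.0, 0.0, 2.0, 4.0]

smoothed = neighbour_average(coords, descriptor, cutoff=5.0)
labels = to_binary([0.1, 0.5, 0.9])
```

In a regression setting of the kind the abstract describes, the smoothed descriptors would serve as input features and a continuous target such as the core-ratio would be fitted; thresholding is only needed at the end, if binary site labels are wanted for comparison with a classifier.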
Keywords:  protein binding site      support vector machine regression      cross-validation      neighbour residue  
Received:  25 June 2010      Revised:  07 July 2010
PACS:  87.14.E- (Proteins)  
  87.15.A- (Theory, modeling, and computer simulation)  
  87.15.K- (Molecular interactions; membrane-protein interactions)  
  87.15.N- (Properties of solutions of macromolecules)  
Fund: Project supported by the National Natural Science Foundation of China (Grant Nos. 10674172 and 10874229).

Cite this article: 

Sun Zhong-Hua(孙重华) and Jiang Fan(江凡) Prediction of protein binding sites using physical and chemical descriptors and the support vector machine regression method 2010 Chin. Phys. B 19 110502
