Chin. Phys. B, 2020, Vol. 29(10): 108704    DOI: 10.1088/1674-1056/abb303
Special Issue: SPECIAL TOPIC — Modeling and simulations for the structures and functions of proteins and nucleic acids
Topical Review: Modeling and simulations for the structures and functions of proteins and nucleic acids

Computational prediction of RNA tertiary structures using machine learning methods

Bin Huang(黄斌)1,2, Yuanyang Du(杜渊洋)1,2, Shuai Zhang(张帅)1,2, Wenfei Li(李文飞)1,2, Jun Wang(王骏)1,2, and Jian Zhang(张建)1,2,†
1 National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
2 Institute for Brain Sciences, Kuang Yaming Honors School, Nanjing University, Nanjing 210093, China
Abstract  

RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help us understand RNA structures and their stabilizing factors, thereby providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their use in protein-related fields has a long history, the use of ML methods to predict RNA tertiary structures is new and rare. Here, we review recent advances in applying ML methods to RNA structure prediction and discuss the advantages and limitations, as well as the difficulties and potential, of these approaches when applied in the field.

Keywords:  RNA structure prediction      RNA scoring function      knowledge-based potentials      machine learning      convolutional neural networks  
Received:  27 June 2020      Revised:  22 August 2020      Published:  05 October 2020
PACS:  87.15.B- (Structure of biomolecules)  
  87.14.gn (RNA)  
  07.05.Mh (Neural networks, fuzzy logic, artificial intelligence)  
†Corresponding author. E-mail: jzhang@nju.edu.cn
* Project supported by the National Natural Science Foundation of China (Grant Nos. 11774158, 11974173, 11774157, and 11934008).

Cite this article: 

Bin Huang(黄斌), Yuanyang Du(杜渊洋), Shuai Zhang(张帅), Wenfei Li(李文飞), Jun Wang(王骏), and Jian Zhang(张建)†. Computational prediction of RNA tertiary structures using machine learning methods. 2020 Chin. Phys. B 29 108704

Fig. 1.  The architecture of the multilayer perceptron used in Ref. [23]. It contains a single hidden layer. The inputs are structural features, and the output is a score that indicates the quality of a structural candidate.
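
The data flow described in the caption of Fig. 1 (structural features in, one hidden layer, one score out) can be illustrated with a minimal sketch. This is not the model of Ref. [23]: the number of features, the hidden-layer width, the tanh and sigmoid activations, the random untrained weights, and the names `SingleHiddenLayerMLP` and `rank_candidates` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class SingleHiddenLayerMLP:
    """Toy scorer: features -> one hidden layer -> scalar score in (0, 1)."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = random.Random(seed)
        # Random, untrained weights (assumption); a real model learns these.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def score(self, features):
        """Forward pass for one structural candidate."""
        if len(features) != len(self.w1[0]):
            raise ValueError("unexpected number of features")
        # Hidden layer: weighted sum of the inputs followed by tanh.
        hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Output layer: weighted sum of hidden units squashed to (0, 1).
        return sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)


def rank_candidates(model, candidates):
    """Return candidate names sorted from highest to lowest score."""
    return sorted(candidates,
                  key=lambda name: model.score(candidates[name]),
                  reverse=True)


# Hypothetical feature vectors for three structural candidates.
candidates = {
    "decoy_a": [0.2, 1.3, -0.7, 0.5],
    "decoy_b": [0.9, 0.1, 0.4, -1.2],
    "decoy_c": [-0.3, 0.8, 0.0, 0.6],
}
model = SingleHiddenLayerMLP(n_features=4, n_hidden=3)
print(rank_candidates(model, candidates))
```

With untrained weights the ranking is arbitrary; the sketch only shows how a fixed-length feature vector is reduced to a single number that can be used to order candidates.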

Fig. 2.  The architecture of the CNN in Ref. [26]. Not all convolutional layers are shown because of space limitations. Each cube represents a 3D image. The input layer has three channels, similar to the RGB channels of 2D images. The output is a single score indicating how closely the input structure resembles the native structure.
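
To make the caption of Fig. 2 concrete, the following sketch passes a three-channel 3D grid through one convolutional layer and reduces the result to a single score. It is a simplified illustration, not the network of Ref. [26]: the grid size, the single convolutional layer, the ReLU activation, the global average pooling, the linear output, and the function names are assumptions.

```python
import random


def conv3d(volume, kernels, bias):
    """Valid 3D cross-correlation of a multi-channel cubic volume, then ReLU.

    volume:  nested lists indexed [channel][x][y][z]
    kernels: nested lists indexed [out_channel][in_channel][dx][dy][dz]
    bias:    one value per output channel
    """
    n_in = len(volume)
    size = len(volume[0])
    k = len(kernels[0][0])
    out_size = size - k + 1
    out = []
    for oc, kernel in enumerate(kernels):
        grid = [[[0.0] * out_size for _ in range(out_size)]
                for _ in range(out_size)]
        for x in range(out_size):
            for y in range(out_size):
                for z in range(out_size):
                    acc = bias[oc]
                    # Sum over all input channels and kernel offsets.
                    for ic in range(n_in):
                        for dx in range(k):
                            for dy in range(k):
                                for dz in range(k):
                                    acc += (kernel[ic][dx][dy][dz]
                                            * volume[ic][x + dx][y + dy][z + dz])
                    grid[x][y][z] = max(0.0, acc)  # ReLU
        out.append(grid)
    return out


def global_average(grid):
    """Mean of all voxels in one 3D feature map."""
    values = [v for plane in grid for row in plane for v in row]
    return sum(values) / len(values)


def score_volume(volume, kernels, bias, out_weights, out_bias):
    """Convolution -> per-channel average -> linear layer -> one score."""
    features = [global_average(g) for g in conv3d(volume, kernels, bias)]
    return sum(w * f for w, f in zip(out_weights, features)) + out_bias


def random_cube(rng, channels, size):
    """Hypothetical input: random values standing in for real voxel data."""
    return [[[[rng.random() for _ in range(size)] for _ in range(size)]
             for _ in range(size)] for _ in range(channels)]


rng = random.Random(0)
volume = random_cube(rng, channels=3, size=6)               # 3 input channels
kernels = [random_cube(rng, channels=3, size=3) for _ in range(4)]
print(score_volume(volume, kernels, [0.0] * 4, [0.25] * 4, 0.0))
```

Each kernel spans all three input channels at once, which is the 3D analogue of a 2D filter reading the R, G and B planes of an image together.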

3dRNAscore KB RASP Rosetta CNN model
Dataset-I 84/85 80/85 79/85 53/85 62/85
Dataset-II 17/20 20/20 12/20 12/20 19/20
Dataset-III 5/18 1/18 4/18 13/18
Table 1.  The performance of different scoring functions. In each cell, the first number is the number of RNAs that are correctly identified, and the second is the total number of RNAs in the dataset.[26] The bold number indicates the best result for each dataset.
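
As a worked reading of Table 1, the fractions can be converted to success rates and the best scoring function per dataset picked out. The sketch below only restates numbers given in the table; the names `METHODS`, `TABLE`, `success_rates` and `best_method` are illustrative. Dataset-III is left out because its row lists four values for five columns, so the column assignment cannot be recovered from the table as given.

```python
from fractions import Fraction

# Column order as given in the header row of Table 1.
METHODS = ["3dRNAscore", "KB", "RASP", "Rosetta", "CNN model"]

# Cells copied from Table 1 (Dataset-III omitted, see the text above).
TABLE = {
    "Dataset-I": ["84/85", "80/85", "79/85", "53/85", "62/85"],
    "Dataset-II": ["17/20", "20/20", "12/20", "12/20", "19/20"],
}


def success_rates(cells):
    """Parse 'correct/total' cells into exact fractions."""
    return [Fraction(cell) for cell in cells]


def best_method(dataset):
    """Return the method(s) with the highest success rate and that rate."""
    rates = success_rates(TABLE[dataset])
    best = max(rates)
    winners = [m for m, r in zip(METHODS, rates) if r == best]
    return winners, float(best)


for name in TABLE:
    winners, rate = best_method(name)
    print(name, winners, f"{rate:.1%}")
```

Using exact fractions avoids rounding issues when comparing rates across datasets of different sizes.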

[1]
Mercer T R, Dinger M E, Mattick J S 2009 Nat. Rev. Genetics 10 155 DOI: 10.1038/nrg2521
[2]
Geisler S, Coller J 2013 Nat. Rev. Mol. Cell Biol. 14 699 DOI: 10.1038/nrm3679
[3]
Cech T R, Steitz J A 2014 Cell 157 77 DOI: 10.1016/j.cell.2014.03.008
[4]
Morris K V, Mattick J S 2014 Nat. Rev. Genetics 15 423 DOI: 10.1038/nrg3722
[5]
Anastasiadou E, Jacob L S, Slack F J 2018 Nat. Rev. Cancer 18 5 DOI: 10.1038/nrc.2017.99
[6]
Miao Z, Adamiak R W, Antczak M et al. 2017 RNA 23 655 DOI: 10.1261/rna.060368.116
[7]
Chen S J 2008 Annu. Rev. Biophys. 37 197 DOI: 10.1146/annurev.biophys.37.032807.125957
[8]
Sun L Z, Zhang D, Chen S J 2017 Annu. Rev. Biophys. 46 227 DOI: 10.1146/annurev-biophys-070816-033920
[9]
Sponer J, Bussi G, Krepl M, Banas P, Bottaro S, Cunha R A, Gil-Ley A, Pinamonti G, Poblete S, Jurecka P, Walter N G, Otyepka M 2018 Chem. Rev. 118 4177 DOI: 10.1021/acs.chemrev.7b00427
[10]
Dans P D, Gallego D, Balaceanu A, Darre L, Gomez H, Orozco M 2019 Chem 5 51 DOI: 10.1016/j.chempr.2018.09.015
[11]
Shi Y Z, Wu Y Y, Wang F H, Tan Z J 2014 Chin. Phys. B 23 078701 DOI: 10.1088/1674-1056/23/7/078701
[12]
Goodfellow I, Bengio Y, Courville A 2016 Deep Learning (Adaptive Computation and Machine Learning) (Cambridge: The MIT Press) pp. 197-200
[13]
Silver D, Huang A, Maddison C J, Guez A, Sifre L, van den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S, Grewe D, Nham H, Kalchbrenner N, Sutskever I, Lillicrap T, Leach M, Kavukcuoglu K, Graepel T, Hassabis D 2016 Nature 529 484 DOI: 10.1038/nature16961
[14]
Alipanahi B, Delong A, Weirauch M T, Frey B J 2015 Nat. Biotech. 33 831 DOI: 10.1038/nbt.3300
[15]
Zhou J, Troyanskaya O G 2015 Nat. Methods 12 931 DOI: 10.1038/nmeth.3547
[16]
Carleo G, Troyer M 2017 Science 355 602 DOI: 10.1126/science.aag2302
[17]
Carrasquilla J, Melko R G 2017 Nat. Phys. 13 431 DOI: 10.1038/nphys4035
[18]
Yonemoto H, Asai K, Hamada M 2015 Comput. Biol. Chem. 57 72 DOI: 10.1016/j.compbiolchem.2015.02.002
[19]
Ray S S, Pal S K 2013 IEEE/ACM Trans. Comput. Biol. Bioinformatics 10 1 DOI: 10.1109/TCBB.2012.159
[20]
Koessler D R, Knisley D, Knisley J, Haynes T 2010 BMC Bioinformatics 11 S21 DOI: 10.1186/1471-2105-11-S6-S21
[21]
Tan Y L, Feng C J, Jin L, Shi Y Z, Zhang W B, Tan Z J 2019 RNA 25 793 DOI: 10.1261/rna.069872.118
[22]
Yang Y, Gu Q, Zhang B G, Shi Y Z, Shao Z G 2018 Chin. Phys. B 27 038701 DOI: 10.1088/1674-1056/27/3/038701
[23]
Wang Y Z, Li J, Zhang S, Huang B, Yao G, Zhang J 2019 Molecular Biol. 53 118 DOI: 10.1134/S0026893319010175
[24]
Tsai J, Bonneau R, Morozov A V, Kuhlman B, Rohl C A, Baker D 2003 Proteins 53 76 DOI: 10.1002/(ISSN)1097-0134
[25]
Capriotti E, Norambuena T, Marti-Renom M A, Melo F 2011 Bioinformatics 27 1086 DOI: 10.1093/bioinformatics/btr093
[26]
Li J, Zhu W, Wang J, Li W F, Gong S, Zhang J, Wang W 2018 PLoS Comput. Biol. 14 e1006514 DOI: 10.1371/journal.pcbi.1006514
[27]
Simonyan K, Zisserman A 2014 arXiv:1409.1556v6
[28]
Das R, Baker D 2007 Proc. Natl. Acad. Sci. USA 104 14664 DOI: 10.1073/pnas.0703836104
[29]
Das R, Karanicolas J, Baker D 2010 Nat. Methods 7 291 DOI: 10.1038/nmeth.1433
[30]
Bernauer J, Huang X H, Sim A, Levitt M 2011 RNA 17 1066 DOI: 10.1261/rna.2543711
[31]
Cruz J A, Blanchet M F, Boniecki M et al. 2012 RNA 18 610 DOI: 10.1261/rna.031054.111
[32]
Miao Z, Adamiak R W, Blanchet M F et al. 2015 RNA 21 1066 DOI: 10.1261/rna.049502.114
[33]
Wang J, Zhao Y J, Zhu C Y, Xiao Y 2015 Nuc. Acids Res. 43 e63 DOI: 10.1093/nar/gkv141
[34]
Frellsen J, Moltke I, Thiim M, Mardia K V, Ferkinghoff-Borg J, Hamelryck T 2009 PLoS Comput. Biol. 5 e1000406 DOI: 10.1371/journal.pcbi.1000406
[35]
Wang Z, Xu J 2011 Bioinformatics 27 i102 DOI: 10.1093/bioinformatics/btr232
[36]
Miao Z, Westhof E 2017 Annu. Rev. Biophys. 46 483 DOI: 10.1146/annurev-biophys-070816-034125
[37]
Cruz J A, Westhof E 2011 Nat. Methods 8 513 DOI: 10.1038/nmeth.1603
[38]
Theis C, Siederdissen C H, Hofacker I L, Gorodkin J 2013 Nuc. Acids Res. 41 9999 DOI: 10.1093/nar/gkt795
[39]
Zirbel C, Roll J, Sweeney B A, Petrov A I, Pirrung M, Leontis N B 2015 Nuc. Acids Res. 43 7504 DOI: 10.1093/nar/gkv651
[40]
Theis C, Zirbel C L, Siederdissen C H, Anthon C, Hofacker I L, Nielsen H, Gorodkin J 2015 PLoS One 10 e0139900 DOI: 10.1371/journal.pone.0139900
[41]
Manning G S 2007 J. Phys. Chem. B 111 8554 DOI: 10.1021/jp0670844
[42]
Baker N A 2005 Curr. Opin. Struct. Biol. 15 137 DOI: 10.1016/j.sbi.2005.02.001
[43]
Xiong G, Xi K, Zhang X, Tan Z J 2018 Chin. Phys. B 27 018203 DOI: 10.1088/1674-1056/27/1/018203
[44]
Tan Z J, Chen S J 2005 J. Chem. Phys. 122 44903 DOI: 10.1063/1.1842059
[45]
Tan Z J, Chen S J 2006 Biophys. J. 90 1175 DOI: 10.1529/biophysj.105.070904
[46]
Tan Z J, Chen S J 2010 Biophys. J. 99 1565 DOI: 10.1016/j.bpj.2010.06.029
[47]
Tan Z J, Chen S J 2011 Biophys. J. 101 176 DOI: 10.1016/j.bpj.2011.05.050
[48]
Shi Y Z, Jin L, Feng C J, Tan Y L, Tan Z J 2018 PLoS Comput. Biol. 14 e1006222 DOI: 10.1371/journal.pcbi.1006222
[49]
Jin L, Tan Y L, Wu Y, Wang X, Shi Y Z, Tan Z J 2019 RNA 25 1532 DOI: 10.1261/rna.071662.119
[50]
Wang J M, Cieplak P, Li J, Wang J, Cai Q, Hsieh M J, Lei H X, Luo R, Duan Y 2011 J. Phys. Chem. B 115 3100 DOI: 10.1021/jp1121382
[51]
Li Y, Li H, Pickard F C, Narayanan B, Sen F G, Chan M, Sankaranarayanan S, Brooks B R, Roux B 2017 J. Chem. Theory Comput. 13 4492 DOI: 10.1021/acs.jctc.7b00521
[52]
Bereau T, DiStasio R A, Tkatchenko A, Lilienfeld O A 2018 J. Chem. Phys. 148 241706 DOI: 10.1063/1.5009502
[53]
Wang H, Yang W 2018 J. Phys. Chem. Lett. 9 3232 DOI: 10.1021/acs.jpclett.8b01131
[54]
Popelier P L A 2016 Physica Scripta 91 033007 DOI: 10.1088/0031-8949/91/3/033007
[55]
Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y Q 2018 Bioinformatics 34 4039 DOI: 10.1093/bioinformatics/bty481
[56]
Wang S, Sun S, Xu J B 2017 Proteins 86 67 DOI: 10.1002/prot.25377
[57]
Kandathil S M, Greener J G, Jones D T 2019 Proteins 87 1179 DOI: 10.1002/prot.v87.12
[58]
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J 2019 Proteins 87 1011 DOI: 10.1002/prot.v87.12
[59]
Senior A W, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zidek A, Nelson A, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones D T, Silver D, Kavukcuoglu K, Hassabis D 2020 Nature 577 706 DOI: 10.1038/s41586-019-1923-7
[60]
Weinreb C, Riesselman A J, Ingraham J B, Gross T, Sander C, Marks D S 2016 Cell 165 963 DOI: 10.1016/j.cell.2016.03.030
[61]
Leonardis E D, Lutz B, Ratz S, Cocco S, Monasson R, Schug A, Weigt M 2015 Nuc. Acids Res. 43 10444 DOI: 10.1093/nar/gkv932
[62]
Wang J, Mao K, Zhao Y J, Zeng C, Xiang J, Zhang Y, Xiao Y 2017 Nuc. Acids Res. 45 6299 DOI: 10.1093/nar/gkx386
[63]
Zhao Y, Huang Y, Gong Z, Wang Y, Man J, Xiao Y 2012 Scientific Reports 2 734 DOI: 10.1038/srep00734
[64]
Wang J, Xiao Y 2017 Curr. Protoc. Bioinformatics 57 5 DOI: 10.1002/cpbi.21
[65]
Wang J, Wang J, Huang Y, Xiao Y 2019 Int. J. Mol. Sci. 20 4116 DOI: 10.3390/ijms20174116
[66]
He X L, Li S M, Ou X J, Wang J, Xiao Y 2019 Commun. Inf. Syst. 19 279 DOI: 10.4310/CIS.2019.v19.n3.a3
[67]
Singh J, Hanson J, Paliwal K, Zhou Y Q 2019 Nat. Commun. 10 5407 DOI: 10.1038/s41467-019-13395-9
[68]
Zhang H, Zhang Q, Ju F, Zhu J, Gao Y, Xie Z, Deng M, Sun S, Zheng W M, Bu D B 2019 BMC Bioinformatics 20 537 DOI: 10.1186/s12859-019-3051-7
[69]
Bao L, Zhang X, Jin L, Tan Z J 2016 Chin. Phys. B 25 018703 DOI: 10.1088/1674-1056/25/1/018703
[70]
Kalvari I, Argasinska J, Quinones-Olvera N, Nawrocki E P, Rivas E, Eddy S R, Bateman A, Finn R D, Petrov A 2018 Nuc. Acids Res. 46 D335 DOI: 10.1093/nar/gkx1038
[71]
Wang J X, Kurth-Nelson Z, Tirumala D, Soyer H, Leibo J Z, Munos R, Blundell C, Kumaran D, Botvinick M 2017 arXiv:1611.05763v3
[72]
Zhou Z H 2018 National Science Review 5 44 DOI: 10.1093/nsr/nwx106
[73]
Wang Y, Yao Q, Kwok J K, Ni L M 2020 ACM Computing Surveys 53 63 DOI: 10.1145/3386252