Chin. Phys. B, 2020, Vol. 29(10): 108704    DOI: 10.1088/1674-1056/abb303
Special Issue: SPECIAL TOPIC — Modeling and simulations for the structures and functions of proteins and nucleic acids

Computational prediction of RNA tertiary structures using machine learning methods

Bin Huang(黄斌)1,2, Yuanyang Du(杜渊洋)1,2, Shuai Zhang(张帅)1,2, Wenfei Li(李文飞)1,2, Jun Wang (王骏)1,2, and Jian Zhang(张建)1,2,
1 National Laboratory of Solid State Microstructures, School of Physics, Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
2 Institute for Brain Sciences, Kuang Yaming Honors School, Nanjing University, Nanjing 210093, China
Abstract  

RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help us understand RNA structures and their stabilizing factors, thus providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although they have long been used in protein-related fields, their use in predicting RNA tertiary structures is new and still rare. Here, we review recent advances in applying ML methods to RNA structure prediction and discuss the advantages and limitations, as well as the difficulties and potential, of these approaches when applied in the field.

Keywords: RNA structure prediction, RNA scoring function, knowledge-based potentials, machine learning, convolutional neural networks
Received: 27 June 2020; Revised: 22 August 2020; Accepted manuscript online: 27 August 2020
PACS:  87.15.B- (Structure of biomolecules)  
  87.14.gn (RNA)  
  07.05.Mh (Neural networks, fuzzy logic, artificial intelligence)  
†Corresponding author. E-mail: jzhang@nju.edu.cn
* Project supported by the National Natural Science Foundation of China (Grant Nos. 11774158, 11974173, 11774157, and 11934008).

Cite this article: 

Bin Huang(黄斌), Yuanyang Du(杜渊洋), Shuai Zhang(张帅), Wenfei Li(李文飞), Jun Wang(王骏), and Jian Zhang(张建)†. Computational prediction of RNA tertiary structures using machine learning methods. 2020 Chin. Phys. B 29 108704

Fig. 1.  

The architecture of the multilayer perceptron used in Ref. [23]. It contains a single hidden layer. The inputs are structural features, and the output is a score that indicates the quality of each structural candidate.
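As a rough illustration of the kind of network described in this caption, the following is a minimal sketch of a single-hidden-layer perceptron that maps a feature vector to one score. The feature count, hidden-layer width, tanh and sigmoid activations, and the untrained random weights are all assumptions made for illustration; they are not taken from Ref. [23].

```python
import math
import random


def init_mlp(n_features, n_hidden, seed=0):
    """Create small random weights for one hidden layer and one output unit."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.gauss(0.0, 0.1) for _ in range(n_features)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.gauss(0.0, 0.1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def score(params, features):
    """Forward pass: features -> hidden layer (tanh) -> single score in (0, 1)."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    logit = sum(w * h for w, h in zip(params["W2"], hidden)) + params["b2"]
    # The sigmoid output is an assumption; it only keeps the score bounded.
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical structural features for three candidate structures.
params = init_mlp(n_features=4, n_hidden=8)
candidates = [
    [0.2, -1.0, 0.5, 0.0],
    [1.0, 0.3, -0.7, 0.9],
    [0.0, 0.0, 0.0, 0.0],
]
scores = [score(params, c) for c in candidates]
best = max(range(len(candidates)), key=lambda i: scores[i])
print(scores, best)
```

With trained weights, the candidate with the highest score would be taken as the best model; here the weights are random, so the ranking carries no meaning.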

Fig. 2.  

The architecture of the convolutional neural network (CNN) used in Ref. [26]. Note that not all convolutional layers are shown due to space limitations. Each cube represents a 3D image. The input layer has three channels, similar to the RGB channels in 2D images. The output is a single score indicating how closely the input structure resembles the native structure.
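To make the idea of a three-channel 3D image concrete, here is a minimal sketch that bins atoms into a small voxel grid and applies one 3D convolution followed by global averaging to obtain a single number. The channel meanings (atom count, mass, charge), the grid size, the box length, and the hand-set kernel are hypothetical choices for illustration; the actual network in Ref. [26] has many learned layers.

```python
def voxelize(atoms, grid_size=4, box=8.0):
    """Bin atoms (x, y, z, mass, charge) into a grid[channel][i][j][k].

    Channels (assumed for illustration): 0 = atom count, 1 = mass, 2 = charge.
    """
    n = grid_size
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(3)]
    step = box / n
    for x, y, z, mass, charge in atoms:
        i, j, k = int(x // step), int(y // step), int(z // step)
        if 0 <= i < n and 0 <= j < n and 0 <= k < n:
            grid[0][i][j][k] += 1.0
            grid[1][i][j][k] += mass
            grid[2][i][j][k] += charge
    return grid


def conv3d_relu(grid, kernel):
    """One-filter 3D convolution (cross-correlation) without padding, then ReLU.

    The filter spans all input channels, like a filter over RGB in 2D images.
    """
    n, m = len(grid[0]), len(kernel[0])
    out_n = n - m + 1
    out = [[[0.0] * out_n for _ in range(out_n)] for _ in range(out_n)]
    for i in range(out_n):
        for j in range(out_n):
            for k in range(out_n):
                s = 0.0
                for c in range(len(grid)):
                    for a in range(m):
                        for b in range(m):
                            for d in range(m):
                                s += grid[c][i + a][j + b][k + d] * kernel[c][a][b][d]
                out[i][j][k] = max(s, 0.0)
    return out


def global_mean(volume):
    """Collapse a 3D feature map to one number."""
    values = [v for plane in volume for row in plane for v in row]
    return sum(values) / len(values)


# Three hypothetical atoms inside an 8 x 8 x 8 box.
atoms = [(1.0, 1.0, 1.0, 12.0, 0.1), (1.5, 1.2, 1.1, 16.0, -0.5), (6.0, 6.0, 6.0, 14.0, 0.0)]
grid = voxelize(atoms)

# Hand-set 2x2x2 kernel that only looks at the count channel.
ones = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
zeros = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
kernel = [ones, zeros, zeros]

feature_map = conv3d_relu(grid, kernel)
cnn_score = global_mean(feature_map)
print(cnn_score)
```

In a trained model the kernels are learned and several convolutional layers are stacked before the final score; the single hand-set filter here only shows how the channels and the 3D window combine.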

3dRNAscore KB RASP Rosetta CNN model
Dataset-I 84/85 80/85 79/85 53/85 62/85
Dataset-II 17/20 20/20 12/20 12/20 19/20
Dataset-III 5/18 1/18 4/18 13/18
Table 1.  

The performance of different scoring functions. In each cell, the first number is the number of RNAs whose native structures are correctly identified, and the second is the total number of RNAs in the dataset.[26] The bold number indicates the best result for each dataset.
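The cells of Table 1 can be read as success rates. The short sketch below converts the "identified/total" entries into fractions and picks the best scoring function per dataset. Only Dataset-I and Dataset-II are included, because the Dataset-III row lists four values for five columns and the column assignment is not clear from the table as given; the dictionary layout is an illustrative choice.

```python
# Entries copied from Table 1 (Dataset-I and Dataset-II only).
table = {
    "Dataset-I": {"3dRNAscore": "84/85", "KB": "80/85", "RASP": "79/85",
                  "Rosetta": "53/85", "CNN model": "62/85"},
    "Dataset-II": {"3dRNAscore": "17/20", "KB": "20/20", "RASP": "12/20",
                   "Rosetta": "12/20", "CNN model": "19/20"},
}


def success_rate(cell):
    """Convert an 'identified/total' cell into a fraction."""
    identified, total = cell.split("/")
    return int(identified) / int(total)


rates = {
    dataset: {method: success_rate(cell) for method, cell in row.items()}
    for dataset, row in table.items()
}
best = {dataset: max(row, key=row.get) for dataset, row in rates.items()}

for dataset, row in rates.items():
    summary = ", ".join(f"{method}: {rate:.2f}" for method, rate in row.items())
    print(f"{dataset}: {summary} (best: {best[dataset]})")
```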

[1]
Mercer T R, Dinger M E, Mattick J S 2009 Nat. Rev. Genetics 10 155 DOI: 10.1038/nrg2521
[2]
Geisler S, Coller J 2013 Nat. Rev. Mol. Cell Biol. 14 699 DOI: 10.1038/nrm3679
[3]
Cech T R, Steitz J A 2014 Cell 157 77 DOI: 10.1016/j.cell.2014.03.008
[4]
Morris K V, Mattick J S 2014 Nat. Rev. Genetics 15 423 DOI: 10.1038/nrg3722
[5]
Anastasiadou E, Jacob L S, Slack F J 2018 Nat. Rev. Cancer 18 5 DOI: 10.1038/nrc.2017.99
[6]
Miao Z, Adamiak R W, Antczak M et al. 2017 RNA 23 655 DOI: 10.1261/rna.060368.116
[7]
Chen S J 2008 Annu. Rev. Biophys. 37 197 DOI: 10.1146/annurev.biophys.37.032807.125957
[8]
Sun L Z, Zhang D, Chen S J 2017 Annu. Rev. Biophys. 46 227 DOI: 10.1146/annurev-biophys-070816-033920
[9]
Sponer J, Bussi G, Krepl M, Banas P, Bottaro S, Cunha R A, Gil-Ley A, Pinamonti G, Poblete S, Jurecka P, Walter N G, Otyepka M 2018 Chem. Rev. 118 4177 DOI: 10.1021/acs.chemrev.7b00427
[10]
Dans P D, Gallego D, Balaceanu A, Darre L, Gomez H, Orozco M 2019 Chem 5 51 DOI: 10.1016/j.chempr.2018.09.015
[11]
Shi Y Z, Wu Y Y, Wang F H, Tan Z J 2014 Chin. Phys. B 23 078701 DOI: 10.1088/1674-1056/23/7/078701
[12]
Goodfellow I, Bengio Y, Courville A 2016 Deep Learning (Adaptive Computation and Machine Learning) (Cambridge: The MIT Press) pp. 197-200
[13]
Silver D, Huang A, Maddison C J, Guez A, Sifre L, van den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S, Grewe D, Nham H, Kalchbrenner N, Sutskever I, Lillicrap T, Leach M, Kavukcuoglu K, Graepel T, Hassabis D 2016 Nature 529 484 DOI: 10.1038/nature16961
[14]
Alipanahi B, Delong A, Weirauch M T, Frey B J 2015 Nat. Biotech. 33 831 DOI: 10.1038/nbt.3300
[15]
Zhou J, Troyanskaya O G 2015 Nat. Methods 12 931 DOI: 10.1038/nmeth.3547
[16]
Carleo G, Troyer M 2017 Science 355 602 DOI: 10.1126/science.aag2302
[17]
Carrasquilla J, Melko R G 2017 Nat. Phys. 13 431 DOI: 10.1038/nphys4035
[18]
Yonemoto H, Asai K, Hamada M 2015 Comput. Biol. Chem. 57 72 DOI: 10.1016/j.compbiolchem.2015.02.002
[19]
Ray S S, Pal S K 2013 IEEE/ACM Trans. Comput. Biol. Bioinformatics 10 1 DOI: 10.1109/TCBB.2012.159
[20]
Koessler D R, Knisley D, Knisley J, Haynes T 2010 BMC Bioinformatics 11 S21 DOI: 10.1186/1471-2105-11-S6-S21
[21]
Tan Y L, Feng C J, Jin L, Shi Y Z, Zhang W B, Tan Z J 2019 RNA 25 793 DOI: 10.1261/rna.069872.118
[22]
Yang Y, Gu Q, Zhang B G, Shi Y Z, Shao Z G 2018 Chin. Phys. B 27 038701 DOI: 10.1088/1674-1056/27/3/038701
[23]
Wang Y Z, Li J, Zhang S, Huang B, Yao G, Zhang J 2019 Molecular Biol. 53 118 DOI: 10.1134/S0026893319010175
[24]
Tsai J, Bonneau R, Morozov A V, Kuhlman B, Rohl C A, Baker D 2003 Proteins 53 76 DOI: 10.1002/(ISSN)1097-0134
[25]
Capriotti E, Norambuena T, Marti-Renom M A, Melo F 2011 Bioinformatics 27 1086 DOI: 10.1093/bioinformatics/btr093
[26]
Li J, Zhu W, Wang J, Li W F, Gong S, Zhang J, Wang W 2018 Plos Comput. Biol. 14 e1006514 DOI: 10.1371/journal.pcbi.1006514
[27]
Simonyan K, Zisserman A 2014 arXiv:1409.1556v6
[28]
Das R, Baker D 2007 Proc. Natl. Acad. Sci. USA 104 14664 DOI: 10.1073/pnas.0703836104
[29]
Das R, Karanicolas J, Baker D 2010 Nature Methods 7 291 DOI: 10.1038/nmeth.1433
[30]
Bernauer J, Huang X H, Sim A, Levitt M 2011 RNA 17 1066 DOI: 10.1261/rna.2543711
[31]
Cruz J A, Blanchet M F, Boniecki M et al. 2012 RNA 18 610 DOI: 10.1261/rna.031054.111
[32]
Miao Z, Adamiak R W, Blanchet M F et al. 2015 RNA 21 1066 DOI: 10.1261/rna.049502.114
[33]
Wang J, Zhao Y J, Zhu C Y, Xiao Y 2015 Nuc. Acids Res. 43 e63 DOI: 10.1093/nar/gkv141
[34]
Frellsen J, Moltke I, Thiim M, Mardia K V, Ferkinghoff-Borg J, Hamelryck T 2009 Plos Comput. Biol. 5 e1000406 DOI: 10.1371/journal.pcbi.1000406
[35]
Wang Z, Xu J 2011 Bioinformatics 27 i102 DOI: 10.1093/bioinformatics/btr232
[36]
Miao Z, Westhof E 2017 Annu. Rev. Biophys. 46 483 DOI: 10.1146/annurev-biophys-070816-034125
[37]
Cruz J A, Westhof E 2011 Nature Methods 8 513 DOI: 10.1038/nmeth.1603
[38]
Theis C, Siederdissen C H, Hofacker I L, Gorodkin J 2013 Nuc. Acids Res. 41 9999 DOI: 10.1093/nar/gkt795
[39]
Zirbel C, Roll J, Sweeney B A, Petrov A I, Pirrung M, Leontis N B 2015 Nuc. Acids Res. 43 7504 DOI: 10.1093/nar/gkv651
[40]
Theis C, Zirbel C L, Siederdissen C H, Anthon C, Hofacker I L, Nielsen H, Gorodkin J 2015 PLOS One 10 e0139900 DOI: 10.1371/journal.pone.0139900
[41]
Manning G S 2007 J. Phys. Chem. B 111 8554 DOI: 10.1021/jp0670844
[42]
Baker N A 2005 Curr. Opin. Struct. Biol. 15 137 DOI: 10.1016/j.sbi.2005.02.001
[43]
Xiong G, Xi K, Zhang X, Tan Z J 2018 Chin. Phys. B 27 018203 DOI: 10.1088/1674-1056/27/1/018203
[44]
Tan Z J, Chen S J 2005 J. Chem. Phys. 122 44903 DOI: 10.1063/1.1842059
[45]
Tan Z J, Chen S J 2006 Biophys. J. 90 1175 DOI: 10.1529/biophysj.105.070904
[46]
Tan Z J, Chen S J 2010 Biophys. J. 99 1565 DOI: 10.1016/j.bpj.2010.06.029
[47]
Tan Z J, Chen S J 2011 Biophys. J. 101 176 DOI: 10.1016/j.bpj.2011.05.050
[48]
Shi Y Z, Jin L, Feng C J, Tan Y L, Tan Z J 2018 Plos Comput. Biol. 14 e1006222 DOI: 10.1371/journal.pcbi.1006222
[49]
Jin L, Tan Y L, Wu Y, Wang X, Shi Y Z, Tan Z J 2019 RNA 25 1532 DOI: 10.1261/rna.071662.119
[50]
Wang J M, Cieplak P, Li J, Wang J, Cai Q, Hsieh M J, Lei H X, Luo R, Duan Y 2011 J. Phys. Chem. B 115 3100 DOI: 10.1021/jp1121382
[51]
Li Y, Li H, Pickard F C, Narayanan B, Sen F G, Chan M, Sankaranarayanan S, Brooks B R, Roux B 2017 J. Chem. Theory Comput. 13 4492 DOI: 10.1021/acs.jctc.7b00521
[52]
Bereau T, DiStasio R A, Tkatchenko A, Lilienfeld O A 2018 J. Chem. Phys. 148 241706 DOI: 10.1063/1.5009502
[53]
Wang H, Yang W 2018 J. Phys. Chem. Lett. 9 3232 DOI: 10.1021/acs.jpclett.8b01131
[54]
Popelier P L A 2016 Physica Scripta 91 033007 DOI: 10.1088/0031-8949/91/3/033007
[55]
Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y Q 2018 Bioinformatics 34 4039 DOI: 10.1093/bioinformatics/bty481
[56]
Wang S, Sun S, Xu J B 2017 Proteins 86 67 DOI: 10.1002/prot.25377
[57]
Kandathil S M, Greener J G, Jones D T 2019 Proteins 87 1179 DOI: 10.1002/prot.v87.12
[58]
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J 2019 Proteins 87 1011 DOI: 10.1002/prot.v87.12
[59]
Senior A W, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zidek A, Nelson A, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones D T, Silver D, Kavukcuoglu K, Hassabis D 2019 Nature 577 706 DOI: 10.1038/s41586-019-1923-7
[60]
Weinreb C, Riesselman A J, Ingraham J B, Gross T, Sander C, Marks D S 2016 Cell 165 963 DOI: 10.1016/j.cell.2016.03.030
[61]
Leonardis E D, Lutz B, Ratz S, Cocco S, Monasson R, Schug A, Weigt M 2015 Nuc. Acids Res. 43 10444 DOI: 10.1093/nar/gkv932
[62]
Wang J, Mao K, Zhao Y J, Zeng C, Xiang J, Zhang Y, Xiao Y 2017 Nuc. Acids Res. 45 6299 DOI: 10.1093/nar/gkx386
[63]
Zhao Y, Huang Y, Gong Z, Wang Y, Man J, Xiao Y 2012 Scientific Reports 2 734 DOI: 10.1038/srep00734
[64]
Wang J, Xiao Y 2017 Current Protocols in Bioinformatics 57 5 DOI: 10.1002/cpbi.21
[65]
Wang J, Wang J, Huang Y, Xiao Y 2019 Int. J. Mol. Sci. 20 4116 DOI: 10.3390/ijms20174116
[66]
He X L, Li S M, Ou X J, Wang J, Xiao Y 2019 Comm. inform. syst. 19 279 DOI: 10.4310/CIS.2019.v19.n3.a3
[67]
Singh J, Hanson J, Paliwal K, Zhou Y Q 2019 Nat. Commun. 10 5407 DOI: 10.1038/s41467-019-13395-9
[68]
Zhang H, Zhang Q, Ju F, Zhu J, Gao Y, Xie Z, Deng M, Sun S, Zheng W M, Bu D B 2019 BMC Bioinformatics 20 537 DOI: 10.1186/s12859-019-3051-7
[69]
Bao L, Zhang X, Jin L, Tan Z J 2016 Chin. Phys. B 25 018703 DOI: 10.1088/1674-1056/25/1/018703
[70]
Kalvari I, Argasinska J, Quinones-Olvera N, Nawrocki E P, Rivas E, Eddy S R, Bateman A, Finn R D, Petrov A 2018 Nuc. Acids Res. 46 D335 DOI: 10.1093/nar/gkx1038
[71]
Wang J X, Kurth-Nelson Z, Tirumala D, Soyer H, Leibo J Z, Munos R, Blundell C, Kumaran D, Botvinick M 2017 arXiv:1611.05763v3
[72]
Zhou Z H 2018 National Science Review 5 44 DOI: 10.1093/nsr/nwx106
[73]
Wang Y, Yao Q, Kwok J K, Ni L M 2020 ACM Computing Surveys 53 63 DOI: 10.1145/3386252