Special Issue:
| SPECIAL TOPIC — A celebration of the 90th Anniversary of the Birth of Bolin Hao |
RLsite: Integrating 3D-CNN and BiLSTM for RNA-ligand binding site prediction
Yan Zou(邹艳), Lang Yang(杨浪), Yanhui Liu(刘艳辉), and Yuyu Feng(冯玉宇)†
School of Physics, Guizhou University, Guiyang 550000, China
Abstract Accurate identification of RNA-ligand binding sites is essential for elucidating RNA function and advancing structure-based drug discovery. Here, we present RLsite, a novel deep learning framework that integrates energy-, structure- and sequence-based features to predict nucleotide-level binding sites with high accuracy. RLsite leverages energy-based three-dimensional representations, obtained from atomic probe interactions using a pre-trained ITScore-NL potential, and models their contextual features through a 3D convolutional neural network (3D-CNN) augmented with self-attention. In parallel, structure-based features, including network properties, Laplacian norm, and solvent-accessible surface area, together with sequence-based evolutionary constraint scores, are mapped along the RNA sequence and used as sequential descriptors. These descriptors are modeled using a bidirectional long short-term memory (BiLSTM) network enhanced with multi-head self-attention. By effectively fusing these complementary modalities, RLsite achieves robust and precise binding site prediction. Extensive evaluations across four diverse RNA-ligand benchmark datasets demonstrate that RLsite consistently outperforms state-of-the-art methods in terms of precision, recall, Matthews correlation coefficient (MCC), area under the curve (AUC), and overall robustness. Notably, on a particularly challenging test set composed of RNA structures containing junctions, RLsite surpasses the second-best method by 7.3% in precision, 3.4% in recall, 7.5% in MCC, and 10.8% in AUC, highlighting its potential as a powerful tool for RNA-targeted molecular design.
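The threshold-based metrics named in the abstract (precision, recall, and MCC) can be illustrated with a minimal sketch, assuming binary per-nucleotide labels where 1 marks a binding site. The function name `binding_site_metrics` and the toy label vectors are hypothetical and are not taken from the RLsite evaluation code. AUC is omitted because it requires continuous prediction scores rather than thresholded labels.

```python
import math


def binding_site_metrics(y_true, y_pred):
    """Return (precision, recall, mcc) for binary per-nucleotide labels.

    y_true and y_pred are equal-length sequences of 0/1 values,
    where 1 marks a nucleotide labeled or predicted as a binding site.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against empty denominators; returning 0.0 is a convention chosen here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Matthews correlation coefficient from the confusion-matrix counts.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Hypothetical 8-nucleotide RNA: 3 true binding sites, 3 predicted sites.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
precision, recall, mcc = binding_site_metrics(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} mcc={mcc:.3f}")
```

Binding nucleotides are usually a minority of an RNA chain, so MCC is a useful complement to precision and recall: it uses all four confusion-matrix counts and stays near zero for a predictor that labels everything as non-binding.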
|
Received: 06 May 2025
Revised: 22 June 2025
Accepted manuscript online: 02 July 2025
PACS: 87.14.gn (RNA); 87.15.A- (Theory, modeling, and computer simulation); 87.15.B- (Structure of biomolecules)
Fund: Project supported by the National Natural Science Foundation of China (Grant No. 12204118) and the Guizhou University Talent Fund (Grant No. [2022]30).
Corresponding author: Yuyu Feng (E-mail: fengyy@gzu.edu.cn)
Cite this article:
Yan Zou(邹艳), Lang Yang(杨浪), Yanhui Liu(刘艳辉), and Yuyu Feng(冯玉宇) RLsite: Integrating 3D-CNN and BiLSTM for RNA-ligand binding site prediction 2025 Chin. Phys. B 34 088709