Review of multimer protein–protein interaction complex topology and structure prediction
Daiwen Sun(孙黛雯)1, Shijie Liu(刘世婕)1, and Xinqi Gong(龚新奇)1,2,†
1Mathematics Intelligence Application Laboratory, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
2Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100094, China
Protein–protein interactions (PPIs) are important for many biological processes. A theoretical understanding of the structural factors that determine interaction sites helps to clarify the underlying mechanisms of protein–protein interactions. Likewise, knowing the structure of a protein complex helps to explore its function, and accurately predicting protein complexes from PPI networks helps us understand the relationships between proteins. Over the past few decades, many methods have been proposed for predicting protein interactions and protein complex structures. In this review, we first briefly introduce the methods and servers for predicting protein interaction sites and interface residue pairs, and then introduce methods for predicting protein complex structures, including template-based and template-free prediction. We then introduce methods for predicting protein complexes from PPI networks and for predicting missing links in PPI networks. Finally, we briefly summarize the application of machine learning and deep learning models to protein structure prediction and interaction site prediction.
Direct evolutionary coupling analysis (DCA), machine learning and deep learning methods
Interfacial residue pair prediction can help subsequent protein complex structure prediction steps, such as docking. Protein contact map prediction can help reconstruct the three-dimensional structure of protein complexes.
The accuracy of interface residue prediction needs to be improved.
Models protein complexes from their sequence or structure similarity to complexes with known structures.
Template-based methods mainly narrow the space of possible structures by restricting the binding orientation of the proteins. This approach is more efficient than docking and can be applied to larger-scale protein complex prediction.
For proteins without a template, the structure of the complex cannot be predicted.
Protein complex prediction from PPI networks
Complex prediction based on PPI network clustering
The protein complex is part of a known PPI network, that is, the graph composed of protein complexes and their interactions is a subgraph of the PPI network.
Some of these methods use only the PPI network for clustering, while others use additional biological information, including structural, functional, organizational and co-evolutionary information.
The proteins that may form complexes can only be picked out from the existing PPI network.
Complex interaction link prediction from PPI network
GGA; HAC; ECT; RWR; MDS; Link-weighted PPI
Methods to predict actual links in the network include common-neighbor-based methods and distance-based methods.
This type of method predicts possible protein–protein interactions based on existing network information.
Common-neighbor-based methods have limited effect on sparse networks.
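The two families of link prediction scores mentioned above, shared-neighbor (common-neighbor) counts and network distance, can be illustrated with a minimal sketch. It is not the implementation of any method listed here: the toy network, the protein labels A to E, and the function names are all hypothetical, and the ranking rule (more shared neighbors first, then shorter distance) is an assumption chosen for illustration.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {protein: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbor_score(adj, u, v):
    """Number of interaction partners shared by proteins u and v."""
    return len(adj[u] & adj[v])


def shortest_path_length(adj, source, target):
    """Breadth-first search distance in edges; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def rank_candidate_links(adj):
    """Score every non-adjacent pair and rank the candidate links.

    Assumed ranking: more common neighbors first, then shorter distance.
    """
    candidates = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already an observed interaction
        candidates.append(
            (u, v, common_neighbor_score(adj, u, v), shortest_path_length(adj, u, v))
        )
    candidates.sort(
        key=lambda c: (-c[2], c[3] if c[3] is not None else float("inf"), c[0], c[1])
    )
    return candidates


# Hypothetical toy PPI network (labels are placeholders, not real proteins).
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
adj = build_adjacency(toy_edges)
ranked = rank_candidate_links(adj)

for u, v, shared, dist in ranked:
    print(f"{u}-{v}: common neighbors = {shared}, distance = {dist}")
```

In this toy network the pair A-E shares no neighbors, so a common-neighbor score cannot separate it from any other pair that scores zero, whereas the distance still provides a signal. This is one way to see why neighbor-based scores lose discriminating power on sparse networks.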