Chin. Phys. B, 2020, Vol. 29(10): 108707    DOI: 10.1088/1674-1056/abb659
Special Issue: SPECIAL TOPIC — Modeling and simulations for the structures and functions of proteins and nucleic acids
Review of multimer protein–protein interaction complex topology and structure prediction

Daiwen Sun(孙黛雯)1, Shijie Liu(刘世婕)1, and Xinqi Gong(龚新奇)1,2
1 Mathematics Intelligence Application Laboratory, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
2 Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100094, China

Protein–protein interactions (PPIs) are important for many biological processes. A theoretical understanding of the structural factors that determine interaction sites helps to explain the underlying mechanisms of protein–protein interactions, and knowledge of protein complex structures helps to explore their functions. In addition, accurately predicting protein complexes from PPI networks helps us understand the relationships between proteins. In the past few decades, researchers have proposed many methods for predicting protein interactions and protein complex structures. In this review, we first briefly introduce the methods and servers for predicting protein interaction sites and interface residue pairs, and then introduce protein complex structure prediction methods, including template-based and template-free prediction. Subsequently, we introduce methods for predicting protein complexes from PPI networks and methods for predicting missing links in PPI networks. Finally, we briefly summarize the application of machine learning and deep learning models in protein structure prediction and interaction site prediction.

Keywords: protein complex prediction, protein–protein interaction
Received: 29 June 2020; Revised: 31 August 2020; Published: 05 October 2020
PACS: (Protein-protein interactions)  
  01.50.hv (Computer software and software reviews)  
* Project supported by the National Natural Science Foundation of China (Grant No. 31670725).

Daiwen Sun(孙黛雯), Shijie Liu(刘世婕), and Xinqi Gong(龚新奇)† Review of multimer protein–protein interaction complex topology and structure prediction 2020 Chin. Phys. B 29 108707

Fig. 1. The main content of protein interaction calculation.

Fig. 2. Statistics of protein multimers in the PDB database.

| Optimization algorithm | Programs |
| --- | --- |
| Fast Fourier transform (FFT) | ZDOCK; GRAMM; DOT; SmoothDock; ClusPro; MolFit; FTDock; 3D-Dock; PIPER; pyDock; HDOCK; SDOCK; HEX; FRODOCK; InterEvDock; MDockPP; CoDockPP; HSYMDOCK; SAM |
| Monte Carlo (MC) | RosettaDock; ICM-DOCK; HADDOCK; ATTRACT |
| Genetic algorithm (GA) | DARWIN; Multi-LZerD; AutoDock |

Table 1. Classification of the optimization algorithms used by protein–protein docking approaches.
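
To make the FFT row of Table 1 concrete, the following is a minimal one-dimensional sketch of the correlation trick that FFT-based docking relies on: the shape-complementarity score for every translation of the ligand is obtained at once from a product of Fourier transforms. It is not the implementation of any program listed above. The grid values, the function names `fft` and `correlation_scores`, and the toy receptor and ligand are all assumptions made for illustration; real docking programs work on three-dimensional grids and also sample rotations.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    sign = 1 if inverse else -1
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlation_scores(receptor, ligand):
    """Return scores[t] = sum_x receptor[x] * ligand[(x - t) % n] for all shifts t."""
    n = len(receptor)
    if n != len(ligand) or n & (n - 1):
        raise ValueError("grids must have equal power-of-two length")
    r_hat = fft(receptor)
    l_hat = fft(ligand)
    # Correlation theorem: transform of the scores is R[k] * conj(L[k]).
    product = [r * l.conjugate() for r, l in zip(r_hat, l_hat)]
    return [round((v / n).real, 6) for v in fft(product, inverse=True)]


# Hypothetical toy grids: +1 marks receptor surface cells, -5 penalizes the core.
receptor = [0, 1, -5, -5, 1, 1, 0, 0]
# The ligand occupies two adjacent cells before any shift.
ligand = [1, 1, 0, 0, 0, 0, 0, 0]

scores = correlation_scores(receptor, ligand)
best_shift = max(range(len(scores)), key=scores.__getitem__)
print(best_shift, scores[best_shift])
```

In this toy case the highest score is found at the shift that places both ligand cells on receptor surface cells without touching the penalized core. The gain from the FFT is that all shifts are scored in O(n log n) rather than O(n^2) operations, which is what makes exhaustive translational search affordable on large grids.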

Fig. 3. The system of protein complex prediction.

| Category | Subcategory | Methods | Description | Advantages | Limitations |
| --- | --- | --- | --- | --- | --- |
| Interface residue pair prediction | | ComplexContact; RaptorX-Contact; RaptorX-Property; Gremlin; DNCON2; PSICOV; FreeContact; LSTM; LSTM with Graph Representation | Direct evolutionary coupling analysis (DCA), machine learning, and deep learning methods. | Interface residue pair prediction can help subsequent protein complex structure prediction, such as docking. Protein contact map prediction can help reconstruct the three-dimensional structure of protein complexes. | The accuracy of interface residue prediction needs to be improved. |
| Protein structure prediction | Template-free | ZDOCK; GRAMM; DOT; SmoothDock; ClusPro; MolFit; FTDock; 3D-Dock; PIPER; pyDock; HDOCK; SDOCK; HEX; FRODOCK; InterEvDock; MDockPP; CoDockPP; HSYMDOCK; SAM; RosettaDock; ICM-DOCK; HADDOCK; ATTRACT; DARWIN; Multi-LZerD; AutoDock | The search strategies of these methods are mainly FFT, GA, and MC. | Protein docking can generate all possible complex structures, and some programs can also dock complexes with Cn and Dn symmetry. | Designing an effective scoring function to rank the docked structures remains to be further explored. |
| Protein structure prediction | Template-based | InterPreTS; Multimeric threading approach; M-Tasser; PISA; ProtCID | Models protein complexes using sequence or structure similarity to complexes with known structures. | Template-based methods mainly narrow the set of possible structures by restricting the binding orientation. They are more efficient than docking and can be applied to larger-scale protein complex prediction. | For proteins without a template, the structure of the complex cannot be predicted. |
| Protein complex prediction from PPI networks | Complex prediction based on PPI network clustering | MCODE; MCL; SPC; LCMA; SuperComplex; BN; CFinder; DPClus; IPCA; CMC; ClusterONE; HACO | The protein complex is part of a known PPI network; that is, the graph composed of the proteins in a complex and their interactions is a subgraph of the PPI network. | Some of these methods use only the PPI network for clustering, and some use additional biological information, including structure, function, organization, and co-evolution information. | Proteins that may form complexes can only be picked out from the existing PPI network. |
| Protein complex prediction from PPI networks | Complex interaction link prediction from PPI network | GGA; HAC; ECT; RWR; MDS; Link-weighted PPI | Methods to predict actual links in the network include common-neighbor-based methods and distance-based methods. | This type of method predicts possible protein–protein interactions based on existing network information. | Common-neighbor-based methods have limited effect on sparse networks. |

Table 2. Summary of protein complex calculations.
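
As an illustration of the neighbors-based link prediction summarized in the last row of Table 2, the sketch below scores each pair of proteins that is not yet connected by the number of interaction partners the two proteins share. This is only the generic shared-neighbor count, not the algorithm of any method named in the table; the function name `shared_neighbor_scores`, the protein labels, and the toy edge list are hypothetical.

```python
from itertools import combinations


def shared_neighbor_scores(edges):
    """Score each non-adjacent protein pair by its number of shared partners."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already an observed interaction, nothing to predict
        shared = len(neighbors[u] & neighbors[v])
        if shared:
            scores[(u, v)] = shared
    return scores


# Hypothetical toy PPI network; the labels are placeholders, not real proteins.
toy_edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]

# Rank candidate links: more shared partners first, ties broken alphabetically.
ranked = sorted(
    shared_neighbor_scores(toy_edges).items(),
    key=lambda item: (-item[1], item[0]),
)
for pair, score in ranked:
    print(pair, score)
```

In the toy network the pair A and E never receives a score because the two proteins share no partner. This is the sparse-network limitation noted in the table: when most pairs have no shared neighbors, the count cannot distinguish between them, and distance-based scores are one alternative.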

