Chinese Physics B, 2023, Vol. 32, Issue (5): 058902. doi: 10.1088/1674-1056/acb9f9

AG-GATCN: A novel method for predicting essential proteins

Peishi Yang(杨培实), Pengli Lu(卢鹏丽), and Teng Zhang(张腾)   

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Received: 2022-10-18 Revised: 2022-12-08 Accepted: 2023-02-08 Online: 2023-04-21 Published: 2023-05-05
  • Contact: Pengli Lu E-mail: lupengli88@163.com
  • Supported by:
    Project supported by the National Natural Science Foundation of China (Grant Nos. 11861045, 11361033, and 62162040).

Abstract: Essential proteins play an important role in disease diagnosis and drug development. Many methods have been proposed to predict essential proteins from various kinds of biological information. However, they ignore either the noise present in the biological information itself or the noise generated during feature extraction. To overcome these problems, we propose a novel method for predicting essential proteins, called attention gate-graph attention network and temporal convolutional network (AG-GATCN). In AG-GATCN, we use an improved temporal convolutional network (TCN) to extract features from gene expression sequences. To address both the noise in the gene expression sequence itself and the noise generated after the dilated causal convolution, we introduce an attention mechanism and a gating mechanism into the TCN. In addition, we use a graph attention network (GAT) to extract features from the protein-protein interaction (PPI) network, constructing the feature matrix with the node2vec technique and 7 centrality metrics. To address the GAT oversmoothing problem, we introduce a gated tanh unit (GTU) into the GAT. Finally, we integrate the two types of features to predict essential proteins. Experimental results show that AG-GATCN achieves better performance than existing methods for predicting essential proteins.

Keywords: complex networks, essential proteins, temporal convolutional network, graph attention network, gene expression
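
The two building blocks named in the abstract, the dilated causal convolution and the gated tanh unit (GTU), can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy expression profile, and the filter weights are hypothetical, and applying the GTU to two convolution outputs is an assumption made only for illustration, since the abstract does not state how the gate is wired into the TCN or the GAT.

```python
import math


def dilated_causal_conv(x, weights, dilation):
    """1-D dilated causal convolution over a sequence x.

    The output at time t depends only on x[t], x[t - d], x[t - 2d], ...
    Positions before the start of the sequence count as zero, so the
    output has the same length as the input.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # never look at future or out-of-range samples
                acc += w * x[idx]
        out.append(acc)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_tanh_unit(a, b):
    """GTU: elementwise tanh(a) * sigmoid(b)."""
    return [math.tanh(ai) * sigmoid(bi) for ai, bi in zip(a, b)]


# Hypothetical toy gene expression profile with six time points.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Two illustrative filters: one produces the content, one the gate.
content = dilated_causal_conv(expression, [0.5, 0.5], dilation=2)
gate = dilated_causal_conv(expression, [1.0, -1.0], dilation=2)

# The sigmoid gate scales each tanh-squashed content value into (-1, 1).
features = gated_tanh_unit(content, gate)
```

Because the sigmoid factor lies in (0, 1), the gate can damp individual positions of the convolved signal, which is one plausible way to read the abstract's use of gating against noise.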

Classification number: 89.75.-k (Complex systems)