Chinese Physics B ›› 2025, Vol. 34 ›› Issue (5): 50701-050701. doi: 10.1088/1674-1056/adbedd

Special Topic: Computational programs in complex systems


Text-guided diverse-expression diffusion model for molecule generation

Wenchao Weng(翁文超)1,†, Hanyu Jiang(蒋涵羽)2,†, Xiangjie Kong(孔祥杰)1,‡, and Giovanni Pau3   

  1 College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310014, China;
    2 Hangzhou Dianzi University ITMO Joint Institute, Hangzhou Dianzi University, Hangzhou 310018, China;
    3 Faculty of Engineering and Architecture, Kore University of Enna, Italy
  • Received: 2024-11-18  Revised: 2025-02-21  Accepted: 2025-03-11  Online: 2025-05-15  Published: 2025-04-18
  • Corresponding author: Xiangjie Kong, E-mail: xjkong@ieee.org
  • Supported by:
    Project supported in part by the National Natural Science Foundation of China (Grant Nos. 62476247 and 62072409), the “Pioneer” and “Leading Goose” R&D Program of Zhejiang (Grant No. 2024C01214), and the Zhejiang Provincial Natural Science Foundation (Grant No. LR21F020003).


Abstract: The task of text-guided molecule generation aims to produce molecules that match a given natural-language description. Mainstream methods typically represent molecules with the simplified molecular-input line-entry system (SMILES) and rely on diffusion models or autoregressive architectures for modeling. However, SMILES admits many valid strings for the same molecule, and this one-to-many mapping forces existing methods to adopt complex model architectures and larger training datasets to improve performance, which reduces the efficiency of model training and generation. In this paper, we propose a text-guided diverse-expression diffusion (TGDD) model for molecule generation. TGDD combines SMILES and self-referencing embedded strings (SELFIES) into a novel diverse-expression molecular representation, enabling precise molecule mapping from natural language. By leveraging this representation, TGDD simplifies the segmented diffusion generation process, achieving faster training and lower memory consumption while aligning more strongly with natural language. TGDD outperforms both TGM-LDM and the autoregressive model MolT5-Base on most evaluation metrics.

Key words: molecule generation, diffusion model, AI for science
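The one-to-many mapping discussed in the abstract is easy to see at the string level: a single molecule can be written as many distinct SMILES strings. The toy parser below (an illustrative sketch, not part of TGDD and not a real SMILES parser; it only handles simple acyclic strings) counts heavy atoms and shows that three different SMILES strings for ethanol yield identical atom counts.

```python
from collections import Counter

def formula(smiles: str) -> Counter:
    """Count heavy atoms in a simple acyclic SMILES string.

    Toy parser: handles single-letter organic-subset atoms (C, N, O, ...)
    and two-letter Cl/Br; skips branch parentheses, bond symbols, and
    ring digits. Hydrogens are implicit and not counted.
    """
    counts = Counter()
    i = 0
    while i < len(smiles):
        if smiles[i:i + 2] in ("Cl", "Br"):   # two-letter halogens
            counts[smiles[i:i + 2]] += 1
            i += 2
        elif smiles[i].isalpha():             # single-letter atom symbol
            counts[smiles[i].upper()] += 1
            i += 1
        else:                                 # skip '(', ')', '=', '#', digits
            i += 1
    return counts

# Three distinct SMILES strings that all denote ethanol (C2H6O):
variants = ["CCO", "OCC", "C(O)C"]
formulas = [formula(s) for s in variants]
print(formulas[0] == formulas[1] == formulas[2])  # → True
```

In a real pipeline this redundancy is handled by a cheminformatics toolkit (canonicalization) or sidestepped by SELFIES, whose grammar guarantees every string decodes to a valid molecule; the sketch only makes the mapping diversity concrete.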

PACS:
  • 07.05.Kf (Data analysis: algorithms and implementation; data management)
  • 07.05.Mh (Neural networks, fuzzy logic, artificial intelligence)
  • 07.05.Tp (Computer modeling and simulation)