Chinese Physics B, 2024, Vol. 33, Issue (5): 050301. doi: 10.1088/1674-1056/ad3061

Special topic: Quantum computing and quantum sensing

Quafu-RL: The cloud quantum computers based quantum reinforcement learning

Yu-Xin Jin(靳羽欣)1,7,†, Hong-Ze Xu(许宏泽)1,†, Zheng-An Wang(王正安)1, Wei-Feng Zhuang(庄伟峰)1, Kai-Xuan Huang(黄凯旋)1, Yun-Hao Shi(时运豪)3,5,6, Wei-Guo Ma(马卫国)3,5,6, Tian-Ming Li(李天铭)3,5,6, Chi-Tong Chen(陈驰通)3,5,6, Kai Xu(许凯)3,1, Yu-Long Feng(冯玉龙)1, Pei Liu(刘培)1, Mo Chen(陈墨)1, Shang-Shu Li(李尚书)3,5,6, Zhi-Peng Yang(杨智鹏)1, Chen Qian(钱辰)1, Yun-Heng Ma(马运恒)1, Xiao Xiao(肖骁)1, Peng Qian(钱鹏)1, Yanwu Gu(顾炎武)1, Xu-Dan Chai(柴绪丹)1, Ya-Nan Pu(普亚南)1, Yi-Peng Zhang(张翼鹏)1, Shi-Jie Wei(魏世杰)1, Jin-Feng Zeng(曾进峰)1, Hang Li(李行)1, Gui-Lu Long(龙桂鲁)2,1, Yirong Jin(金贻荣)1, Haifeng Yu(于海峰)1, Heng Fan(范桁)3,1,5,6, Dong E. Liu(刘东)2,1,4, and Meng-Jun Hu(胡孟军)1,‡   

    1 Beijing Academy of Quantum Information Sciences, Beijing 100193, China;
    2 State Key Laboratory of Low Dimensional Quantum Physics, Department of Physics, Tsinghua University, Beijing 100084, China;
    3 Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China;
    4 Frontier Science Center for Quantum Information, Beijing 100184, China;
    5 School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100190, China;
    6 CAS Center for Excellence in Topological Quantum Computation, University of Chinese Academy of Sciences, Beijing 100190, China;
    7 School of Mathematical Sciences, Nankai University, Tianjin 300071, China
  • Received: 2023-12-23 Revised: 2024-03-03 Accepted: 2024-03-06 Online: 2024-05-20 Published: 2024-05-20
  • Contact: Meng-Jun Hu E-mail: humj@baqis.ac.cn
  • Supported by:
    This work is supported by the Beijing Academy of Quantum Information Sciences. Haifeng Yu, Meng-Jun Hu and Wei-Feng Zhuang are supported by the National Natural Science Foundation of China (Grant No. 92365206). Hong-Ze Xu acknowledges the support of the China Postdoctoral Science Foundation (Certificate Number: 2023M740272). Zheng-An Wang is supported by the National Natural Science Foundation of China (Grant No. 12247168) and China Postdoctoral Science Foundation (Certificate Number: 2022TQ0036).

Abstract: With the rapid advancement of quantum computing, hybrid quantum-classical machine learning has shown numerous potential applications and is expected to be achievable in the noisy intermediate-scale quantum (NISQ) era. Quantum reinforcement learning, an important line of research in this field, has recently been shown to solve standard benchmark environments with formally provable theoretical advantages over classical counterparts. However, despite the progress of quantum processors and the emergence of quantum computing clouds, quantum reinforcement learning algorithms based on parameterized quantum circuits (PQCs) are still rarely implemented on NISQ devices. In this work, we take the first step towards executing benchmark quantum reinforcement learning problems on real devices with up to 136 qubits on the BAQIS Quafu quantum computing cloud. The experimental results demonstrate that the policy agents can successfully accomplish objectives under modified conditions in both the training and inference phases. Moreover, we design hardware-efficient PQC architectures for the quantum model using a multi-objective evolutionary algorithm and develop a learning algorithm that is adaptable to quantum devices. We hope that Quafu-RL can serve as a guiding example of how to realize machine learning tasks by taking advantage of quantum computers on a quantum cloud platform.

Keywords: quantum cloud platform, quantum reinforcement learning, evolutionary quantum architecture search

Classification (PACS):

  • 03.67.Lx (Quantum computation architectures and implementations)
  • 03.67.Ac (Quantum algorithms, protocols, and simulations)