Chin. Phys. B, 2015, Vol. 24(9): 090504    DOI: 10.1088/1674-1056/24/9/090504
GENERAL

Off-policy integral reinforcement learning optimal tracking control for continuous-time chaotic systems

Wei Qing-Lai (魏庆来)a, Song Rui-Zhuo (宋睿卓)b, Sun Qiu-Ye (孙秋野)c, Xiao Wen-Dong (肖文栋)b
a The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;
b School of Automation and Electrical Engineering, University of Science and Technology, Beijing 100083, China;
c School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
Abstract  

This paper presents an off-policy integral reinforcement learning (IRL) algorithm for obtaining the optimal tracking control of unknown chaotic systems. Off-policy IRL can learn the solution of the Hamilton-Jacobi-Bellman (HJB) equation from system data generated by an arbitrary control. Moreover, off-policy IRL can be regarded as a direct learning method, which avoids the identification of the system dynamics. The performance index function is first defined in terms of the system tracking error and the control error. An off-policy IRL algorithm is then proposed to solve the HJB equation. It is proven that the iterative control makes the tracking error system asymptotically stable and that the iterative performance index function converges. A simulation study demonstrates the effectiveness of the developed tracking control method.
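
As a sketch of the quantities the abstract refers to, the block below writes out the quadratic tracking performance index, the associated HJB equation, and the off-policy IRL relation in the form commonly used in the adaptive dynamic programming literature. The symbols $e$ (tracking error), $u_e$ (control error, also the behavior control that generates the data), $u_e^{(i)}$ (iterative control), $f$ and $g$ (input-affine error dynamics), $Q$ and $R$ (positive definite weights), and $T$ (integration interval) are assumptions introduced here for illustration; the abstract does not state the paper's exact formulation.

```latex
\begin{align}
  % Assumed input-affine tracking error dynamics
  \dot{e} &= f(e) + g(e)\,u_e, \\
  % Quadratic performance index in tracking error and control error
  J(e(t)) &= \int_t^{\infty} \left( e^{\top} Q e + u_e^{\top} R u_e \right) \mathrm{d}\tau, \\
  % HJB equation for the optimal performance index J^*
  0 &= \min_{u_e} \left[ e^{\top} Q e + u_e^{\top} R u_e
       + (\nabla J^{*})^{\top} \left( f(e) + g(e)\,u_e \right) \right], \\
  % Optimal control obtained from the stationarity condition
  u_e^{*} &= -\tfrac{1}{2} R^{-1} g(e)^{\top} \nabla J^{*}, \\
  % Off-policy IRL relation: data come from an arbitrary u_e,
  % while J^{(i)} and u_e^{(i+1)} are the unknowns being learned
  J^{(i)}(e(t+T)) - J^{(i)}(e(t)) &=
       -\int_t^{t+T} \left( e^{\top} Q e + u_e^{(i)\top} R u_e^{(i)} \right) \mathrm{d}\tau
       - 2 \int_t^{t+T} u_e^{(i+1)\top} R \left( u_e - u_e^{(i)} \right) \mathrm{d}\tau .
\end{align}
```

Under these assumptions, neither $f$ nor $g$ appears in the last relation, which is the sense in which off-policy IRL avoids identifying the system dynamics.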

Keywords:  adaptive dynamic programming      approximate dynamic programming      chaotic system      optimal tracking control  
Received:  18 December 2014      Revised:  28 March 2015
PACS:  05.45.Gg (Control of chaos, applications of chaos)  
Fund: 

Project supported by the National Natural Science Foundation of China (Grant Nos. 61304079 and 61374105), the Beijing Natural Science Foundation, China (Grant Nos. 4132078 and 4143065), the China Postdoctoral Science Foundation (Grant No. 2013M530527), the Fundamental Research Funds for the Central Universities, China (Grant No. FRF-TP-14-119A2), and the Open Research Project from State Key Laboratory of Management and Control for Complex Systems, China (Grant No. 20150104).

Corresponding Authors:  Song Rui-Zhuo     E-mail:  ruizhuosong@ustb.edu.cn

Cite this article: 

Wei Qing-Lai (魏庆来), Song Rui-Zhuo (宋睿卓), Sun Qiu-Ye (孙秋野), Xiao Wen-Dong (肖文栋). Off-policy integral reinforcement learning optimal tracking control for continuous-time chaotic systems. 2015 Chin. Phys. B 24 090504
