A novel stable value iteration-based approximate dynamic programming algorithm for discrete-time nonlinear systems
Yan-Hua Qu(曲延华), An-Na Wang(王安娜), Sheng Lin(林盛)
College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
Abstract The convergence and stability of a value-iteration-based adaptive dynamic programming (ADP) algorithm are considered for discrete-time nonlinear systems with a discounted quadratic performance index. Beyond achieving a good approximation structure, the iterative feedback control law must guarantee closed-loop stability. First, it is proved that the iterative value function sequence converges exactly to the optimum. Second, the necessary and sufficient condition for the optimal value function to serve as a Lyapunov function is investigated. We prove that, for the infinite-horizon case, there exists a finite horizon length for which the iterative feedback control law provides stability, which increases the practicability of the proposed value iteration algorithm. Neural networks (NNs) are employed to approximate the value functions and the optimal feedback control laws, which allows the algorithm to be implemented without knowledge of the internal dynamics of the system. Finally, a simulation example is employed to demonstrate the effectiveness of the developed optimal control method.
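To make the two claims of the abstract concrete (convergence of the iterative value functions, and stability of the iterative control law after finitely many iterations), the following is a minimal sketch on a toy problem. It is not the paper's algorithm or simulation example: it assumes a scalar linear system x[k+1] = a*x[k] + b*u[k] with known dynamics, uses no neural networks, and relies on the fact that each iterate then has the closed form V_i(x) = p_i*x^2. The function names and all parameter values are hypothetical.

```python
def value_iteration(a, b, q, r, gamma, n_iter):
    """Value iteration for x[k+1] = a*x[k] + b*u[k] (assumed toy system)
    with discounted stage cost gamma^k * (q*x^2 + r*u^2).

    Each iterate has the form V_i(x) = p_i * x^2, so only p_i is tracked.
    Returns a list of (p_i, gain_i, closed_loop_i) for i = 0 .. n_iter,
    where the law derived from V_i is u = -gain_i * x.
    """
    history = []
    p = 0.0  # V_0(x) = 0
    for _ in range(n_iter + 1):
        # Minimizer of r*u^2 + gamma*p*(a*x + b*u)^2 over u
        gain = gamma * p * a * b / (r + gamma * p * b * b)
        closed_loop = a - b * gain
        history.append((p, gain, closed_loop))
        # Bellman backup: V_{i+1}(x) = q*x^2 + r*u^2 + gamma*V_i(a*x + b*u)
        p = q + r * gain ** 2 + gamma * p * closed_loop ** 2
    return history


def first_stabilizing_iteration(history):
    """Index of the first iterate with |a - b*gain| < 1, or None."""
    for i, (_, _, closed_loop) in enumerate(history):
        if abs(closed_loop) < 1.0:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical parameters: open loop is unstable since |a| > 1.
    hist = value_iteration(a=1.2, b=1.0, q=1.0, r=1.0, gamma=0.95, n_iter=50)
    print("first stabilizing iteration:", first_stabilizing_iteration(hist))
    print("final p:", hist[-1][0])
```

With these assumed parameters the law at iteration 0 is u = 0, which leaves the unstable open loop unchanged, while the law derived from the first nontrivial iterate is already stabilizing. The sequence p_i is nondecreasing and settles at the fixed point of the backup, which mirrors the convergence statement for this special case only.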
Received: 04 July 2017
Revised: 11 October 2017
Accepted manuscript online:
PACS:
02.60.Gf (Algorithms for functional approximation)
02.30.Jr (Partial differential equations)
02.30.Yy (Control theory)
Corresponding Authors:
Yan-Hua Qu
E-mail: quyanhuawang@sina.com
Cite this article:
Yan-Hua Qu(曲延华), An-Na Wang(王安娜), Sheng Lin(林盛). A novel stable value iteration-based approximate dynamic programming algorithm for discrete-time nonlinear systems. 2018 Chin. Phys. B 27 010203