Hyperparameter on-line learning of stochastic resonance based threshold networks
Weijin Li(李伟进), Yuhao Ren(任昱昊), and Fabing Duan(段法兵)†
College of Automation, Qingdao University, Qingdao 266071, China

Abstract: To train a feed-forward threshold neural network built from nondifferentiable activation functions, noise injection is used to form a stochastic resonance based threshold network that can be optimized by various gradient-based optimizers. Injecting noise extends the parameter space of the designed threshold network to include the noise level, but it also leads to a highly non-convex optimization landscape of the loss function. The hyperparameter on-line learning procedure with respect to network weights and noise levels therefore becomes challenging. It is shown that the Adam optimizer, an adaptive variant of stochastic gradient descent, exhibits superior learning ability and trains the stochastic resonance based threshold network effectively. Experimental results demonstrate a significant improvement in the performance of the designed threshold network trained by the Adam optimizer for function approximation and image classification.
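The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture or code. It assumes a single 0/1 threshold unit, zero-mean Gaussian injected noise with standard deviation `sigma`, and the expected unit output as a smooth surrogate; the toy data and the helper names (`expected_threshold`, `adam_step`, `train`) are hypothetical. Under these assumptions the mean output is the Gaussian CDF of `u / sigma`, which is differentiable in both the input `u` and the noise level, so Adam can update the weight, bias, and noise level together.

```python
import math
import random


def expected_threshold(u, sigma):
    """Mean output of a 0/1 threshold unit whose input u receives N(0, sigma^2) noise."""
    return 0.5 * (1.0 + math.erf(u / (sigma * math.sqrt(2.0))))


def expected_threshold_grads(u, sigma):
    """Partial derivatives of expected_threshold with respect to u and sigma."""
    pdf = math.exp(-0.5 * (u / sigma) ** 2) / math.sqrt(2.0 * math.pi)
    return pdf / sigma, -u * pdf / sigma ** 2


def adam_step(params, grads, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; m and v are updated in place, new parameters are returned."""
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = b1 * m[i] + (1.0 - b1) * g          # first-moment estimate
        v[i] = b2 * v[i] + (1.0 - b2) * g * g      # second-moment estimate
        m_hat = m[i] / (1.0 - b1 ** t)             # bias correction
        v_hat = v[i] / (1.0 - b2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


def mse_and_grads(params, data):
    """Mean squared error of the surrogate unit and its gradient in (w, b, sigma)."""
    w, b, sigma = params
    loss, gw, gb, gs = 0.0, 0.0, 0.0, 0.0
    for x, y in data:
        u = w * x + b
        err = expected_threshold(u, sigma) - y
        d_u, d_sigma = expected_threshold_grads(u, sigma)
        loss += err * err
        gw += 2.0 * err * d_u * x
        gb += 2.0 * err * d_u
        gs += 2.0 * err * d_sigma
    n = len(data)
    return loss / n, [gw / n, gb / n, gs / n]


def train(steps=1000, seed=0):
    """Fit weight, bias, and noise level jointly on a hypothetical toy target."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(64)]
    data = [(x, expected_threshold(2.0 * x - 0.5, 1.5)) for x in xs]
    params = [0.5, 0.0, 1.0]                       # initial w, b, sigma (assumed)
    m, v = [0.0] * 3, [0.0] * 3
    first_loss, _ = mse_and_grads(params, data)
    for t in range(1, steps + 1):
        _, grads = mse_and_grads(params, data)
        params = adam_step(params, grads, m, v, t)
        params[2] = max(params[2], 1e-3)           # keep the noise level positive
    final_loss, _ = mse_and_grads(params, data)
    return first_loss, final_loss, params


if __name__ == "__main__":
    print(train())
```

In this toy setting only the ratios `w / sigma` and `b / sigma` affect the output, so the fitted values are not unique. The sketch is meant only to show that the noise level can be treated as a trainable quantity alongside the weights.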
Received: 06 January 2022
Revised: 16 February 2022
Accepted manuscript online: 25 February 2022
PACS: 05.40.Ca (Noise); 02.50.-r (Probability theory, stochastic processes, and statistics); 84.35.+i (Neural networks)
Fund: Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF051).
Corresponding author: Fabing Duan, E-mail: fabingduan@qdu.edu.cn
Cite this article: Weijin Li(李伟进), Yuhao Ren(任昱昊), and Fabing Duan(段法兵), Hyperparameter on-line learning of stochastic resonance based threshold networks, 2022 Chin. Phys. B 31 080503