Abstract For training feed-forward threshold neural networks built from nondifferentiable activation functions, noise injection yields a stochastic resonance based threshold network that can be optimized by various gradient-based optimizers. The injected noise extends the noise level into the parameter space of the designed threshold network, but it also leads to a highly non-convex optimization landscape of the loss function. Thus, the on-line learning of hyperparameters, i.e., the network weights and noise levels, becomes challenging. It is shown that the Adam optimizer, an adaptive variant of stochastic gradient descent, manifests superior ability in training the stochastic resonance based threshold network effectively. Experimental results demonstrate that the designed threshold network trained by the Adam optimizer achieves significantly improved performance in function approximation and image classification.
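To make the idea concrete, below is a minimal sketch, assuming Gaussian injected noise and Heaviside threshold units (the abstract does not specify either): averaging the hard threshold over additive noise n ~ N(0, sigma^2) gives the smooth expectation E[step(z + n)] = Phi(z / sigma), which is differentiable in both the pre-activation z and the noise level sigma, so Adam can update the network weights and the noise level jointly. The layer name SRThresholdLayer, the toy target function, and all hyperparameter values are illustrative, not the authors' implementation.

```python
# A minimal sketch (not the paper's code) of a stochastic resonance based
# threshold network: during training, each hard threshold is replaced by its
# noise-smoothed expectation, so the weights and the noise level sigma are
# both ordinary differentiable parameters that Adam can optimize.
import torch
import torch.nn as nn

class SRThresholdLayer(nn.Module):  # hypothetical name
    """Threshold units smoothed by additive Gaussian noise.

    Assumed form: with n ~ N(0, sigma^2), E[step(z + n)] = Phi(z / sigma),
    where Phi is the standard normal CDF; this is smooth in z and sigma.
    """
    def __init__(self, in_features, out_features, sigma0=1.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Log-parameterize the noise level so sigma stays positive.
        self.log_sigma = nn.Parameter(torch.tensor(float(sigma0)).log())

    def forward(self, x):
        z = self.linear(x)
        sigma = self.log_sigma.exp()
        # Smoothed threshold: Phi(z / sigma) = 0.5 * (1 + erf(z / (sigma * sqrt(2))))
        return 0.5 * (1.0 + torch.erf(z / (sigma * 2.0 ** 0.5)))

# Toy function-approximation task with illustrative settings.
net = nn.Sequential(SRThresholdLayer(1, 32), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)  # adaptive SGD variant

x = torch.linspace(-1.0, 1.0, 256).unsqueeze(1)
y = torch.sin(3.0 * x)  # toy target function
for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward()  # gradients flow through the noise level sigma as well
    opt.step()
```

The log-parameterization of sigma is one simple way to keep the noise level positive on the non-convex loss landscape; it is a design choice of this sketch, not something stated in the abstract.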
Weijin Li (李伟进), Yuhao Ren (任昱昊), and Fabing Duan (段法兵), Hyperparameter on-line learning of stochastic resonance based threshold networks, 2022 Chin. Phys. B 31 080503