1 Institute for Advanced Study, Tsinghua University, Beijing 100084, China 2 State Key Laboratory of Low Dimensional Quantum Physics, Department of Physics, Tsinghua University, Beijing 100084, China 3 Department of Physics and Beijing Key Laboratory of Opto-electronic Functional Materials and Micro-nano Devices, Renmin University of China, Beijing 100872, China 4 Frontier Science Center for Quantum Information, Beijing 100084, China 5 Department of Physics, University of California, San Diego, California 92093, USA
We train a neural network (NN) to identify impurities in experimental images obtained by scanning tunneling microscope (STM) measurements. The NN is first trained on a large number of simulated data, and the trained NN is then applied to identify a set of experimental images taken at different voltages. We use a convolutional neural network to extract features from the images, and we implement the attention mechanism to capture the correlations between images taken at different voltages. We note that the simulated data can capture the universal Friedel oscillation but cannot properly describe either the non-universal short-range physics near an impurity or the noise in the experimental data. We emphasize that the key to this approach is to properly deal with these differences between simulated and experimental data. Here we show that even including uncorrelated white noise in the simulated data can significantly improve the performance of the NN on experimental data. To prevent the NN from learning unphysical short-range physics, we also develop another method that evaluates the confidence of the NN prediction on experimental data and adds this confidence measure to the loss function. We show that adding such an extra loss term can also improve the performance on experimental data. Our research can inspire similar future applications of machine learning to experimental data analysis.
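The noise-augmentation step described above can be sketched as follows. This is a minimal illustration, not the paper's released code: the function name, the convention that η scales the noise relative to each image's standard deviation, and the array layout are all assumptions.

```python
import numpy as np

def add_gaussian_noise(images, eta=0.8, rng=None):
    """Augment simulated STM images with uncorrelated Gaussian noise.

    images : array of shape (batch, voltages, height, width).
    eta    : noise amplitude relative to each image's standard deviation
             (illustrative convention, not necessarily the paper's).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Per-image standard deviation, kept broadcastable against `images`.
    std = images.std(axis=(-2, -1), keepdims=True)
    noise = rng.normal(0.0, 1.0, size=images.shape) * eta * std
    return images + noise
```

Training on such augmented copies exposes the NN to the kind of pixel-level fluctuations present in real STM maps, which the clean simulations lack.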
Received: 02 July 2020
Revised: 07 August 2020
Accepted manuscript online: 14 October 2020
Fund: HZ is supported by Beijing Outstanding Scholar Program, the National Key Research and Development Program of China (Grant No. 2016YFA0301600), and the National Natural Science Foundation of China (Grant No. 11734010). YZY is supported by a startup fund from UCSD. PC is supported by the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China.
Ce Wang(王策), Haiwei Li(李海威), Zhenqi Hao(郝镇齐), Xintong Li(李昕彤), Changwei Zou(邹昌炜), Peng Cai(蔡鹏), Yayu Wang(王亚愚), Yi-Zhuang You(尤亦庄), and Hui Zhai(翟荟), Machine learning identification of impurities in the STM images, 2020 Chin. Phys. B 29 116805.
Fig. 1.
The structure of the NN. The voltage arrow indicates a sequence of images taken at different voltages. Each image is mapped to query, key, and value tensors. The attention mechanism then produces the normalized, attention-weighted values computed by Eq. (1). These values are summed over the sequence, and the output is obtained after a linear projection and a Softmax.
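The attention step in the caption can be sketched as below. The scaled dot-product form (with the 1/√d factor) follows the standard construction of Vaswani et al. [9]; whether the paper's Eq. (1) uses exactly this scaling is an assumption, and `attention_pool` is a hypothetical name.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(q, k, v):
    """Self-attention over per-voltage feature vectors.

    q, k, v : arrays of shape (n_voltages, d), one row per bias voltage,
              e.g. produced by a CNN encoder.
    Returns the attention-weighted values summed over the voltage
    sequence, mirroring the sum-then-project step in Fig. 1.
    """
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (n, n), rows sum to 1
    attended = weights @ v                            # (n, d)
    return attended.sum(axis=0)                       # (d,)
```

Because the weights couple every voltage to every other voltage, this pooling captures correlations between images taken at different biases that a per-image CNN alone would miss.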
Fig. 2.
Computation graph for the loss function. The NN is trained on the theoretical data via the regression loss and regularized by the experimental data via the prediction-confidence regularization.
Fig. 3.
Comparison between the prediction of the NN and a reference answer from another experiment. Red squares mark the NN predictions, determined by local maxima of the output layer generated by the NN. Black dots mark the reference answer from another experiment, determined by local maxima of the DoS of the bound states shown in the color plot. Here the NN is trained with the loss function including the confidence regularization (α = 0.03) and with training data containing η = 0.8 Gaussian noise.
Fig. 4.
Comparison of the performance of different approaches on the experimental data. The curves show the ratio of reference points that agree perfectly with the prediction to the total number of reference points. The blue and purple lines are given by the NN trained with the plain regression loss; the training data contain no noise for the blue line and include noise for the purple line. The yellow line is given by the NN trained with the loss function including the confidence regularization, on data without noise. Finally, the green line is given by the NN trained with the regularized loss function and data with noise. Here α = 0.03. The ratios are plotted as a function of the training epoch, and the results are averaged over ten independent training runs.
Fig. A1.
Experimental data on the Au(111) surface. (a) Typical dI/dV spectra measured on a defect (red curve) and on a defect-free area (black curve). The abrupt increase of the spectra above −520 mV shows the itinerant surface states on Au(111). The additional spectral feature on the black curve indicates the bound state of the defect below the surface band. (b) Differential conductance maps. Top panels: two representative dI/dV maps (at −20 mV and −175 mV) taken at 5 K with tunneling junction R = 5 GΩ, showing spatial oscillations from the quantum interference of surface states. Bottom panel: the conductance map at −520 mV, revealing the spatial distribution of the defect states.