1 Institute for Advanced Study, Tsinghua University, Beijing 100084, China 2 State Key Laboratory of Low Dimensional Quantum Physics, Department of Physics, Tsinghua University, Beijing 100084, China 3 Department of Physics and Beijing Key Laboratory of Opto-electronic Functional Materials and Micro-nano Devices, Renmin University of China, Beijing 100872, China 4 Frontier Science Center for Quantum Information, Beijing 100084, China 5 Department of Physics, University of California, San Diego, California 92093, USA
We train a neural network (NN) to identify impurities in experimental images obtained by scanning tunneling microscope (STM) measurements. The NN is first trained on a large number of simulated data, and the trained NN is then applied to identify a set of experimental images taken at different voltages. We use a convolutional neural network to extract features from the images, and we implement the attention mechanism to capture the correlations between images taken at different voltages. We note that the simulated data can capture the universal Friedel oscillation but cannot properly describe either the non-universal short-range physics near an impurity or the noise in the experimental data. We emphasize that the key to this approach is to properly deal with these differences between simulated and experimental data. Here we show that even including uncorrelated white noise in the simulated data can significantly improve the performance of the NN on experimental data. To prevent the NN from learning unphysical short-range physics, we also develop another method that evaluates the confidence of the NN prediction on experimental data and adds this confidence measure to the loss function. We show that adding such an extra loss term can also improve the performance on experimental data. Our research can inspire similar future applications of machine learning to experimental data analysis.
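The noise-augmentation step described above can be sketched as follows. This is a minimal illustration, not the paper's released code: the function name, the convention that η scales the noise relative to each image's standard deviation, and the array layout are all assumptions.

```python
import numpy as np

def add_gaussian_noise(images, eta=0.8, rng=None):
    """Augment simulated STM images with uncorrelated Gaussian noise.

    images : array of shape (batch, voltages, height, width).
    eta    : noise amplitude relative to each image's standard deviation
             (illustrative convention, not necessarily the paper's).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Per-image standard deviation, kept broadcastable against `images`.
    std = images.std(axis=(-2, -1), keepdims=True)
    noise = rng.normal(0.0, 1.0, size=images.shape) * eta * std
    return images + noise
```

Training on such augmented copies exposes the NN to the kind of pixel-level fluctuations present in real STM maps, which the clean simulations lack.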
Received: 02 July 2020
Revised: 07 August 2020
Accepted manuscript online: 14 October 2020
Fund: HZ is supported by Beijing Outstanding Scholar Program, the National Key Research and Development Program of China (Grant No. 2016YFA0301600), and the National Natural Science Foundation of China (Grant No. 11734010). YZY is supported by a startup fund from UCSD. PC is supported by the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China.
Ce Wang(王策), Haiwei Li(李海威), Zhenqi Hao(郝镇齐), Xintong Li(李昕彤), Changwei Zou(邹昌炜), Peng Cai(蔡鹏), Yayu Wang(王亚愚), Yi-Zhuang You(尤亦庄), and Hui Zhai(翟荟), Machine learning identification of impurities in the STM images, 2020 Chin. Phys. B 29 116805.
Fig. 1.
The structure of the NN. The voltage arrow indicates a sequence of images taken at different voltages. Each image is mapped to query, key, and value tensors. The attention mechanism then produces the normalized, attention-weighted values computed by Eq. (1). These values are summed over the sequence, and the output is obtained after a linear projection and a Softmax.
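The attention step in the caption can be sketched as below. The scaled dot-product form (with the 1/√d factor) follows the standard construction of Vaswani et al. [9]; whether the paper's Eq. (1) uses exactly this scaling is an assumption, and `attention_pool` is a hypothetical name.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(q, k, v):
    """Self-attention over per-voltage feature vectors.

    q, k, v : arrays of shape (n_voltages, d), one row per bias voltage,
              e.g. produced by a CNN encoder.
    Returns the attention-weighted values summed over the voltage
    sequence, mirroring the sum-then-project step in Fig. 1.
    """
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (n, n), rows sum to 1
    attended = weights @ v                            # (n, d)
    return attended.sum(axis=0)                       # (d,)
```

Because the weights couple every voltage to every other voltage, this pooling captures correlations between images taken at different biases that a per-image CNN alone would miss.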
Fig. 2.
Computation graph for the loss function. The NN is trained on the theoretical data via the regression loss and regularized by the experimental data via the prediction-confidence regularization.
Fig. 3.
Comparison between the prediction of the NN and a reference answer from another experiment. Red squares mark the NN predictions, determined by local maxima of the output layer generated by the NN. Black dots mark the reference answer from another experiment, determined by local maxima of the DoS of the bound states shown in the color plot. Here the NN is trained with the loss function including the confidence regularization (α = 0.03) and with training data containing η = 0.8 Gaussian noise.
Fig. 4.
Comparison of the performance of different approaches on the experimental data. The curves show the ratio of reference points that agree perfectly with the prediction to the total number of reference points. The blue and purple lines are given by the NN trained with the plain regression loss; the training data contain no noise for the blue line and include noise for the purple line. The yellow line is given by the NN trained with the loss function including the confidence regularization, on data without noise. Finally, the green line is given by the NN trained with the regularized loss function and data with noise. Here α = 0.03. The ratios are plotted as a function of the training epoch, and the results are averaged over ten independent training runs.
Fig. A1.
Experimental data on the Au(111) surface. (a) Typical dI/dV spectra measured on a defect (red curve) and on a defect-free area (black curve). The abrupt increase of the spectra above −520 mV shows the itinerant surface states on Au(111). The additional spectral feature on the black curve indicates the bound state of the defect below the surface band. (b) Differential conductance maps. Top panels: two representative dI/dV maps (at −20 mV and −175 mV) taken at 5 K with tunneling junction R = 5 GΩ, showing spatial oscillations from the quantum interference of surface states. Bottom panel: the conductance map at −520 mV, revealing the spatial distribution of the defect states.