SPECIAL TOPIC — Machine learning in condensed matter physics
Traditional materials discovery relies on a ‘trial-and-error’ mode, leading to low efficiency, high cost, and unsustainable materials design. Meanwhile, numerous experimental and computational trials have accumulated enormous quantities of high-dimensional, complex data, which may bury critical ‘structure–property’ rules that unfortunately remain largely unexplored. Machine learning (ML), a burgeoning approach in materials science, can extract hidden structure–property relationships from materials big data and has therefore garnered much attention in the field. In this review, we briefly summarize recent research progress, following the ML paradigm: (i) data acquisition → (ii) feature engineering → (iii) algorithm → (iv) ML model → (v) model evaluation → (vi) application. In the application section, we organize recent work around the ‘materials science tetrahedron’: (i) structure and composition → (ii) property → (iii) synthesis → (iv) characterization, in order to reveal quantitative structure–property relationships and provide strategies for inverse design. In addition, the concurrent challenges of data quality and quantity, and of model interpretability and generalizability, are also discussed. This review intends to provide a preliminary overview of ML, from basic algorithms to applications.
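The ML paradigm outlined above can be sketched end to end with a toy example. This is a minimal, illustrative pipeline under assumed synthetic data (two hypothetical composition features and a made-up target property), using ordinary least squares as the simplest possible learner; it is not a method from the review, only a skeleton of steps (i)–(v).

```python
# Minimal sketch of the ML paradigm: (i) data acquisition ->
# (ii) feature engineering -> (iii/iv) algorithm & model -> (v) evaluation.
# All data and names here are illustrative, not taken from the review.
import numpy as np

rng = np.random.default_rng(0)

# (i) "Acquire" a toy dataset: two composition-like features x, property y.
x = rng.uniform(0.0, 1.0, size=(200, 2))
y = 3.0 * x[:, 0] - 1.5 * x[:, 1] ** 2 + rng.normal(0.0, 0.05, 200)

# (ii) Feature engineering: augment raw descriptors with a nonlinear term
# and a bias column.
X = np.column_stack([x[:, 0], x[:, 1], x[:, 1] ** 2, np.ones(len(x))])

# (iii/iv) Algorithm + model: ordinary least squares as the simplest learner.
train, test = slice(0, 150), slice(150, None)
w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)

# (v) Model evaluation on held-out data (root mean square error).
rmse = np.sqrt(np.mean((X[test] @ w - y[test]) ** 2))
print(f"test RMSE: {rmse:.3f}")
```

Step (vi), application, would then use the fitted model to screen or inverse-design candidate compositions.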
We train a neural network to identify impurities in experimental images obtained from scanning tunneling microscope (STM) measurements. The neural network is first trained on a large amount of simulated data, and the trained network is then applied to identify a set of experimental images taken at different voltages. We use a convolutional neural network to extract features from the images and also implement an attention mechanism to capture the correlations between images taken at different voltages. We note that the simulated data can capture the universal Friedel oscillation but cannot properly describe the non-universal short-range physics near an impurity, nor the noise in the experimental data. We emphasize that the key to this approach is to properly deal with these differences between simulated and experimental data. Here we show that even including uncorrelated white noise in the simulated data significantly improves the performance of the neural network on experimental data. To prevent the neural network from learning unphysical short-range features, we also develop a method to evaluate the confidence of the neural network's prediction on experimental data and add this confidence measure to the loss function. We show that adding this extra loss term also improves the performance on experimental data. Our research may inspire similar future applications of machine learning to experimental data analysis.
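The noise-augmentation step described above can be illustrated with a minimal sketch: generate a simulated map with a Friedel-like oscillation around an impurity, then add uncorrelated Gaussian white noise before training. The functional form, image size, and noise amplitude below are illustrative assumptions, not the paper's actual simulation.

```python
# Minimal sketch of white-noise augmentation of simulated STM-like maps.
# The oscillation model and parameters are illustrative assumptions only.
import numpy as np

def simulate_friedel_map(size=64, k_f=0.5):
    """Toy simulated map with a decaying Friedel-like oscillation
    ~ cos(2 k_F r) / r around an impurity at the center."""
    yy, xx = np.mgrid[0:size, 0:size]
    r = np.hypot(xx - size / 2, yy - size / 2) + 1e-6  # avoid division by zero
    return np.cos(2.0 * k_f * r) / r

def add_white_noise(image, amplitude=0.05, rng=None):
    """Augment a simulated image with uncorrelated Gaussian white noise,
    mimicking measurement noise in the experimental data."""
    rng = rng if rng is not None else np.random.default_rng()
    return image + rng.normal(0.0, amplitude, size=image.shape)

clean = simulate_friedel_map()
noisy = add_white_noise(clean, amplitude=0.05, rng=np.random.default_rng(1))
print(noisy.shape, float(np.std(noisy - clean)))
```

In the paper's workflow, such augmented images would feed a convolutional network (with attention across voltages); the confidence-based loss term is a separate modification to the training objective.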
In cluster science, it is challenging to identify the ground state structures (GSS) of gold (Au) clusters. Among the available search approaches, first-principles methods based on density functional theory (DFT) are the most reliable and precise. However, as the cluster size increases, the computational cost grows rapidly and the approach becomes impractical. In this paper, we develop an artificial neural network (ANN) potential for Au clusters, trained on the DFT binding energies and forces of 9000 AuN clusters (11 ≤ N ≤ 100). The root mean square errors of energy and force are 13.4 meV/atom and 0.4 eV/Å, respectively. We demonstrate that the ANN potential can distinguish the energy ordering of Au clusters and their isomers, and we highlight the need to further improve its accuracy. Given its excellent transferability, we emphasize that the ANN potential is a promising tool to break through the computational bottleneck of DFT and effectively accelerate the pre-screening of Au cluster GSS.
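The accuracy metrics quoted above are root mean square errors of per-atom energies (in meV/atom) and force components (in eV/Å) between ML predictions and DFT references. The sketch below computes these metrics on synthetic placeholder arrays whose error levels are chosen to roughly match the quoted figures; it is not the paper's Au-cluster data.

```python
# Sketch of the energy/force RMSE metrics used to validate an ML potential
# against DFT. All arrays are synthetic placeholders, not real Au data.
import numpy as np

def rmse(pred, ref):
    """Root mean square error between predicted and reference values."""
    pred, ref = np.asarray(pred, float), np.asarray(ref, float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

rng = np.random.default_rng(42)
n_structs, n_atoms = 100, 20

# Synthetic DFT references and "ANN" predictions with small errors.
e_dft = rng.normal(-3.0, 0.1, n_structs)               # eV/atom
e_ann = e_dft + rng.normal(0.0, 0.013, n_structs)      # ~13 meV/atom error
f_dft = rng.normal(0.0, 1.0, (n_structs, n_atoms, 3))  # eV/Angstrom
f_ann = f_dft + rng.normal(0.0, 0.4, f_dft.shape)      # ~0.4 eV/Angstrom error

e_rmse_mev = 1000.0 * rmse(e_ann, e_dft)   # meV/atom
f_rmse = rmse(f_ann, f_dft)                # eV/Angstrom
print(f"energy RMSE: {e_rmse_mev:.1f} meV/atom, force RMSE: {f_rmse:.2f} eV/A")
```

In practice, such metrics are evaluated on a held-out test set of DFT calculations not seen during training.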