SPECIAL TOPIC — Machine learning in statistical physics

    Restricted Boltzmann machine: Recent advances and mean-field theory
    Aurélien Decelle, Cyril Furtlehner
    Chin. Phys. B, 2021, 30 (4): 040202.   DOI: 10.1088/1674-1056/abd160

    This review deals with the restricted Boltzmann machine (RBM) in the light of statistical physics. The RBM is a classical family of machine-learning (ML) models which played a central role in the development of deep learning. Viewing it as a spin glass model and exhibiting various links with other models of statistical physics, we gather recent results dealing with mean-field theory in this context. First, the functioning of the RBM can be analyzed via the phase diagrams obtained for various statistical ensembles of RBMs, leading in particular to the identification of a compositional phase where a small number of features or modes are combined to form complex patterns. Then we discuss recent works that either devise mean-field-based learning algorithms, or reproduce generic aspects of the learning process from ensemble dynamics equations and/or linear stability arguments.
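
    To make the spin-glass picture concrete, the sketch below writes down the standard RBM energy for binary units and one contrastive-divergence (CD-1) training step in NumPy. This is a minimal illustration of the model family the review analyzes, not code from the review itself; the learning rate, seed, and variable names are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)  # fixed seed for reproducibility (illustrative choice)

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def energy(v, h, W, a, b):
            # Spin-glass-like RBM energy: E(v, h) = -a.v - b.h - v.W.h
            return -(a @ v) - (b @ h) - v @ W @ h

        def cd1_step(V, W, a, b, lr=0.05):
            # One contrastive-divergence (CD-1) update on a minibatch V of binary visible units.
            # Positive phase: hidden activations conditioned on the data.
            ph = sigmoid(V @ W + b)
            h = (rng.random(ph.shape) < ph).astype(float)
            # Negative phase: one Gibbs step back to the visibles and up again.
            pv = sigmoid(h @ W.T + a)
            v_neg = (rng.random(pv.shape) < pv).astype(float)
            ph_neg = sigmoid(v_neg @ W + b)
            # Gradient: data correlations minus (one-step) model correlations.
            n = V.shape[0]
            W += lr * (V.T @ ph - v_neg.T @ ph_neg) / n
            a += lr * (V - v_neg).mean(axis=0)
            b += lr * (ph - ph_neg).mean(axis=0)
            return W, a, b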

    Inverse Ising techniques to infer underlying mechanisms from data
    Hong-Li Zeng(曾红丽), Erik Aurell
    Chin. Phys. B, 2020, 29 (8): 080201.   DOI: 10.1088/1674-1056/ab8da6

    As a problem in data science, the inverse Ising (or Potts) problem is to infer the parameters of the Gibbs-Boltzmann distribution of an Ising (or Potts) model from samples drawn from that distribution. The algorithmic and computational interest stems from the fact that this inference task cannot be carried out efficiently by the maximum-likelihood criterion, since the normalizing constant of the distribution (the partition function) cannot be calculated exactly and efficiently. The practical interest, on the other hand, flows from several outstanding applications, of which the best known has been predicting spatial contacts in protein structures from tables of homologous protein sequences. Most applications to date have been to data produced by a dynamical process which, as far as is known, cannot be expected to satisfy detailed balance. There is therefore no a priori reason to expect the distribution to be of the Gibbs-Boltzmann type, and no a priori reason to expect that inverse Ising (or Potts) techniques should yield useful information. In this review we discuss two types of problems where progress can nevertheless be made. We find that, depending on model parameters, there are phases where the distribution is in fact close to a Gibbs-Boltzmann distribution, the non-equilibrium nature of the underlying dynamics notwithstanding. We also discuss the relation between inferred Ising model parameters and the parameters of the underlying dynamics.
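
    As an illustration of how the intractable partition function can be sidestepped, the sketch below infers Ising couplings by pseudo-likelihood maximization, one standard inverse-Ising technique: the conditional distribution of a single spin given the others is tractable, so each spin contributes a logistic-regression-like objective. The gradient-ascent loop, learning rate, and symmetrization step are illustrative assumptions, not a specific algorithm from the review.

        import numpy as np

        def pseudolikelihood_grad(S, J, h):
            # Gradient of the log-pseudo-likelihood for +/-1 samples S of shape (n, N).
            # Uses log P(s_i | s_-i) = s_i * H_i - log(2 cosh H_i), H_i = h_i + sum_j J_ij s_j,
            # which avoids the partition function of the full likelihood.
            n, _ = S.shape
            H = S @ J.T + h               # local fields (J has zero diagonal)
            m = np.tanh(H)                # conditional magnetizations <s_i | s_-i>
            dh = (S - m).mean(axis=0)
            dJ = (S - m).T @ S / n
            np.fill_diagonal(dJ, 0.0)     # no self-couplings
            return dJ, dh

        def infer_ising(S, lr=0.1, steps=500):
            # Plain gradient ascent on the pseudo-likelihood (a minimal sketch).
            _, N = S.shape
            J, h = np.zeros((N, N)), np.zeros(N)
            for _ in range(steps):
                dJ, dh = pseudolikelihood_grad(S, J, h)
                J += lr * dJ
                J = 0.5 * (J + J.T)       # keep inferred couplings symmetric
                h += lr * dh
            return J, h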

    Relationship between manifold smoothness and adversarial vulnerability in deep learning with local errors
    Zijian Jiang(蒋子健), Jianwen Zhou(周健文), and Haiping Huang(黄海平)
    Chin. Phys. B, 2021, 30 (4): 048702.   DOI: 10.1088/1674-1056/abd68e
    Artificial neural networks can achieve impressive performance, and even outperform humans in some specific tasks. Nevertheless, unlike biological brains, artificial neural networks suffer from tiny perturbations of the sensory input under various kinds of adversarial attacks. It is therefore necessary to study the origin of this adversarial vulnerability. Here, we establish a fundamental relationship between the geometry of hidden representations (the manifold perspective) and the generalization capability of deep networks. For this purpose, we choose a deep neural network trained with local errors, and then analyze emergent properties of the trained network through its manifold dimensionality, manifold smoothness, and generalization capability. To explore the effects of adversarial examples, we consider independent Gaussian noise attacks and fast-gradient-sign-method (FGSM) attacks. Our study reveals that high generalization accuracy requires a relatively fast power-law decay of the eigen-spectrum of hidden representations. Under Gaussian attacks, the relationship between generalization accuracy and the power-law exponent is monotonic, while a non-monotonic behavior is observed for FGSM attacks. Our empirical study provides a route towards a mechanistic interpretation of adversarial vulnerability.
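
    For reference, here is a minimal sketch of the two attacks compared above, applied to a toy logistic-regression classifier rather than the deep network studied in the paper (an illustrative assumption): FGSM perturbs the input along the sign of the loss gradient, while the Gaussian attack adds isotropic noise of the same scale.

        import numpy as np

        def fgsm_attack(x, y, w, b, eps=0.1):
            # FGSM on logistic regression: for p = sigmoid(w.x + b) and cross-entropy
            # loss L, dL/dx = (p - y) * w, so x_adv = x + eps * sign(dL/dx).
            p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
            grad_x = (p - y) * w
            return x + eps * np.sign(grad_x)

        def gaussian_attack(x, eps=0.1, rng=np.random.default_rng(0)):
            # Independent Gaussian noise attack of comparable magnitude.
            return x + eps * rng.standard_normal(x.shape)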