Implementation and data reproduction of Boltzmann generators
Owing to the vanishingly small volume that equilibrium states occupy in configuration space, generating statistically independent samples of the equilibrium states of many-body systems has long been a central challenge in computational molecular science. This property forces molecular simulations (e.g. molecular dynamics (MD) or Monte Carlo (MC) simulations) to take small steps, making it nearly impossible to sample the region of interest comprehensively within a reasonable amount of time. Combining deep learning models with statistical mechanics, Noé et al. proposed Boltzmann generators, which circumvent this sampling issue by repacking the high-probability regions (i.e. the equilibrium states) of configuration space into a concentrated region of a latent space.
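The repacking described above is carried out by an invertible neural network built from coupling layers, which map between configuration space and latent space in both directions while tracking the Jacobian determinant needed for reweighting. The following is a minimal, untrained sketch of one RealNVP-style affine coupling layer; it uses NumPy with fixed random weights in place of a trained PyTorch network, and the class and variable names are illustrative, not taken from the original code.

```python
import numpy as np

rng = np.random.default_rng(0)

class AffineCoupling:
    """Illustrative RealNVP-style affine coupling layer (untrained weights).

    Splits the input in half; the first half passes through unchanged and
    conditions an affine transform of the second half, which makes the map
    exactly invertible with a cheap log-determinant.
    """

    def __init__(self, dim, hidden=16):
        self.d = dim // 2
        # Fixed random "network" weights, standing in for a trained MLP.
        self.W1 = rng.normal(scale=0.1, size=(self.d, hidden))
        self.W2 = rng.normal(scale=0.1, size=(hidden, 2 * (dim - self.d)))

    def _net(self, x1):
        h = np.tanh(x1 @ self.W1)
        s, t = np.split(h @ self.W2, 2, axis=-1)
        return np.tanh(s), t  # bounded log-scale s for numerical stability

    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        s, t = self._net(x1)
        z2 = x2 * np.exp(s) + t
        log_det = s.sum(axis=1)  # log |det J|, needed for reweighting
        return np.concatenate([x1, z2], axis=1), log_det

    def inverse(self, z):
        z1, z2 = z[:, :self.d], z[:, self.d:]
        s, t = self._net(z1)
        x2 = (z2 - t) * np.exp(-s)
        return np.concatenate([z1, x2], axis=1)

# Invertibility check: mapping x -> z -> x recovers the input exactly.
layer = AffineCoupling(dim=4)
x = rng.normal(size=(8, 4))
z, log_det = layer.forward(x)
x_rec = layer.inverse(z)
print(np.allclose(x, x_rec))  # True
```

A full Boltzmann generator stacks several such layers (alternating which half is transformed) and trains them with energy-based and likelihood-based losses; this sketch only shows the invertibility and Jacobian bookkeeping that make that training possible.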
For my final project in the course Computational Statistical Physics (PHYS 7810, taught by Dr. Matthew Glaser), at a time when the original project was still implemented in TensorFlow and Keras, a co-worker and I implemented Boltzmann generators from scratch using PyTorch. We applied the framework to several test systems and were able to reproduce most of the data presented in the original paper published in Science. Before this project I was not familiar with even the simplest neural networks, but it gave me a chance to invest real time and effort in familiarizing myself with deep learning. It also cultivated my interest in applying deep learning to computational molecular science in general, which later became the field I most wanted to dive into during my postdoc.