Efficient embedding of Machine Learning potentials for biomolecular simulations

Kirill Zinovjev (UV) — Efficient embedding of Machine Learning potentials for biomolecular simulations



💡 A new RES Success Story about improving efficiency in computational chemistry simulations💡

📋 "Efficient embedding of Machine Learning potentials for biomolecular simulations" led by Kirill Zinovjev

Simulating molecules in complex environments, such as solvents, membranes or protein binding sites, is a fundamental challenge in computational chemistry due to the enormous size of the simulation systems, which can easily reach 100,000 atoms.

Traditional quantum mechanics/molecular mechanics (QM/MM) approaches, for example with Density Functional Theory (DFT) as the QM method, make it possible to simulate these systems by treating most of the atoms with a simple model while applying a more accurate QM description to the small part of the system where high accuracy is required. However, this approach is often limited by its high computational cost, creating a need for new methods that achieve similar accuracy at a significantly lower cost.

The research team, in collaboration with Marc van der Kamp's team and the ACRC Research Software team from the University of Bristol, developed the EMLE-engine package, which implements a novel electrostatic machine learning embedding (EMLE) model. This model replaces QM methods with Machine Learning (ML) in QM/MM approaches, significantly reducing computational effort without sacrificing precision.

🖥️ Thanks to the RES supercomputers #Tirant and #MareNostrum5, hosted at Universitat de València and Barcelona Supercomputing Center respectively, the team could train the EMLE model to simulate alanine dipeptide, a benchmark system in computational chemistry, and run traditional QM/MM on the same system to compare and evaluate the accuracy of the new model. The availability of both CPU and GPU resources was essential for running both kinds of simulation and demonstrating the efficiency gains of the ML-enhanced simulations.


 

📸 The left panel shows the idea of EMLE as the layer between an isolated molecule described by an ML potential and the solvent described by a cheap force field.

📸 The right panel shows the free energy landscape of the alanine dipeptide obtained by DFT/MM and EMLE/MM. The two landscapes are virtually indistinguishable, while the computational cost is about 4 orders of magnitude lower with the EMLE model (3 min/step vs. 15 ms/step).
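
🧮 As a quick sanity check on the quoted speedup, the two per-step timings can be compared directly. This is only a back-of-the-envelope sketch using the figures from the post; the variable names are illustrative and it does not reflect how the team benchmarked the simulations.

```python
import math

# Per-step wall-clock costs quoted in the post (not measured here)
dft_mm_seconds = 3 * 60   # 3 min/step for DFT/MM
emle_mm_seconds = 15e-3   # 15 ms/step for EMLE/MM

# Ratio of the two costs and its base-10 logarithm
speedup = dft_mm_seconds / emle_mm_seconds
orders_of_magnitude = math.log10(speedup)

print(f"speedup: about {speedup:,.0f}x")
print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

The ratio comes out at roughly 12,000, which is consistent with the "4 orders of magnitude" stated above.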

👉 You can check the publication and the GitHub page of the model for more info:
https://lnkd.in/dePhVc8R
https://lnkd.in/d2h3iMkZ