Computational Laboratory for Energy And Nanoscience


Manuscript Summary

Generative Enriched Sequential Learning (ESL) Approach for Molecular Design via Augmented Domain Knowledge

M. S. Ghaemi, K. Grantham, I. Tamblyn, Y. Li, H.K. Ooi

Canadian AI (2022)

Generative machine learning models for molecular design often struggle to produce molecules with desirable chemical properties, because they learn the statistical distribution of molecular structures without incorporating domain-specific knowledge. In this work, we introduce a Generative Enriched Sequential Learning (ESL) approach that augments traditional sequential learning models (Hidden Markov Models, Recurrent Neural Networks, and Long Short-Term Memory networks) with domain knowledge such as the quantitative estimate of drug-likeness (QED) score. Incorporating these supervised chemical property metrics directly into the training process lets the model learn the patterns of particular interest more effectively, which leads to de novo molecules with improved drug-likeness. The approach counteracts the bias of standard generative models toward molecules that are prevalent in the training data, and instead steers generation toward molecules of greater practical relevance for drug discovery.
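As a rough illustration of the idea, the sketch below fits a first-order Markov chain over the characters of a few toy SMILES strings and weights each string's transition counts by a property score. This weighting scheme is only one possible reading of "enriching" the training data, not the method of the paper. The function names, the character-level tokenization, and the score values are all hypothetical placeholders; the scores are not computed QEDs and are assumed to be positive.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical boundary tokens


def fit_weighted_markov(sequences, scores):
    """Estimate first-order transition probabilities.

    Each sequence contributes its transitions with a weight equal to its
    score, so high-scoring sequences shape the model more strongly.
    Scores are assumed to be strictly positive.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq, weight in zip(sequences, scores):
        tokens = [START, *seq, END]
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += weight
    probs = {}
    for current, following_counts in counts.items():
        total = sum(following_counts.values())
        probs[current] = {t: c / total for t, c in following_counts.items()}
    return probs


def sample(probs, rng, max_len=50):
    """Draw one sequence by walking the chain from START until END."""
    out, state = [], START
    for _ in range(max_len):
        choices = probs[state]
        state = rng.choices(list(choices), weights=list(choices.values()))[0]
        if state == END:
            break
        out.append(state)
    return "".join(out)


# Toy training strings with placeholder scores (NOT real QED values).
smiles = ["CCO", "CCN", "CCC"]
placeholder_scores = [0.9, 0.1, 0.1]

uniform = fit_weighted_markov(smiles, [1.0] * len(smiles))
enriched = fit_weighted_markov(smiles, placeholder_scores)

print("P(C -> O), unweighted:", round(uniform["C"]["O"], 3))
print("P(C -> O), score-weighted:", round(enriched["C"]["O"], 3))
print("sample:", sample(enriched, random.Random(0)))
```

In this toy setting the probability of the transition C -> O rises from 1/7 (about 0.143) in the unweighted model to 0.9/2.3 (about 0.391) in the score-weighted one, because the only string containing that transition carries the highest score. A realistic pipeline would compute scores with a cheminformatics toolkit and use a proper SMILES tokenizer, since single characters do not correspond to atoms such as Cl or Br.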



