An evolutionary algorithm for interpretable molecular representations

Pflüger, Philipp M.; Kühnemund, Marius; Katzenburg, Felix; Kuchen, Herbert; Glorius, Frank


Abstract

Encoding molecular structures into a computer-readable, utilizable format is the key step for any machine learning application in the chemical sciences. Current representations vary strongly in complexity and shape, depending on the application. Therefore, the number of domain-specific representations is rapidly growing, with some being altered and retuned constantly. These tailored representations raise the barriers to entry and method adoption, thus slowing progress in application. Herein, we present a general algorithm capable of yielding a highly specific representation based solely on a given dataset. The algorithm utilizes structural queries and evolutionary methodologies to generate interpretable molecular fingerprints. These are highly suited for molecular machine learning, enabling the accurate prediction of reactivity, property, and biological activity. We demonstrate the fingerprint's native interpretability, which allows for the extraction of knowledge, such as reactivity trends. We anticipate that the evolutionary multipattern fingerprint (EvoMPF) will be used to discover structure-target relationships in different molecular sciences.
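
The general idea of evolving a set of structural queries into a fingerprint can be illustrated with a minimal sketch. This is not the EvoMPF implementation: it assumes plain substring matches on SMILES strings as a stand-in for real structural queries, a simple bucket-majority accuracy as the fitness function, and a mutation-only evolutionary loop. All function names, parameters, and the toy dataset are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict

# Toy dataset (hypothetical): SMILES strings labelled 1 if they are carboxylic acids.
DATA = [
    ("CCO", 0), ("CC(=O)O", 1), ("CCCC", 0),
    ("OC(=O)c1ccccc1", 1), ("CCN", 0), ("CCC(=O)O", 1),
]

def fingerprint(smiles, patterns):
    """One bit per pattern: 1 if the pattern occurs as a substring of the SMILES.
    Assumption: substring search stands in for a proper substructure query."""
    return [1 if p in smiles else 0 for p in patterns]

def fitness(patterns, data):
    """Fraction of molecules whose label equals the majority label among
    all molecules sharing the same fingerprint."""
    buckets = defaultdict(Counter)
    for smiles, label in data:
        buckets[tuple(fingerprint(smiles, patterns))][label] += 1
    return sum(max(c.values()) for c in buckets.values()) / len(data)

def random_pattern(data, rng, max_len=3):
    """Draw a random substring (length 1..max_len) from a random molecule."""
    smiles = rng.choice(data)[0]
    length = rng.randint(1, min(max_len, len(smiles)))
    start = rng.randrange(len(smiles) - length + 1)
    return smiles[start:start + length]

def mutate(patterns, data, rng):
    """Return a copy of the pattern set with one pattern replaced at random."""
    child = list(patterns)
    child[rng.randrange(len(child))] = random_pattern(data, rng)
    return child

def evolve(data, n_patterns=4, pop_size=20, generations=30, seed=0):
    """Keep the fitter half of the population each generation and refill
    it with mutated copies of the survivors."""
    rng = random.Random(seed)
    population = [[random_pattern(data, rng) for _ in range(n_patterns)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, data), reverse=True)
        survivors = population[:pop_size // 2]
        children = [mutate(rng.choice(survivors), data, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=lambda p: fitness(p, data))

best = evolve(DATA)
print(best, fitness(best, DATA))
```

Because each bit of the resulting fingerprint corresponds to one readable pattern, the evolved pattern set can be inspected directly, which is the sense of interpretability this sketch aims to convey. A realistic version would replace the substring test with a cheminformatics substructure search and evaluate fitness on held-out data.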

Keywords
molecular machine learning; molecular representation; evolutionary algorithm; reaction prediction; QAPR; explainable AI



Publication type
Research article (journal)

Peer reviewed
Yes

Publication status
Published

Year
2024

Journal
Chem

Volume
10

Issue
5

Start page
1391

End page
1405

Language
English

ISSN
2694-2445

DOI