An evolutionary algorithm for interpretable molecular representations

Pflüger, Philipp M.; Kühnemund, Marius; Katzenburg, Felix; Kuchen, Herbert; Glorius, Frank


Abstract

Encoding molecular structures into a computer-readable, utilizable format is the key step for any machine learning application in all chemical sciences. Current representations vary strongly in complexity and shape, depending on the application. Therefore, the number of domain-specific representations is rapidly growing, with some being altered and retuned constantly. These tailored representations raise the barriers for entry and method adaption, thus decelerating progress in application. Herein, we present a general algorithm capable of yielding a highly specific representation solely based on a given dataset. The algorithm utilizes structural queries and evolutionary methodologies to generate interpretable molecular fingerprints. These are highly suited for molecular machine learning, enabling the accurate prediction of reactivity, property, and biological activity. We demonstrate its native interpretability, allowing for the extraction of knowledge, such as reactivity trends. We anticipate that the evolutionary multipattern fingerprint (EvoMPF) will be used to discover structure-target relationships in different molecular sciences.

Keywords
molecular machine learning; molecular representation; evolutionary algorithm; reaction prediction; QAPR; explainable AI



Publication type
Research article (journal)

Peer-reviewed
Yes

Publication status
Published

Year
2024

Journal
Chem

Volume
10

Issue
5

First page
1391

Last page
1405

Language
English

ISSN
2694-2445

DOI