EFGs: A Complete and Accurate Implementation of Ertl’s Functional Group Detection Algorithm in RDKit
Functional groups are widely used in organic chemistry, because they provide a rationale to analyze physicochemical and reactivity properties. In medicinal chemistry, they are the basis for analyzing ligand–biomacromolecule interactions. Ertl’s algorithm is an approach to extract functional groups i...
Saved in:
| Main Author: | |
|---|---|
| Published: |
2025
|
| Subjects: | |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Functional groups are widely used in organic chemistry, because they provide a rationale to analyze physicochemical and reactivity properties. In medicinal chemistry, they are the basis for analyzing ligand–biomacromolecule interactions. Ertl’s algorithm is an approach to extract functional groups in arbitrary organic molecules that does not depend on predefined libraries of functional groups. However, there is a lack of a complete and accurate implementation of Ertl’s algorithm in the widely used RDKit cheminformatic toolkit. In this paper, a new RDKit/Python implementation of the algorithm is described, that is both accurate and complete. For a RDKit molecule, it provides (i) a PNG binary string with an image of the molecule with color-highlighted functional groups; (ii) a list of sets of atom indices (idx), each set corresponding to a functional group; (iii) a list of pseudo-SMILES canonicalized strings for the full functional groups; and (iv) a list of RDKit labeled mol objects, one for each full functional group. The code is freely available in https://github.com/bbu-imdea/efgs and is part of the RDKit Contrib directory (https://github.com/rdkit/rdkit/tree/master/Contrib/efgs). |
|---|