PROFIS: Design of Target-Focused Libraries by Probing Continuous Fingerprint Space with Recurrent Neural Networks
This study introduces PROFIS, a new generative model for designing structurally novel, target-focused compound libraries. The model relies on a recurrent neural network trained to decode embedded molecular fingerprints into SMILES strings. To identify potential novel ligands,...
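| Main Author: | Hubert Rybka |
|---|---|
| Other Authors: | Tomasz Danel, Sabina Podlewska |
| Published: | 2025 |
| Subjects: | Biochemistry; Molecular Biology; Pharmacology; Biotechnology; Science Policy; Biological Sciences not elsewhere classified; Chemical Sciences not elsewhere classified; Information Systems not elsewhere classified |
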
| _version_ | 1852020999687503872 |
|---|---|
| author | Hubert Rybka (21190469) |
| author2 | Tomasz Danel (15875368) Sabina Podlewska (3750124) |
| author2_role | author author |
| author_facet | Hubert Rybka (21190469) Tomasz Danel (15875368) Sabina Podlewska (3750124) |
| author_role | author |
| dc.creator.none.fl_str_mv | Hubert Rybka (21190469) Tomasz Danel (15875368) Sabina Podlewska (3750124) |
| dc.date.none.fl_str_mv | 2025-04-28T11:37:37Z |
| dc.identifier.none.fl_str_mv | 10.1021/acs.jcim.5c00698.s005 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/dataset/PROFIS_Design_of_Target-Focused_Libraries_by_Probing_Continuous_Fingerprint_Space_with_Recurrent_Neural_Networks/28882124 |
| dc.rights.none.fl_str_mv | CC BY-NC 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Biochemistry; Molecular Biology; Pharmacology; Biotechnology; Science Policy; Biological Sciences not elsewhere classified; Chemical Sciences not elsewhere classified; Information Systems not elsewhere classified; recurrent neural networks; generate candidate ligands; designing ligands outside; bayesian optimization algorithm; generate diverse libraries; recurrent neural network; focused compound libraries; study introduces profis; biological activity predictor; given drug target; focused libraries; profis network; drug molecule; biological target; activity subspaces; worth noting; widespread use; structurally novel; selfies strings; scripts shared; paper demonstrates; output notation; model relies; latent representations; decode deepsmiles; also emphasizes |
| dc.title.none.fl_str_mv | PROFIS: Design of Target-Focused Libraries by Probing Continuous Fingerprint Space with Recurrent Neural Networks |
| dc.type.none.fl_str_mv | Dataset info:eu-repo/semantics/publishedVersion dataset |
| description | This study introduces PROFIS, a new generative model for designing structurally novel, target-focused compound libraries. The model relies on a recurrent neural network trained to decode embedded molecular fingerprints into SMILES strings. To identify potential novel ligands, a biological activity predictor is first trained on the low-dimensional fingerprint embedding space, enabling the identification of high-activity subspaces for a given drug target. A Bayesian optimization algorithm then searches for latent representations that are expected to yield active structures when decoded to SMILES. We present the rationale for using SMILES as the output notation of the recurrent neural network and compare its performance with models trained to decode DeepSMILES and SELFIES strings. The paper demonstrates the application of this protocol to generate candidate ligands of the dopamine D<sub>2</sub> receptor. It also emphasizes the effectiveness of our approach in scaffold hopping, which is valuable for designing ligands outside the already explored chemical space. We show how passing engineered molecular fingerprints through the PROFIS network can generate diverse libraries of analogs for a drug molecule of choice. The protocol is versatile and can be employed for any biological target, given a dataset of known ligands. Scripts shared by the authors on GitHub support the widespread use of PROFIS. |
| eu_rights_str_mv | openAccess |
| id | Manara_c7969a072c835caba2d2df509ea1033a |
| identifier_str_mv | 10.1021/acs.jcim.5c00698.s005 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/28882124 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY-NC 4.0 |
| status_str | publishedVersion |
| title | PROFIS: Design of Target-Focused Libraries by Probing Continuous Fingerprint Space with Recurrent Neural Networks |
| topic | Biochemistry; Molecular Biology; Pharmacology; Biotechnology; Science Policy; Biological Sciences not elsewhere classified; Chemical Sciences not elsewhere classified; Information Systems not elsewhere classified; recurrent neural networks; generate candidate ligands; designing ligands outside; bayesian optimization algorithm; generate diverse libraries; recurrent neural network; focused compound libraries; study introduces profis; biological activity predictor; given drug target; focused libraries; profis network; drug molecule; biological target; activity subspaces; worth noting; widespread use; structurally novel; selfies strings; scripts shared; paper demonstrates; output notation; model relies; latent representations; decode deepsmiles; also emphasizes |
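
The description mentions that latent representations likely to decode into active structures are found with a Bayesian optimization algorithm guided by an activity predictor. The following is only a minimal, standard-library sketch of that general idea, not the authors' implementation: `toy_activity` is a hypothetical stand-in for a trained activity predictor, the two-dimensional latent space and all parameter values are assumptions, and the step of decoding a latent vector to SMILES is omitted entirely.

```python
import math
import random


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two latent vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(A[i][i] - s, 1e-12))  # guard against round-off
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def backward_sub(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(X, y, jitter=1e-4):
    """Fit a Gaussian-process surrogate with a constant mean to observed scores."""
    y_mean = sum(y) / len(y)
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = backward_sub(L, forward_sub(L, [v - y_mean for v in y]))
    return L, alpha, y_mean


def predict_gp(X, model, z):
    """Posterior mean and variance of the surrogate at latent point z."""
    L, alpha, y_mean = model
    k_star = [rbf(a, z) for a in X]
    mean = y_mean + sum(k * a for k, a in zip(k_star, alpha))
    v = forward_sub(L, k_star)
    var = max(1.0 - sum(t * t for t in v), 0.0)  # rbf(z, z) == 1
    return mean, var


def toy_activity(z):
    """Hypothetical activity predictor: peaks (score 0) at z = (0.5, ..., 0.5)."""
    return -sum((zi - 0.5) ** 2 for zi in z)


def optimize_latent(score, dim=2, n_init=5, n_iter=15, n_cand=200, kappa=2.0, seed=0):
    """Bayesian optimization with an upper-confidence-bound acquisition."""
    rng = random.Random(seed)

    def sample():
        return [rng.uniform(-2.0, 2.0) for _ in range(dim)]

    X = [sample() for _ in range(n_init)]
    y = [score(z) for z in X]
    for _ in range(n_iter):
        model = fit_gp(X, y)

        def ucb(z):
            mean, var = predict_gp(X, model, z)
            return mean + kappa * math.sqrt(var)

        # Pick the random candidate with the highest acquisition value.
        z_next = max((sample() for _ in range(n_cand)), key=ucb)
        X.append(z_next)
        y.append(score(z_next))
    best = max(range(len(y)), key=y.__getitem__)
    return X[best], y[best], y


best_z, best_score, history = optimize_latent(toy_activity)
print(best_z, best_score)
```

In an actual workflow, the best latent vectors would be passed to the trained decoder to obtain candidate SMILES strings; the surrogate, acquisition function, and search bounds used here are illustrative choices only.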