Benchmarking Cross-Docking Strategies in Kinase Drug Discovery
In recent years, machine learning has transformed many aspects of the drug discovery process, including small molecule design, for which the prediction of bioactivity is an integral part. Leveraging structural information about the interactions between a small molecule and its protein target has gre...
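| Main Author: | David A. Schaller |
|---|---|
| Other Authors: | Clara D. Christ, John D. Chodera, Andrea Volkamer |
| Published: | 2024 |
| Subjects: | Biophysics; Biochemistry; Genetics; Pharmacology; Biotechnology; Immunology; Biological Sciences not elsewhere classified; Chemical Sciences not elsewhere classified; Information Systems not elsewhere classified |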
| _version_ | 1852025118256005120 |
|---|---|
| author | David A. Schaller (20288030) |
| author2 | Clara D. Christ (2632918) John D. Chodera (1323594) Andrea Volkamer (1444000) |
| author2_role | author author author |
| author_facet | David A. Schaller (20288030) Clara D. Christ (2632918) John D. Chodera (1323594) Andrea Volkamer (1444000) |
| author_role | author |
| dc.creator.none.fl_str_mv | David A. Schaller (20288030) Clara D. Christ (2632918) John D. Chodera (1323594) Andrea Volkamer (1444000) |
| dc.date.none.fl_str_mv | 2024-11-19T07:09:47Z |
| dc.identifier.none.fl_str_mv | 10.1021/acs.jcim.4c00905.s002 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/dataset/Benchmarking_Cross-Docking_Strategies_in_Kinase_Drug_Discovery/27852113 |
| dc.rights.none.fl_str_mv | CC BY-NC 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Biophysics Biochemistry Genetics Pharmacology Biotechnology Immunology Biological Sciences not elsewhere classified Chemical Sciences not elsewhere classified Information Systems not elsewhere classified transformed many aspects reproduce binding poses recovering binding poses maximum common substructure leveraging structural information inhibitor complex geometries finding practical approaches drug discovery process utilizing shape overlap kinase drug discovery based docking alone pose selection strategies docking methods biased generating useful kinase docking utilizing docking strategies docking pose three methods studied docking docking scenario success rate standard physics square deviation small molecule recent years realistic cross protein target protein kinases protein families openeye toolkits machine learning low root kinoml framework integral part included systems great potential general findings fundamentally limited efficient way different classes competitive ligands cocrystallized ligand benchmarking cross automated fashion although focused allowing automated 423 atp |
| dc.title.none.fl_str_mv | Benchmarking Cross-Docking Strategies in Kinase Drug Discovery |
| dc.type.none.fl_str_mv | Dataset info:eu-repo/semantics/publishedVersion dataset |
| description | In recent years, machine learning has transformed many aspects of the drug discovery process, including small molecule design, for which the prediction of bioactivity is an integral part. Leveraging structural information about the interactions between a small molecule and its protein target has great potential for downstream machine learning scoring approaches but is fundamentally limited by the accuracy with which protein–ligand complex structures can be predicted in a reliable and automated fashion. With the goal of finding practical approaches to generating useful kinase-inhibitor complex geometries for downstream machine learning scoring approaches, we present a kinase-centric docking benchmark assessing the performance of different classes of docking and pose selection strategies to assess how well experimentally observed binding modes are recapitulated in a realistic cross-docking scenario. The assembled benchmark data set focuses on the well-studied protein kinase family and comprises a subset of 589 protein structures cocrystallized with 423 ATP-competitive ligands. We find that the docking methods biased by the cocrystallized ligand, utilizing shape overlap with or without maximum common substructure matching, are more successful in recovering binding poses than standard physics-based docking alone. Also, docking into multiple structures significantly increases the chance of generating a low root-mean-square deviation (RMSD) docking pose. Docking utilizing an approach that combines all three methods (Posit) into structures with the most similar cocrystallized ligands according to the maximum common substructure (MCS) proved to be the most efficient way to reproduce binding poses, achieving a success rate of 70.4% across all included systems. The studied docking and pose selection strategies, which utilize the OpenEye Toolkits, were implemented into pipelines of the KinoML framework, allowing automated and reliable protein–ligand complex generation for future downstream machine learning tasks. Although focused on protein kinases, we believe that the general findings can also be transferred to other protein families. |
| eu_rights_str_mv | openAccess |
| id | Manara_594faad9fbd7f1a3c4e2bb2fb44b7df5 |
| identifier_str_mv | 10.1021/acs.jcim.4c00905.s002 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/27852113 |
| publishDate | 2024 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY-NC 4.0 |
| status_str | publishedVersion |
| title | Benchmarking Cross-Docking Strategies in Kinase Drug Discovery |
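
The description above reports results in terms of RMSD and a success rate over docked systems. As a rough illustration of how such numbers are commonly computed, the sketch below calculates a plain in-place RMSD between matched atom coordinates and the fraction of systems with at least one pose under a cutoff. It is not taken from the data set or from KinoML: the function names, the 2.0 Å cutoff, and the assumption that poses already share the reference frame with atoms in matching order are all illustrative, and no symmetry correction is applied.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally ordered lists of (x, y, z) tuples.

    Assumes both poses are already in the same frame and that atom i in
    coords_a corresponds to atom i in coords_b (no symmetry handling).
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


def success_rate(rmsds_per_system, threshold=2.0):
    """Fraction of systems whose best (lowest-RMSD) pose is within threshold.

    rmsds_per_system maps a hypothetical system identifier to the RMSD values
    (in angstroms) of the poses generated for it, for example one pose per
    receptor structure docked into. The 2.0 cutoff is an assumed default.
    """
    if not rmsds_per_system:
        raise ValueError("no systems given")
    hits = sum(
        1
        for poses in rmsds_per_system.values()
        if poses and min(poses) <= threshold
    )
    return hits / len(rmsds_per_system)


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    docked = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(reference, docked))

    example = {"sys_a": [1.5, 3.0], "sys_b": [2.5], "sys_c": [0.5, 4.0], "sys_d": [3.1, 2.9]}
    print(success_rate(example))
```

Taking the minimum over several poses per system mirrors the observation in the description that docking into multiple structures raises the chance of obtaining a low-RMSD pose; a real evaluation would still need a pose selection rule that does not rely on knowing the reference.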