BEGAN: Boltzmann-Reweighted Data Augmentation for Enhanced GAN-Based Molecule Design in Insect Pheromone Receptors

Identifying small molecules that bind strongly to target proteins is crucial in rational molecular design. Machine learning techniques, such as generative adversarial networks (GANs), are now essential tools for generating such molecules. In this study, we present an enhanced method for molecule gene...

Bibliographic Details
Main Author: Jialei Dai (13020462) (author)
Other Authors: Yutong Zhang (4844703) (author), Chen Shi (415895) (author), Yang Liu (4829) (author), Peng Xiu (543747) (author), Yong Wang (12837) (author)
Published: 2024
Subjects:
_version_ 1852025029904039936
author Jialei Dai (13020462)
author2 Yutong Zhang (4844703)
Chen Shi (415895)
Yang Liu (4829)
Peng Xiu (543747)
Yong Wang (12837)
author2_role author
author
author
author
author
author_facet Jialei Dai (13020462)
Yutong Zhang (4844703)
Chen Shi (415895)
Yang Liu (4829)
Peng Xiu (543747)
Yong Wang (12837)
author_role author
dc.creator.none.fl_str_mv Jialei Dai (13020462)
Yutong Zhang (4844703)
Chen Shi (415895)
Yang Liu (4829)
Peng Xiu (543747)
Yong Wang (12837)
dc.date.none.fl_str_mv 2024-11-21T16:03:45Z
dc.identifier.none.fl_str_mv 10.1021/acs.jpcb.4c06729.s004
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/BEGAN_Boltzmann-Reweighted_Data_Augmentation_for_Enhanced_GAN-Based_Molecule_Design_in_Insect_Pheromone_Receptors/27728222
dc.rights.none.fl_str_mv CC BY-NC 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Biophysics
Biochemistry
Genetics
Molecular Biology
Neuroscience
Pharmacology
Immunology
Computational Biology
Biological Sciences not elsewhere classified
Chemical Sciences not elsewhere classified
Information Systems not elsewhere classified
superior binding properties
reweighted data augmentation
overall distribution shape
optimizing molecular generation
molecular discovery pipelines
machine learning techniques
improved binding affinities
higher binding affinities
generative adversarial networks
explore molecular spaces
related scaling hyperparameter
enhanced protein binding
rational molecular design
based molecule design
training based
hyperparameter τ
enhanced method
enhanced gan
target proteins
reweighting process
reinforced gans
reasonable range
potentially increasing
parameter dependencies
method offers
essential tools
docking algorithms
comprehensive investigation
bind strongly
dc.title.none.fl_str_mv BEGAN: Boltzmann-Reweighted Data Augmentation for Enhanced GAN-Based Molecule Design in Insect Pheromone Receptors
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description Identifying small molecules that bind strongly to target proteins is crucial in rational molecular design. Machine learning techniques, such as generative adversarial networks (GANs), are now essential tools for generating such molecules. In this study, we present an enhanced method for molecule generation using objective-reinforced GANs. Specifically, we introduce BEGAN (Boltzmann-enhanced GAN), a novel approach that adjusts molecule occurrence frequencies during training according to the Boltzmann distribution exp(−ΔU/τ), where ΔU is the estimated binding free energy derived from docking algorithms and τ is a temperature-related scaling hyperparameter. This Boltzmann reweighting shifts generation toward molecules with higher binding affinities, allowing the GAN to explore molecular spaces with superior binding properties. The reweighting can also be refined through multiple iterations without altering the overall distribution shape. To validate our approach, we apply it to the design of sex pheromone analogs targeting the Spodoptera frugiperda pheromone receptor SfruOR16. The results illustrate that Boltzmann reweighting significantly increases the likelihood of generating promising sex pheromone analogs with improved binding affinities to SfruOR16, a finding further supported by atomistic molecular dynamics simulations. Furthermore, we conduct a comprehensive investigation into parameter dependencies and propose a reasonable range for the hyperparameter τ. Our method offers a promising approach for optimizing molecular generation for enhanced protein binding, potentially increasing the efficiency of molecular discovery pipelines.
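The reweighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "adjusting occurrence frequencies" can be approximated by resampling the training set with replacement, with probabilities proportional to exp(−ΔU/τ), and that ΔU and τ share the same units. The function names, the SMILES strings, and the ΔU values below are hypothetical placeholders chosen only for illustration.

```python
import math
import random
from collections import Counter


def boltzmann_weights(delta_u, tau):
    """Return normalized weights proportional to exp(-dU / tau)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Shift by the minimum for numerical stability; the shift cancels
    # when the weights are normalized.
    u_min = min(delta_u)
    raw = [math.exp(-(u - u_min) / tau) for u in delta_u]
    total = sum(raw)
    return [w / total for w in raw]


def reweight_dataset(molecules, delta_u, tau, n_samples, seed=0):
    """Resample molecules with replacement using Boltzmann weights.

    Assumption: resampling stands in for the frequency adjustment
    mentioned in the abstract; the actual procedure may differ.
    """
    rng = random.Random(seed)
    weights = boltzmann_weights(delta_u, tau)
    return rng.choices(molecules, weights=weights, k=n_samples)


# Hypothetical toy data: placeholder SMILES and made-up docking scores
# (more negative = stronger estimated binding).
smiles = ["CCCCCCCCCCCC=O", "CCCCCCCCCCO", "CCCCCCCCCCCCOC(C)=O"]
delta_u = [-6.0, -5.0, -7.0]

weights = boltzmann_weights(delta_u, tau=1.0)
augmented = reweight_dataset(smiles, delta_u, tau=1.0, n_samples=1000)
counts = Counter(augmented)  # occurrence frequencies after reweighting
```

In this sketch a small τ concentrates the resampled set on the best-scoring molecules, while a large τ leaves the original frequencies almost unchanged, which is one way to see why the choice of τ matters.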
eu_rights_str_mv openAccess
id Manara_8740c6f086f68896f88c3152b00aaf76
identifier_str_mv 10.1021/acs.jpcb.4c06729.s004
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/27728222
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY-NC 4.0
status_str publishedVersion
title BEGAN: Boltzmann-Reweighted Data Augmentation for Enhanced GAN-Based Molecule Design in Insect Pheromone Receptors