Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models


Bibliographic Details
Main Author: Saad Hameed (6488738) (author)
Other Authors: Basheer Qolomany (16855527) (author), Samir Brahim Belhaouari (9427347) (author), Mohamed Abdallah (3073191) (author), Junaid Qadir (16494902) (author), Ala Al-Fuqaha (4434340) (author)
Published: 2025
_version_ 1864513534158700544
author Saad Hameed (6488738)
author2 Basheer Qolomany (16855527)
Samir Brahim Belhaouari (9427347)
Mohamed Abdallah (3073191)
Junaid Qadir (16494902)
Ala Al-Fuqaha (4434340)
author2_role author
author
author
author
author
author_role author
dc.creator.none.fl_str_mv Saad Hameed (6488738)
Basheer Qolomany (16855527)
Samir Brahim Belhaouari (9427347)
Mohamed Abdallah (3073191)
Junaid Qadir (16494902)
Ala Al-Fuqaha (4434340)
dc.date.none.fl_str_mv 2025-05-12T12:00:00Z
dc.identifier.none.fl_str_mv 10.1109/ojcs.2025.3564493
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Large_Language_Model_Enhanced_Particle_Swarm_Optimization_for_Hyperparameter_Tuning_for_Deep_Learning_Models/30406186
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Artificial intelligence
Machine learning
Mathematical sciences
Applied mathematics
Deep learning optimization
PSO
LLM
machine learning
hyper-parameter optimization
Computational modeling
Tuning
Neurons
Deep learning
Convergence
Large language models
Computational efficiency
Accuracy
Predictive models
Particle swarm optimization
dc.title.none.fl_str_mv Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Determining the ideal architecture for deep learning models, such as the number of layers and neurons, is a difficult and resource-intensive process that frequently relies on manual tuning or computationally costly optimization approaches. While Particle Swarm Optimization (PSO) and Large Language Models (LLMs) have been individually applied in optimization and deep learning, their combined use for enhancing convergence in numerical optimization tasks remains underexplored. Our work addresses this gap by integrating LLMs into PSO to reduce model evaluations and improve convergence for deep learning hyperparameter tuning. The proposed LLM-enhanced PSO method addresses the difficulties of efficiency and convergence by using LLMs (particularly ChatGPT-3.5 and Llama3) to improve PSO performance, allowing for faster achievement of target objectives. Our method speeds up search space exploration by substituting underperforming particle placements with the best suggestions offered by LLMs. Comprehensive experiments across three scenarios show that the method significantly improves convergence rates and lowers computational costs: (1) optimizing the Rastrigin function, (2) using Long Short-Term Memory (LSTM) networks for time series regression, and (3) using Convolutional Neural Networks (CNNs) for material classification. Depending on the application, computational complexity is lowered by 20% to 60% compared to traditional PSO methods. Llama3 achieved a 20% to 40% reduction in model calls for regression tasks, whereas ChatGPT-3.5 reduced model calls by 60% for both regression and classification tasks, all while preserving accuracy and error rates. This methodology offers an efficient and effective solution for optimizing deep learning models, leading to substantial computational performance improvements across a wide range of applications.</p><h2>Other Information</h2><p dir="ltr">Published in: IEEE Open Journal of the Computer Society<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/ojcs.2025.3564493" target="_blank">https://dx.doi.org/10.1109/ojcs.2025.3564493</a></p>
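The description outlines the core idea: run PSO and replace underperforming particle positions with externally suggested ones. The following is a minimal sketch of that idea on the Rastrigin function, not the authors' implementation. The names `pso`, `Suggester`, `noisy_best_suggester`, the parameter `replace_fraction`, and the coefficients `w`, `c1`, `c2` are all assumptions made for illustration. In particular, `noisy_best_suggester` is a stand-in that perturbs the best known position with Gaussian noise; it does not call an LLM, and the record does not say how the LLM is prompted.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
# Hypothetical hook: takes (position, fitness) history and a count k,
# returns up to k candidate positions. An LLM call could sit behind it.
Suggester = Callable[[List[Tuple[Vector, float]], int], List[Vector]]


def rastrigin(x: Sequence[float]) -> float:
    """Rastrigin test function; its global minimum is 0 at the origin."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)


def noisy_best_suggester(sigma: float = 0.1, seed: int = 1) -> Suggester:
    """Stand-in for an LLM: proposes points near the best position seen so far."""
    rng = random.Random(seed)

    def suggest(history: List[Tuple[Vector, float]], k: int) -> List[Vector]:
        best_pos, _ = min(history, key=lambda item: item[1])
        return [[v + rng.gauss(0.0, sigma) for v in best_pos] for _ in range(k)]

    return suggest


def pso(
    objective: Callable[[Sequence[float]], float],
    dim: int,
    n_particles: int = 20,
    n_iters: int = 50,
    bounds: Tuple[float, float] = (-5.12, 5.12),
    suggest: Optional[Suggester] = None,
    replace_fraction: float = 0.2,  # assumed share of particles to replace
    seed: int = 0,
) -> Tuple[Vector, float, int]:
    """Return (best position, best fitness, number of objective evaluations)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    fit = [objective(p) for p in pos]
    evals = n_particles
    pbest = [p[:] for p in pos]
    pbest_fit = fit[:]
    g = min(range(n_particles), key=lambda i: fit[i])
    gbest, gbest_fit = pos[g][:], fit[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients

    for _ in range(n_iters):
        # Standard velocity and position update, clamped to the bounds.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

        # Replace the worst particles (by last known fitness) with suggestions.
        if suggest is not None:
            k = max(1, int(replace_fraction * n_particles))
            worst = sorted(range(n_particles), key=lambda i: fit[i], reverse=True)[:k]
            history = list(zip(pbest, pbest_fit))
            for i, new_pos in zip(worst, suggest(history, k)):
                pos[i] = [min(hi, max(lo, v)) for v in new_pos]
                vel[i] = [0.0] * dim

        # Evaluate every particle once and update personal and global bests.
        for i in range(n_particles):
            fit[i] = objective(pos[i])
            evals += 1
            if fit[i] < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit[i]
            if fit[i] < gbest_fit:
                gbest, gbest_fit = pos[i][:], fit[i]

    return gbest, gbest_fit, evals


if __name__ == "__main__":
    plain = pso(rastrigin, dim=2)
    assisted = pso(rastrigin, dim=2, suggest=noisy_best_suggester())
    print("plain PSO:", plain[1], "evaluations:", plain[2])
    print("with suggestions:", assisted[1], "evaluations:", assisted[2])
```

Under these assumptions, one way to measure a reduction in model calls is to stop each run once a target fitness is reached and compare the returned evaluation counts. The sketch runs a fixed number of iterations instead, so both runs report the same count.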
eu_rights_str_mv openAccess
id Manara2_cb059abe39831ada5e80b01567d0ec30
identifier_str_mv 10.1109/ojcs.2025.3564493
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/30406186
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models