Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models

Determining the ideal architecture for deep learning models, such as the number of layers and neurons, is a difficult and resource-intensive process that frequently relies on human tuning or computationally costly optimization approaches. While Particle Swarm Optimizatio...


Bibliographic Details
Main Author:	Saad Hameed (6488738) (author)
Other Authors:	Basheer Qolomany (16855527) (author), Samir Brahim Belhaouari (9427347) (author), Mohamed Abdallah (3073191) (author), Junaid Qadir (16494902) (author), Ala Al-Fuqaha (4434340) (author)
Published:	2025
Subjects:	Information and computing sciences; Artificial intelligence; Machine learning; Mathematical sciences; Applied mathematics; Deep learning optimization; PSO; LLM; machine learning; hyper-parameter optimization; Computational modeling; Tuning; Neurons; Deep learning; Convergence; Large language models; Computational efficiency; Accuracy; Predictive models; Particle swarm optimization

_version_	1864513534158700544
author	Saad Hameed (6488738)
author2	Basheer Qolomany (16855527) Samir Brahim Belhaouari (9427347) Mohamed Abdallah (3073191) Junaid Qadir (16494902) Ala Al-Fuqaha (4434340)
author2_role	author author author author author
author_facet	Saad Hameed (6488738) Basheer Qolomany (16855527) Samir Brahim Belhaouari (9427347) Mohamed Abdallah (3073191) Junaid Qadir (16494902) Ala Al-Fuqaha (4434340)
author_role	author
dc.creator.none.fl_str_mv	Saad Hameed (6488738) Basheer Qolomany (16855527) Samir Brahim Belhaouari (9427347) Mohamed Abdallah (3073191) Junaid Qadir (16494902) Ala Al-Fuqaha (4434340)
dc.date.none.fl_str_mv	2025-05-12T12:00:00Z
dc.identifier.none.fl_str_mv	10.1109/ojcs.2025.3564493
dc.relation.none.fl_str_mv	https://figshare.com/articles/journal_contribution/Large_Language_Model_Enhanced_Particle_Swarm_Optimization_for_Hyperparameter_Tuning_for_Deep_Learning_Models/30406186
dc.rights.none.fl_str_mv	CC BY 4.0 info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv	Information and computing sciences Artificial intelligence Machine learning Mathematical sciences Applied mathematics Deep learning optimization PSO LLM machine learning hyper-parameter optimization Computational modeling Tuning Neurons Deep learning Convergence Large language models Computational efficiency Accuracy Predictive models Particle swarm optimization
dc.title.none.fl_str_mv	Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
dc.type.none.fl_str_mv	Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal
description	Determining the ideal architecture for deep learning models, such as the number of layers and neurons, is a difficult and resource-intensive process that frequently relies on manual tuning or computationally costly optimization approaches. While Particle Swarm Optimization (PSO) and Large Language Models (LLMs) have each been applied to optimization and deep learning, their combined use for improving convergence in numerical optimization tasks remains underexplored. Our work addresses this gap by integrating LLMs into PSO to reduce model evaluations and improve convergence for deep learning hyperparameter tuning. The proposed LLM-enhanced PSO method uses LLMs (specifically ChatGPT-3.5 and Llama3) to improve PSO efficiency and convergence, so that target objectives are reached faster. The method speeds up search space exploration by replacing underperforming particle positions with the best suggestions offered by the LLMs. Comprehensive experiments across three scenarios, namely (1) optimizing the Rastrigin function, (2) using Long Short-Term Memory (LSTM) networks for time series regression, and (3) using Convolutional Neural Networks (CNNs) for material classification, show that the method significantly improves convergence rates and lowers computational costs. Depending on the application, computational complexity is lowered by 20% to 60% compared to traditional PSO methods. Llama3 achieved a 20% to 40% reduction in model calls for regression tasks, whereas ChatGPT-3.5 reduced model calls by 60% for both regression and classification tasks, all while preserving accuracy and error rates. The methodology offers an efficient solution for optimizing deep learning models, yielding substantial computational savings in the scenarios studied. Other Information: Published in: IEEE Open Journal of the Computer Society. License: https://creativecommons.org/licenses/by/4.0/. See article on publisher's website: https://dx.doi.org/10.1109/ojcs.2025.3564493
eu_rights_str_mv	openAccess
id	Manara2_cb059abe39831ada5e80b01567d0ec30
identifier_str_mv	10.1109/ojcs.2025.3564493
network_acronym_str	Manara2
network_name_str	Manara2
oai_identifier_str	oai:figshare.com:article/30406186
publishDate	2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv	CC BY 4.0
status_str	publishedVersion
title	Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models
topic	Information and computing sciences Artificial intelligence Machine learning Mathematical sciences Applied mathematics Deep learning optimization PSO LLM machine learning hyper-parameter optimization Computational modeling Tuning Neurons Deep learning Convergence Large language models Computational efficiency Accuracy Predictive models Particle swarm optimization
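
The abstract in the description field describes replacing underperforming particle positions with suggestions from an LLM, with the Rastrigin function as the first test scenario. The following is a minimal Python sketch of that idea, not the authors' implementation. The record does not say how often particles are replaced, how many are replaced, or how the LLM is prompted, so `replace_every`, `n_replace`, the inertia and acceleration constants, and the `make_mock_suggester` stand-in (which samples near the current best particle instead of calling an LLM) are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
# A suggester sees the swarm's positions and fitness values and proposes new positions.
Suggester = Callable[[Sequence[Vector], Sequence[float]], List[Vector]]


def rastrigin(x: Sequence[float]) -> float:
    """Rastrigin benchmark function; its global minimum is 0 at the origin."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)


def make_mock_suggester(n: int, sigma: float = 0.3, seed: int = 1) -> Suggester:
    """Hypothetical placeholder for an LLM call: proposes n points near the best particle."""
    rng = random.Random(seed)

    def suggest(positions: Sequence[Vector], fitness: Sequence[float]) -> List[Vector]:
        best = positions[min(range(len(fitness)), key=fitness.__getitem__)]
        return [[b + rng.gauss(0.0, sigma) for b in best] for _ in range(n)]

    return suggest


def pso(objective: Callable[[Sequence[float]], float], dim: int,
        n_particles: int = 20, iters: int = 100,
        bounds: Tuple[float, float] = (-5.12, 5.12),
        suggest: Optional[Suggester] = None,
        replace_every: int = 10, n_replace: int = 4,
        seed: int = 0) -> Tuple[Vector, float, int]:
    """Basic PSO; if `suggest` is given, the worst particles are periodically replaced.

    Returns (best position, best fitness, number of objective evaluations).
    """
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration constants
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    fit = [objective(p) for p in pos]
    evals = n_particles
    pbest, pbest_fit = [p[:] for p in pos], fit[:]
    g = min(range(n_particles), key=fit.__getitem__)
    gbest, gbest_fit = pos[g][:], fit[g]

    def evaluate(i: int) -> None:
        """Score particle i and update its personal best and the global best."""
        nonlocal evals, gbest, gbest_fit
        fit[i] = objective(pos[i])
        evals += 1
        if fit[i] < pbest_fit[i]:
            pbest[i], pbest_fit[i] = pos[i][:], fit[i]
        if fit[i] < gbest_fit:
            gbest, gbest_fit = pos[i][:], fit[i]

    for it in range(1, iters + 1):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp each coordinate to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            evaluate(i)
        # Assumed schedule: every `replace_every` iterations, overwrite the
        # `n_replace` worst particles with the suggester's proposals.
        if suggest is not None and it % replace_every == 0:
            worst = sorted(range(n_particles), key=fit.__getitem__, reverse=True)[:n_replace]
            for i, proposal in zip(worst, suggest(pos, fit)):
                pos[i] = [min(hi, max(lo, v)) for v in proposal]
                vel[i] = [0.0] * dim
                evaluate(i)
    return gbest, gbest_fit, evals


if __name__ == "__main__":
    plain = pso(rastrigin, dim=2)
    assisted = pso(rastrigin, dim=2, suggest=make_mock_suggester(n=4))
    print("plain PSO:    best fitness", plain[1], "evaluations", plain[2])
    print("assisted PSO: best fitness", assisted[1], "evaluations", assisted[2])
```

In a real setup the `suggest` callable would wrap a prompt to an LLM and parse its reply into position vectors. The evaluation counter is returned because the abstract reports its savings in terms of model calls, so any comparison would need to count the evaluations spent on replaced particles as well.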
