Large language models for code completion: A systematic literature review

Bibliographic details
Main author:	Rasha Ahmad Husein (19744756) (author)
Other authors:	Hala Aburajouh (19744759) (author), Cagatay Catal (6897842) (author)
Published in:	2024
Subjects:	Information and computing sciences Artificial intelligence Data management and data science Software engineering Code completion Large language models Deep learning Transformers

_version_	1864513555997392896
author	Rasha Ahmad Husein (19744756)
author2	Hala Aburajouh (19744759) Cagatay Catal (6897842)
author2_role	author author
author_facet	Rasha Ahmad Husein (19744756) Hala Aburajouh (19744759) Cagatay Catal (6897842)
author_role	author
dc.creator.none.fl_str_mv	Rasha Ahmad Husein (19744756) Hala Aburajouh (19744759) Cagatay Catal (6897842)
dc.date.none.fl_str_mv	2024-08-26T21:00:00Z
dc.identifier.none.fl_str_mv	10.1016/j.csi.2024.103917
dc.relation.none.fl_str_mv	https://figshare.com/articles/journal_contribution/Large_language_models_for_code_completion_A_systematic_literature_review/27109912
dc.rights.none.fl_str_mv	CC BY 4.0 info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv	Information and computing sciences Artificial intelligence Data management and data science Software engineering Code completion Large language models Deep learning Transformers
dc.title.none.fl_str_mv	Large language models for code completion: A systematic literature review
dc.type.none.fl_str_mv	Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal
description	<p>Code completion serves as a fundamental aspect of modern software development, improving developers' coding processes. Integrating code completion tools into an Integrated Development Environment (IDE) or code editor enhances the coding process and boosts productivity by reducing errors and speeding up code writing while reducing cognitive load. This is achieved by predicting subsequent tokens, such as keywords, variable names, types, function names, operators, and more. Different techniques can achieve code completion, and recent research has focused on Deep Learning methods, particularly Large Language Models (LLMs) utilizing Transformer algorithms. While several research papers have focused on the use of LLMs for code completion, these studies are fragmented, and there is no systematic overview of the use of LLMs for code completion. Therefore, we aimed to perform a Systematic Literature Review (SLR) study to investigate how LLMs have been applied for code completion so far. We have formulated several research questions to address how LLMs have been integrated for code completion-related tasks and to assess the efficacy of these LLMs in the context of code completion. To achieve this, we retrieved 244 papers from scientific databases using auto-search and specific keywords, finally selecting 23 primary studies based on an SLR methodology for in-depth analysis. This SLR study categorizes the granularity levels of code completion achieved by utilizing LLMs in IDEs, explores the existing issues in current code completion systems, how LLMs address these challenges, and the pre-training and fine-tuning methods employed. Additionally, this study identifies open research problems and outlines future research directions. 
Our analysis reveals that LLMs significantly enhance code completion performance across several programming languages and contexts, and their capability to predict relevant code snippets based on context and partial input boosts developer productivity substantially.</p><h2>Other Information</h2> <p> Published in: Computer Standards & Interfaces<br> License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1016/j.csi.2024.103917" target="_blank">https://dx.doi.org/10.1016/j.csi.2024.103917</a></p>
eu_rights_str_mv	openAccess
id	Manara2_ec56df288664d133551140d0bc8b0873
identifier_str_mv	10.1016/j.csi.2024.103917
network_acronym_str	Manara2
network_name_str	Manara2
oai_identifier_str	oai:figshare.com:article/27109912
publishDate	2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv	CC BY 4.0
status_str	publishedVersion
title	Large language models for code completion: A systematic literature review
topic	Information and computing sciences Artificial intelligence Data management and data science Software engineering Code completion Large language models Deep learning Transformers
