FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair

Bibliographic Details
Main Author:	Sakina Fatima (15362704) (author)
Published:	2025
Subjects:	Deep learning; Automated software engineering; Software testing, verification and validation; Flaky Tests; pre-trained language models; Black-box testing; Few Shot Learning; automated labeled corpus; GPT; code generation

Description
Summary:	This is the replication package associated with the paper 'FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair'.

### Requirements

The following Python packages are required:

- imbalanced_learn==0.8.1
- numpy==1.19.5
- pandas==1.3.3
- transformers==4.10.2
- torch==1.5.0
- scikit_learn==0.24.2
- openai==v0.28.1

# Automated tool for labelling dataset with flaky test fix categories

This is a step-by-step guideline for automatically labelling the dataset with flaky test fix categories.

### Input Files:

This input file is required for this step:

* Data/IdoFT_dataset_filtered.csv

https://figshare.com/s/47f0fb6207ac3f9e2351

### Output Files:

* Results/IdoFT_dataset_filtered.csv

### Replicating the experiment

To run this experiment, navigate to the `Code/` folder and run the following command:

```console
bash Automated_labelling_tool.sh
```

It generates the dataset required to run our prediction models, which predict the category of the fix given the code of a flaky test.

---

# Prediction models for fix categories using the test case code

This is the guideline for replicating the experiments we used to evaluate our prediction models, i.e. CodeBERT and UniXcoder (each fine-tuned independently with Few Shot Learning and with a Feed Forward Neural Network), for classifying tests into the different fix categories.

### Input Files:

The following input files are required to perform the binary classification for each fix category:

* Data/change_assertion.csv
* Data/change_condition.csv
* Data/change_data_format.csv
* Data/change_data_structure.csv
* Data/handle_exception.csv
* Data/reorder_data.csv
* Data/reset_variable.csv
* Data/call_static_method.csv
* Data/reorder_parameters.csv
* Data/misc.csv

### Replicating the experiment

To run the experiments with our prediction models, navigate to the `Code/` folder and run the following commands:

```console
bash codemodel_with_fnn.sh
bash codemodel_with_fsl.sh
```

# Generate the repaired flaky tests using GPT 3.5 Turbo:

### Input Files:

The following input files are required for this step:

### Experiments on the 181 tests using prompts with and without labels:

* Data/Dataset_for_GPT.csv

### Experiments on the tests using prompts with labels, without labels, and with in-context learning:

### For Change Assertion:

* Data/change_assertion_input_FSP.csv

### For Change Condition:

* Data/change_condition_input_FSP.csv

### For Change DataStructure:

* Data/change_dataStructure_input_FSP.csv

To run this experiment, navigate to the `Code/` folder and run the following command:

```console
bash gpt3.5_experiments.sh
```

# Execute a sample of GPT-repaired flaky tests:

### Input Files:

This input file is required for this step:

* Data/sampleTests_For_Execution.csv

To execute the 35 GPT-repaired flaky tests:

- First, clone the GitHub project (from the 'PR Link' column in the sampleTests_For_Execution.csv file).
- Check out the commit of the given PR link (if the PR is merged, check out the master branch).
- Navigate to the project folder and run the command:

```console
mvn clean test -Dtest=[Test Class Name]#[Test Method Name] -DfailIfNoTests=false
```
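
When running many of the sampled tests, the Maven invocation above can be assembled programmatically instead of typed by hand. The following is a minimal sketch, not part of the replication package: the function name `build_maven_test_command` and the example class and method names are hypothetical, and it only builds the argument list for the command shown above.

```python
import shlex
from typing import List


def build_maven_test_command(test_class: str, test_method: str) -> List[str]:
    """Return the argument list for running a single test method with Maven.

    Hypothetical helper: it mirrors the command given in the instructions,
    mvn clean test -Dtest=[Test Class Name]#[Test Method Name] -DfailIfNoTests=false
    """
    if not test_class or not test_method:
        raise ValueError("both a test class name and a test method name are required")
    return [
        "mvn",
        "clean",
        "test",
        # Surefire syntax for selecting one method: ClassName#methodName
        f"-Dtest={test_class}#{test_method}",
        "-DfailIfNoTests=false",
    ]


if __name__ == "__main__":
    # Placeholder names; replace them with the test class and method
    # of the repaired test you want to execute.
    command = build_maven_test_command("ExampleTest", "testSomething")
    # shlex.join quotes the '#' so the printed line is safe to paste into a shell.
    print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command, cwd=project_dir)` after the project has been cloned and the right commit checked out; `project_dir` is likewise an assumed name for the local clone path.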
