Issue-Commit Traceability Datasets

Traceability dataset that consists of issue–commit pairs. This dataset contains a collection of issue–commit pairs from software projects, annotated for the presence or absence of traceability links. These links indicate whether a specific commit is asso...

Bibliographic Details
Main author: Hanun Puspa (21525737) (author)
Other authors: Adhatus Solichah Ahmadiyah (14138976) (author), Rizky Januar Akbar (22003280) (author)
Published: 2025
_version_ 1852017427879034880
author Hanun Puspa (21525737)
author2 Adhatus Solichah Ahmadiyah (14138976)
Rizky Januar Akbar (22003280)
author2_role author
author
author_facet Hanun Puspa (21525737)
Adhatus Solichah Ahmadiyah (14138976)
Rizky Januar Akbar (22003280)
author_role author
dc.creator.none.fl_str_mv Hanun Puspa (21525737)
Adhatus Solichah Ahmadiyah (14138976)
Rizky Januar Akbar (22003280)
dc.date.none.fl_str_mv 2025-08-20T23:01:53Z
dc.identifier.none.fl_str_mv 10.6084/m9.figshare.29293868.v2
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/issue_commit_traceability_csv/29293868
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Software engineering not elsewhere classified
traceability
issue
commit
dc.title.none.fl_str_mv Issue-Commit Traceability Datasets
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description Traceability dataset that consists of issue–commit pairs. This dataset contains a collection of issue–commit pairs from software projects, annotated for the presence or absence of traceability links. These links indicate whether a specific commit is associated with resolving a particular issue, as commonly tracked in systems such as GitHub, Jira, or Bugzilla. The dataset combines samples from three sources:
- LinkFormer Dataset (LinkFormer: Automatic Contextualised Link Recovery of Software Artifacts in a Cross-Project Setting, https://zenodo.org/records/6524460)
- 20-MAD Dataset (OSF | 20-MAD: Mozilla Apache Dataset, https://osf.io/kvxr4/)
- Carikado Dataset (a manually curated small-scale dataset from the author's past projects)
Each data entry includes metadata and textual features extracted from:
- Issues: summary, description, type, status, creation date
- Commits: commit message, diff, file names, authoring date
Labels are binary:
- 1 indicates that a traceability link exists between the issue and the commit.
- 0 indicates no such link.
This dataset was used to fine-tune several pretrained language models (e.g., BERT, RoBERTa) for binary classification and was further analyzed using Explainable AI techniques (LIME and SHAP) to interpret feature importance.
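As an illustration of the entry structure described above (issue text, commit text, and a binary label), the following is a minimal Python sketch of how one issue–commit pair might be represented and turned into a two-segment text input for a sequence-pair classifier. The class name, field names, helper functions, and example rows are all hypothetical; the record does not state the actual CSV column names or the authors' preprocessing, so treat this as one possible arrangement rather than the dataset's schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class IssueCommitPair:
    """One labeled pair. Field names are assumptions, not the real CSV columns."""
    issue_summary: str
    issue_description: str
    commit_message: str
    commit_diff: str
    label: int  # 1 = traceability link exists, 0 = no link

    def __post_init__(self):
        # The description says labels are binary, so reject anything else.
        if self.label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {self.label!r}")


def to_text_pair(pair, max_chars=2000):
    """Return (issue_text, commit_text), each truncated to max_chars characters.

    Joining summary with description and message with diff is an
    illustrative choice, not the authors' documented method.
    """
    issue_text = f"{pair.issue_summary}\n{pair.issue_description}".strip()
    commit_text = f"{pair.commit_message}\n{pair.commit_diff}".strip()
    return issue_text[:max_chars], commit_text[:max_chars]


def label_counts(pairs):
    """Count how many pairs carry each label, to check class balance."""
    return Counter(p.label for p in pairs)


# Invented example rows, for demonstration only.
EXAMPLE_PAIRS = [
    IssueCommitPair(
        issue_summary="Crash when config file is missing",
        issue_description="The service exits with a traceback on startup.",
        commit_message="Handle missing config file gracefully",
        commit_diff="+    if not os.path.exists(path): return defaults",
        label=1,
    ),
    IssueCommitPair(
        issue_summary="Crash when config file is missing",
        issue_description="The service exits with a traceback on startup.",
        commit_message="Update README badges",
        commit_diff="+![build](badge.svg)",
        label=0,
    ),
]

if __name__ == "__main__":
    for pair in EXAMPLE_PAIRS:
        issue_text, commit_text = to_text_pair(pair)
        print(pair.label, "|", issue_text.splitlines()[0], "|", commit_text.splitlines()[0])
    print(dict(label_counts(EXAMPLE_PAIRS)))
```

The two strings returned by `to_text_pair` could then be passed as the first and second segment to a tokenizer for a BERT-style model; the character-based truncation here is only a stand-in for token-level truncation.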
eu_rights_str_mv openAccess
id Manara_b2cd742f6c1dfd2452cec19da41f97db
identifier_str_mv 10.6084/m9.figshare.29293868.v2
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/29293868
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle Issue-Commit Traceability Datasets
Hanun Puspa (21525737)
Software engineering not elsewhere classified
traceability
issue
commit
status_str publishedVersion
title Issue-Commit Traceability Datasets
title_full Issue-Commit Traceability Datasets
title_fullStr Issue-Commit Traceability Datasets
title_full_unstemmed Issue-Commit Traceability Datasets
title_short Issue-Commit Traceability Datasets
title_sort Issue-Commit Traceability Datasets
topic Software engineering not elsewhere classified
traceability
issue
commit