Issue-Commit Traceability Datasets
Traceability dataset that consists of issue-commit pairs. This dataset contains a collection of issue–commit pairs from software projects, annotated for the presence or absence of traceability links. These links indicate whether a specific commit is asso...
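| Main Author: | Hanun Puspa |
|---|---|
| Other Authors: | Adhatus Solichah Ahmadiyah, Rizky Januar Akbar |
| Published in: | 2025 |
| Subjects: | Software engineering not elsewhere classified; traceability; issue; commit |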
| _version_ | 1852017427879034880 |
|---|---|
| author | Hanun Puspa (21525737) |
| author2 | Adhatus Solichah Ahmadiyah (14138976); Rizky Januar Akbar (22003280) |
| author2_role | author; author |
| author_facet | Hanun Puspa (21525737); Adhatus Solichah Ahmadiyah (14138976); Rizky Januar Akbar (22003280) |
| author_role | author |
| dc.creator.none.fl_str_mv | Hanun Puspa (21525737); Adhatus Solichah Ahmadiyah (14138976); Rizky Januar Akbar (22003280) |
| dc.date.none.fl_str_mv | 2025-08-20T23:01:53Z |
| dc.identifier.none.fl_str_mv | 10.6084/m9.figshare.29293868.v2 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/dataset/issue_commit_traceability_csv/29293868 |
| dc.rights.none.fl_str_mv | CC BY 4.0; info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Software engineering not elsewhere classified; traceability; issue; commit |
| dc.title.none.fl_str_mv | Issue-Commit Traceability Datasets |
| dc.type.none.fl_str_mv | Dataset; info:eu-repo/semantics/publishedVersion; dataset |
| description | Traceability dataset that consists of issue-commit pairs. This dataset contains a collection of issue–commit pairs from software projects, annotated for the presence or absence of traceability links. These links indicate whether a specific commit is associated with resolving a particular issue, as commonly tracked in systems such as GitHub, Jira, or Bugzilla. The dataset combines samples from three sources: (1) the LinkFormer Dataset (LinkFormer: Automatic Contextualised Link Recovery of Software Artifacts in a Cross-Project Setting, https://zenodo.org/records/6524460); (2) the 20-MAD Dataset (20-MAD: Mozilla Apache Dataset on OSF, https://osf.io/kvxr4/); (3) the Carikado Dataset (a manually curated small-scale dataset from the author's past projects). Each data entry includes metadata and textual features extracted from issues (summary, description, type, status, creation date) and commits (commit message, diff, file names, authoring date). Labels are binary: 1 indicates that a traceability link exists between the issue and the commit; 0 indicates no such link. This dataset was used to fine-tune several pretrained language models (e.g., BERT, RoBERTa) for binary classification and was further analyzed using Explainable AI techniques (LIME and SHAP) to interpret feature importance. |
| eu_rights_str_mv | openAccess |
| id | Manara_b2cd742f6c1dfd2452cec19da41f97db |
| identifier_str_mv | 10.6084/m9.figshare.29293868.v2 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/29293868 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| spellingShingle | Issue-Commit Traceability Datasets; Hanun Puspa (21525737); Software engineering not elsewhere classified; traceability; issue; commit |
| status_str | publishedVersion |
| title | Issue-Commit Traceability Datasets |
| title_full | Issue-Commit Traceability Datasets |
| title_fullStr | Issue-Commit Traceability Datasets |
| title_full_unstemmed | Issue-Commit Traceability Datasets |
| title_short | Issue-Commit Traceability Datasets |
| title_sort | Issue-Commit Traceability Datasets |
| topic | Software engineering not elsewhere classified; traceability; issue; commit |
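
The record describes each entry as an issue–commit pair carrying textual features and a binary label (1 for a traceability link, 0 for none). As a rough illustration of that shape, the Python sketch below reads such pairs and counts the labels. The column names (`issue_summary`, `issue_description`, `commit_message`, `label`) and the three sample rows are assumptions made up for this example; the actual CSV header in the Figshare deposit may differ and should be checked before reuse.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass

# Hypothetical sample: column names and rows are invented for illustration,
# they are NOT taken from the published dataset.
SAMPLE_CSV = """issue_summary,issue_description,commit_message,label
Fix login crash,App crashes when password is empty,Handle empty password in login form,1
Fix login crash,App crashes when password is empty,Update README badges,0
Add dark mode,Users want a dark theme,Introduce dark theme stylesheet,1
"""


@dataclass
class Pair:
    """One issue-commit pair with its binary traceability label."""
    issue_text: str
    commit_text: str
    label: int


def load_pairs(handle):
    """Read issue-commit pairs from an open CSV file-like object."""
    pairs = []
    for row in csv.DictReader(handle):
        label = int(row["label"])
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label}")
        # Join the issue summary and description into one text field.
        issue_text = f'{row["issue_summary"]} {row["issue_description"]}'.strip()
        pairs.append(Pair(issue_text, row["commit_message"], label))
    return pairs


def label_counts(pairs):
    """Count how many pairs are linked (1) and unlinked (0)."""
    return Counter(pair.label for pair in pairs)


pairs = load_pairs(io.StringIO(SAMPLE_CSV))
counts = label_counts(pairs)
print(f"linked={counts[1]} unlinked={counts[0]}")
```

To try this against the real file, one would replace `io.StringIO(SAMPLE_CSV)` with an opened copy of the downloaded CSV and adjust the column names to match its header. The two text fields of each `Pair` are the kind of input a sentence-pair classifier would take, though the record does not say how the authors encoded them.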