Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images
Synthetic data offers a compelling solution to the challenges associated with acquiring high-quality medical data, which is often constrained by privacy concerns and limited accessibility. This study explores the efficacy of synthetic data generated using diffusion model...
| Main Author: | |
|---|---|
| Other Authors: | |
| Published: | 2025 |
| Subjects: | |
| _version_ | 1864513534589665280 |
|---|---|
| author | Abdullah Hosseini (22466602) |
| author2 | Ahmed Serag (2945643) |
| author2_role | author |
| author_facet | Abdullah Hosseini (22466602); Ahmed Serag (2945643) |
| author_role | author |
| dc.creator.none.fl_str_mv | Abdullah Hosseini (22466602); Ahmed Serag (2945643) |
| dc.date.none.fl_str_mv | 2025-04-09T06:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1109/access.2025.3555619 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Self-Supervised_Learning_Powered_by_Synthetic_Data_From_Diffusion_Models_Application_to_X-Ray_Images/30405526 |
| dc.rights.none.fl_str_mv | CC BY 4.0; info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Health sciences; Health services and systems; Information and computing sciences; Artificial intelligence; Cybersecurity and privacy; Machine learning; Artificial intelligence; biomedical imaging; deep learning; diffusion probabilistic; self-supervised learning; synthetic data; Synthetic data; Data models; Training; Biological system modeling; Image segmentation; X-ray imaging; Diffusion processes; Diffusion models; Biomarkers; Medical diagnostic imaging |
| dc.title.none.fl_str_mv | Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images |
| dc.type.none.fl_str_mv | Text; Journal contribution; info:eu-repo/semantics/publishedVersion; text; contribution to journal |
| description | Synthetic data offers a compelling solution to the challenges associated with acquiring high-quality medical data, which is often constrained by privacy concerns and limited accessibility. This study explores the efficacy of synthetic data generated using diffusion models for training deep learning models within a self-supervised learning framework. The primary objective is to evaluate whether synthetic data can effectively preserve critical medical biomarkers and support reliable downstream tasks such as classification and segmentation. Using chest X-ray images as a case study, the results reveal that models pretrained on synthetic data achieve performance comparable to or surpassing that of models pretrained on real data. Specifically, in the pneumonia classification task, the model trained on synthetic data outperformed established benchmarks, achieving an Area Under the Curve of 99.1 and an F1-score of 96.1%. Similarly, for segmentation tasks, the model trained on synthetic data demonstrated robust performance, attaining a Dice score of 0.85. These findings underscore a significant advancement in the generation of synthetic medical images, providing a viable approach to creating realistic, biomarker-preserving datasets that ensure patient confidentiality and enable diverse applications in medical imaging. Other Information: Published in: IEEE Access. License: https://creativecommons.org/licenses/by/4.0/. See article on publisher's website: https://dx.doi.org/10.1109/access.2025.3555619 |
| eu_rights_str_mv | openAccess |
| id | Manara2_5c2996fd6334e966da5d3c4365c429bf |
| identifier_str_mv | 10.1109/access.2025.3555619 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/30405526 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images |
| title_full | Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images |
| title_fullStr | Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images |
| title_full_unstemmed | Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images |
| title_short | Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images |
| title_sort | Self-Supervised Learning Powered by Synthetic Data From Diffusion Models: Application to X-Ray Images |
| topic | Health sciences; Health services and systems; Information and computing sciences; Artificial intelligence; Cybersecurity and privacy; Machine learning; Artificial intelligence; biomedical imaging; deep learning; diffusion probabilistic; self-supervised learning; synthetic data; Synthetic data; Data models; Training; Biological system modeling; Image segmentation; X-ray imaging; Diffusion processes; Diffusion models; Biomarkers; Medical diagnostic imaging |