Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique

<p>Privacy is a significant issue that requires consideration in all applications. Data collected from various individuals and organizations must be disclosed to the public or private parties for analysis and research purposes. The collected data are studied and analyzed digitally for the extr...

Bibliographic Details
Main Author: J. Jayapradha (21323774) (author)
Other Authors: Ghaida Muttashar Abdulsahib (17541243) (author), Osamah Ibrahim Khalaf (17541255) (author), M. Prakash (1464505) (author), Mueen Uddin (4903510) (author), Maha Abdelhaq (735574) (author), Raed Alsaqour (735575) (author)
Published: 2024
_version_ 1864513548229541888
author J. Jayapradha (21323774)
author2 Ghaida Muttashar Abdulsahib (17541243)
Osamah Ibrahim Khalaf (17541255)
M. Prakash (1464505)
Mueen Uddin (4903510)
Maha Abdelhaq (735574)
Raed Alsaqour (735575)
author2_role author
author
author
author
author
author
author_facet J. Jayapradha (21323774)
Ghaida Muttashar Abdulsahib (17541243)
Osamah Ibrahim Khalaf (17541255)
M. Prakash (1464505)
Mueen Uddin (4903510)
Maha Abdelhaq (735574)
Raed Alsaqour (735575)
author_role author
dc.creator.none.fl_str_mv J. Jayapradha (21323774)
Ghaida Muttashar Abdulsahib (17541243)
Osamah Ibrahim Khalaf (17541255)
M. Prakash (1464505)
Mueen Uddin (4903510)
Maha Abdelhaq (735574)
Raed Alsaqour (735575)
dc.date.none.fl_str_mv 2024-06-13T03:00:00Z
dc.identifier.none.fl_str_mv 10.1016/j.eij.2024.100485
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Cluster-based_anonymity_model_and_algorithm_for_1_1_dataset_with_a_single_sensitive_attribute_using_machine_learning_technique/29022191
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Health sciences
Health services and systems
Human society
Development studies
Information and computing sciences
Applied computing
Cybersecurity and privacy
Data management and data science
Privacy-preserving
Semi-sensitive attribute
Fuzzy c-means clustering
Identity disclosure
Attribute disclosure
Membership disclosure
Data privacy and utility
dc.title.none.fl_str_mv Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p>Privacy is a significant issue that must be considered in all applications. Data collected from individuals and organizations must often be disclosed to public or private parties for analysis and research. The collected data are analyzed digitally to extract useful patterns for decision-making and research. Privacy-preserving data publishing matters because a privacy violation in a patient’s data may harm the individual’s reputation. An efficient cluster-based anonymity model is proposed to anonymize a 1:1 dataset with a single sensitive attribute by introducing the concept of a “semi-sensitive attribute.” Based on correlation, the attributes are categorized as quasi-identifiers or semi-sensitive attributes. k-anonymity is applied to the table of quasi-identifiers together with the semi-sensitive attributes, and fuzzy c-means clustering is used to fix a range of values for anonymizing the semi-sensitive attributes. Because the work focuses on a medical dataset, the disease is treated as the sensitive attribute. The proposed model is demonstrated to resist three privacy attacks: i) identity disclosure, ii) attribute disclosure, and iii) membership disclosure. Utility loss is calculated for each record, and the per-record losses are aggregated to give the total information loss for each attribute. The cluster-based anonymity model measured the utility loss for all attributes; the average utility loss for the anonymized patient dataset is 3.78%.</p><h2>Other Information</h2> <p> Published in: Egyptian Informatics Journal<br> License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1016/j.eij.2024.100485" target="_blank">https://dx.doi.org/10.1016/j.eij.2024.100485</a></p>
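The abstract names two building blocks: fuzzy c-means clustering to replace a semi-sensitive value with a range, and k-anonymity over the resulting table. The following is a minimal sketch of how those two ideas could fit together, not the authors' algorithm. The function names (`fuzzy_c_means_1d`, `generalize_to_ranges`, `is_k_anonymous`), the sample weights and age bands, the choice of two clusters, and the rule of taking each cluster's minimum and maximum as its published range are all assumptions made for illustration.

```python
import numpy as np
from collections import Counter

def fuzzy_c_means_1d(values, c=2, m=2.0, iters=100, tol=1e-6):
    """Minimal 1-D fuzzy c-means; returns (centers, membership matrix)."""
    x = np.asarray(values, dtype=float)
    # Deterministic start: spread the initial centers across the value range.
    centers = np.linspace(x.min(), x.max(), c)
    u = None
    for _ in range(iters):
        # Distance of every point to every center (epsilon avoids division by zero).
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12
        # Standard membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        ratio = d[:, :, None] / d[:, None, :]
        u = 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=2)
        # Center update: mean of the points weighted by u^m.
        w = u ** m
        new_centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        converged = np.max(np.abs(new_centers - centers)) < tol
        centers = new_centers
        if converged:
            break
    return centers, u

def generalize_to_ranges(values, c=2):
    """Replace each value with the (min, max) of its highest-membership cluster.

    Assumption: the published range is the observed span of the cluster.
    """
    x = np.asarray(values, dtype=float)
    _, u = fuzzy_c_means_1d(x, c=c)
    labels = u.argmax(axis=1)
    spans = {}
    for j in set(labels.tolist()):
        members = x[labels == j]
        spans[j] = (float(members.min()), float(members.max()))
    return [spans[j] for j in labels.tolist()]

def is_k_anonymous(rows, k):
    """True if every combination of published values occurs at least k times."""
    counts = Counter(tuple(r) for r in rows)
    return all(n >= k for n in counts.values())

# Hypothetical data: an already generalized quasi-identifier (age band) and a
# semi-sensitive numeric attribute (weight in kg).
age_bands = ["3*", "3*", "3*", "4*", "4*", "4*"]
weights = [50, 52, 55, 90, 95, 98]

weight_ranges = generalize_to_ranges(weights, c=2)
published = list(zip(age_bands, weight_ranges))
print(published)
print("3-anonymous:", is_k_anonymous(published, 3))
```

In this sketch, the fuzzy memberships are hardened by taking the highest-membership cluster for each record; a real implementation might instead use the membership degrees to decide how wide each range should be.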
eu_rights_str_mv openAccess
id Manara2_8c2c75ff14450f721f36b6eb9d4bc709
identifier_str_mv 10.1016/j.eij.2024.100485
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/29022191
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique
title_full Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique
title_fullStr Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique
title_full_unstemmed Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique
title_short Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique
title_sort Cluster-based anonymity model and algorithm for 1:1 dataset with a single sensitive attribute using machine learning technique
topic Health sciences
Health services and systems
Human society
Development studies
Information and computing sciences
Applied computing
Cybersecurity and privacy
Data management and data science
Privacy-preserving
Semi-sensitive attribute
Fuzzy c-means clustering
Identity disclosure
Attribute disclosure
Membership disclosure
Data privacy and utility