Centroid inter-cluster distance.

Bibliographic Details
Main Author: Paulo Fazendeiro (12233780) (author)
Other Authors: Paula Prata (12233777) (author), Maria Eugénia Ferrão (12233774) (author)
Published: 2025
_version_ 1852015945803890688
author Paulo Fazendeiro (12233780)
author2 Paula Prata (12233777)
Maria Eugénia Ferrão (12233774)
author2_role author
author
author_facet Paulo Fazendeiro (12233780)
Paula Prata (12233777)
Maria Eugénia Ferrão (12233774)
author_role author
dc.creator.none.fl_str_mv Paulo Fazendeiro (12233780)
Paula Prata (12233777)
Maria Eugénia Ferrão (12233774)
dc.date.none.fl_str_mv 2025-10-08T17:25:18Z
dc.identifier.none.fl_str_mv 10.1371/journal.pone.0332441.t002
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/Centroid_inter-cluster_distance_/30308359
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Ecology
Science Policy
Environmental Sciences not elsewhere classified
Biological Sciences not elsewhere classified
students
involving domain experts
dataset
subtle biases introduced
group categories related
driven policies aimed
prevent anonymization efforts
educational equity analysis
studies aimed
related research
introducing biases
educational equity
work investigates
using microdata
study concludes
study applies
sociodemographic variables
results reveal
research evaluates
minority groups
may jeopardise
finding highlights
equity studies
enade
economic inequalities
could undermine
careful attention
anonymized datasets
anonymization techniques
anonymization process
also lead
dc.title.none.fl_str_mv Centroid inter-cluster distance.
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description This work investigates the trade-off between data anonymization and utility, particularly focusing on the implications for equity-related research in education. Using microdata from the 2019 Brazilian National Student Performance Exam (ENADE), the study applies the (ε, δ)-Differential Privacy model to explore the impact of anonymization on the dataset’s utility for socio-educational equity analysis. By clustering both the original and anonymized datasets, the research evaluates how group categories related to students’ sociodemographic variables, such as gender, race, income, and parental education, are affected by the anonymization process. The results reveal that while anonymization techniques can preserve overall data structure, they can also lead to the suppression or misrepresentation of minority groups, introducing biases that may jeopardise the promotion of educational equity. This finding highlights the importance of involving domain experts in the interpretation of anonymized data, particularly in studies aimed at reducing socio-economic inequalities. The study concludes that careful attention is needed to prevent anonymization efforts from distorting key group categories, which could undermine the validity of data-driven policies aimed at promoting equity.
eu_rights_str_mv openAccess
id Manara_6cf9b2376b50bd2a9a6772bdf0c9dec4
identifier_str_mv 10.1371/journal.pone.0332441.t002
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/30308359
publishDate 2025
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Centroid inter-cluster distance.