Estimating Homophily in Social Networks Using Dyadic Predictions

Predictions of node categories are commonly used to estimate homophily and other relational properties in networks. However, little is known about the validity of using predictions for this task. We show that estimating homophily in a network is a problem of predicting c...

Bibliographic Details
Main Author: George Berry (5807582) (author)
Other Authors: Antonio Sirianni (18475086) (author), Ingmar Weber (690827) (author), Jisun An (10230800) (author), Michael Macy (4039937) (author)
Published: 2021
_version_ 1864513517639434240
author George Berry (5807582)
author2 Antonio Sirianni (18475086)
Ingmar Weber (690827)
Jisun An (10230800)
Michael Macy (4039937)
author2_role author
author
author
author
author_role author
dc.date.none.fl_str_mv 2021-08-02T15:00:00Z
dc.identifier.none.fl_str_mv 10.15195/v8.a14
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Estimating_Homophily_in_Social_Networks_Using_Dyadic_Predictions/25730298
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Distributed computing and systems software
Human-centred computing
Machine learning
homophily
networks
machine learning
quantitative methodology
dc.title.none.fl_str_mv Estimating Homophily in Social Networks Using Dyadic Predictions
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description Predictions of node categories are commonly used to estimate homophily and other relational properties in networks. However, little is known about the validity of using predictions for this task. We show that estimating homophily in a network is a problem of predicting categories of dyads (edges) in the graph. Homophily estimates are unbiased when predictions of dyad categories are unbiased. Node-level prediction models, such as the use of names to classify ethnicity or gender, do not generally produce unbiased predictions of dyad categories and therefore produce biased homophily estimates. Bias comes from three sources: sampling bias, correlation between model errors and node degree, and correlation between node-level model errors along dyads. We examine three methods for estimating homophily: predicting node categories, predicting dyad categories, and a hybrid “ego–alter” approach. This analysis indicates that only the dyadic prediction approach is unbiased, whereas the node-level approach produces both high bias and high overall error. We find that node-level classification performance is not a reliable indicator of accuracy for homophily. Although this article focuses on a particular version of homophily, results generalize to heterophilous cases and other dyadic measures. We conclude with suggestions for research design. Code for this article is available at https://github.com/georgeberry/autocorr.
Other Information
Published in: Sociological Science
License: https://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://doi.org/10.15195/v8.a14
eu_rights_str_mv openAccess
id Manara2_758131c3a59adc2eb1868b4683f74773
identifier_str_mv 10.15195/v8.a14
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/25730298
publishDate 2021
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Estimating Homophily in Social Networks Using Dyadic Predictions
topic Information and computing sciences
Distributed computing and systems software
Human-centred computing
Machine learning
homophily
networks
machine learning
quantitative methodology
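
The description's claim that node-level classification accuracy is not a reliable indicator of accuracy for homophily can be illustrated with a toy calculation. The sketch below is not taken from the article or its repository: the six-node graph, the labels, the helper names `same_category_share` and `dyad_error_count`, and the use of the share of same-category edges as a stand-in for a homophily measure are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def same_category_share(edges: List[Edge], labels: Dict[Hashable, int]) -> float:
    """Share of edges whose two endpoints carry the same label.

    Used here as a simple proxy for homophily (an assumption of this sketch).
    """
    if not edges:
        raise ValueError("need at least one edge")
    return sum(labels[u] == labels[v] for u, v in edges) / len(edges)


def dyad_error_count(
    edges: List[Edge],
    true_labels: Dict[Hashable, int],
    predicted_labels: Dict[Hashable, int],
) -> int:
    """Number of edges whose same/different status is wrong under the predictions."""
    return sum(
        (true_labels[u] == true_labels[v])
        != (predicted_labels[u] == predicted_labels[v])
        for u, v in edges
    )


# Hypothetical toy network: two triangles joined by one cross-group edge.
true_labels = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),   # group 0 triangle
    ("d", "e"), ("e", "f"), ("d", "f"),   # group 1 triangle
    ("c", "d"),                           # bridge between groups
]

# A node-level classifier that is wrong on a single node, "c".
predicted_labels = {**true_labels, "c": 1}

node_accuracy = sum(
    true_labels[n] == predicted_labels[n] for n in true_labels
) / len(true_labels)
true_share = same_category_share(edges, true_labels)
node_level_share = same_category_share(edges, predicted_labels)
wrong_dyads = dyad_error_count(edges, true_labels, predicted_labels)

print(f"node accuracy:           {node_accuracy:.3f}")
print(f"true same-category share: {true_share:.3f}")
print(f"node-level estimate:      {node_level_share:.3f}")
print(f"misclassified dyads:      {wrong_dyads} of {len(edges)}")
```

In this toy case one misclassified node out of six changes the same/different status of every edge it touches, so the error at the dyad level is much larger than the error at the node level. That is the intuition behind treating the dyad, rather than the node, as the unit of prediction.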