Estimating Homophily in Social Networks Using Dyadic Predictions
Predictions of node categories are commonly used to estimate homophily and other relational properties in networks. However, little is known about the validity of using predictions for this task. We show that estimating homophily in a network is a problem of predicting c...
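| Main author: | George Berry |
|---|---|
| Other authors: | Antonio Sirianni, Ingmar Weber, Jisun An, Michael Macy |
| Published in: | 2021 |
| Subjects: | Information and computing sciences, Distributed computing and systems software, Human-centred computing, Machine learning, homophily, networks, machine learning, quantitative methodology |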
| _version_ | 1864513517639434240 |
|---|---|
| author | George Berry (5807582) |
| author2 | Antonio Sirianni (18475086) Ingmar Weber (690827) Jisun An (10230800) Michael Macy (4039937) |
| author2_role | author author author author |
| author_facet | George Berry (5807582) Antonio Sirianni (18475086) Ingmar Weber (690827) Jisun An (10230800) Michael Macy (4039937) |
| author_role | author |
| dc.creator.none.fl_str_mv | George Berry (5807582) Antonio Sirianni (18475086) Ingmar Weber (690827) Jisun An (10230800) Michael Macy (4039937) |
| dc.date.none.fl_str_mv | 2021-08-02T15:00:00Z |
| dc.identifier.none.fl_str_mv | 10.15195/v8.a14 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Estimating_Homophily_in_Social_Networks_Using_Dyadic_Predictions/25730298 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Information and computing sciences Distributed computing and systems software Human-centred computing Machine learning homophily networks machine learning quantitative methodology |
| dc.title.none.fl_str_mv | Estimating Homophily in Social Networks Using Dyadic Predictions |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | <p dir="ltr">Predictions of node categories are commonly used to estimate homophily and other relational properties in networks. However, little is known about the validity of using predictions for this task. We show that estimating homophily in a network is a problem of predicting categories of dyads (edges) in the graph. Homophily estimates are unbiased when predictions of dyad categories are unbiased. Node-level prediction models, such as the use of names to classify ethnicity or gender, do not generally produce unbiased predictions of dyad categories and therefore produce biased homophily estimates. Bias comes from three sources: sampling bias, correlation between model errors and node degree, and correlation between node-level model errors along dyads. We examine three methods for estimating homophily: predicting node categories, predicting dyad categories, and a hybrid “ego–alter” approach. This analysis indicates that only the dyadic prediction approach is unbiased, whereas the node-level approach produces both high bias and high overall error. We find that node-level classification performance is not a reliable indicator of accuracy for homophily. Although this article focuses on a particular version of homophily, results generalize to heterophilous cases and other dyadic measures. We conclude with suggestions for research design. Code for this article is available at https://github.com/georgeberry/autocorr.</p><h2>Other Information</h2><p dir="ltr">Published in: Sociological Science<br>License: <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://doi.org/10.15195/v8.a14" target="_blank">https://doi.org/10.15195/v8.a14</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_758131c3a59adc2eb1868b4683f74773 |
| identifier_str_mv | 10.15195/v8.a14 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/25730298 |
| publishDate | 2021 |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Estimating Homophily in Social Networks Using Dyadic Predictions |
| topic | Information and computing sciences Distributed computing and systems software Human-centred computing Machine learning homophily networks machine learning quantitative methodology |
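
The abstract's claim that node-level predictions can give biased homophily estimates can be illustrated with a toy calculation. The sketch below is not the authors' code or analysis (their code is at the GitHub link in the description). It assumes two node categories, measures homophily as the share of edges joining same-category nodes, and assumes each endpoint is misclassified independently with a fixed accuracy. The function names and parameter values are hypothetical. Even under these favorable assumptions, which leave out the sampling bias, degree-correlated errors, and dyad-correlated errors named in the abstract, the plug-in estimate is pulled toward 0.5.

```python
import random


def expected_plugin_share(true_share: float, accuracy: float) -> float:
    """Expected share of edges labeled same-category when both endpoints
    are classified independently with the given accuracy (two categories)."""
    # Two predicted labels keep the dyad's true type (same or different)
    # when both predictions are right or both are wrong.
    preserve = accuracy ** 2 + (1 - accuracy) ** 2
    # Truly same-category edges stay "same" with probability `preserve`;
    # truly different edges flip to "same" with probability 1 - preserve.
    return true_share * preserve + (1 - true_share) * (1 - preserve)


def simulate_plugin_share(n_edges: int, true_share: float,
                          accuracy: float, seed: int = 0) -> float:
    """Monte Carlo version of the same toy setup (hypothetical parameters)."""
    rng = random.Random(seed)
    predicted_same = 0
    for _ in range(n_edges):
        ego = rng.randint(0, 1)
        # The alter shares the ego's category with probability true_share.
        alter = ego if rng.random() < true_share else 1 - ego
        # Node-level classifier: each label is correct with prob. accuracy.
        ego_hat = ego if rng.random() < accuracy else 1 - ego
        alter_hat = alter if rng.random() < accuracy else 1 - alter
        predicted_same += int(ego_hat == alter_hat)
    return predicted_same / n_edges


if __name__ == "__main__":
    print(round(expected_plugin_share(0.8, 0.9), 3))
    print(round(simulate_plugin_share(200_000, 0.8, 0.9), 3))
```

With a true same-category share of 0.8 and a node classifier that is 90% accurate, the expected plug-in estimate is about 0.692, so a classifier that looks strong at the node level still understates homophily in this toy setting. Classifying the dyad directly, as the abstract recommends, avoids compounding two node-level errors per edge.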