Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting

Given the ever increasing amount of publicly available social media data, there is growing interest in using online data to study and quantify phenomena in the offline “real” world. As social media data can be obtained in near real-time and at low cost, it is often used for “now-casting” in...

Bibliographic Details
Main Author: Jisun An (10230800) (author)
Other Authors: Ingmar Weber (149886) (author)
Published: 2015
Subjects:
_version_ 1864513557671968768
author Jisun An (10230800)
author2 Ingmar Weber (149886)
author2_role author
author_facet Jisun An (10230800)
Ingmar Weber (149886)
author_role author
dc.creator.none.fl_str_mv Jisun An (10230800)
Ingmar Weber (149886)
dc.date.none.fl_str_mv 2015-11-30T09:00:00Z
dc.identifier.none.fl_str_mv 10.1140/epjds/s13688-015-0058-9
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Whom_should_we_sense_in_social_sensing_-_analyzing_which_users_work_best_for_social_media_now-casting/27045013
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Data management and data science
Human-centred computing
Machine learning
nowcasting
sampling
social media
Twitter
prediction
unemployment rate
flu
dc.title.none.fl_str_mv Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p>Given the ever-increasing amount of publicly available social media data, there is growing interest in using online data to study and quantify phenomena in the offline “real” world. As social media data can be obtained in near real-time and at low cost, it is often used for “now-casting” indices such as levels of flu activity or unemployment. The term “social sensing” is often used in this context to describe the idea that users act as “sensors”, publicly reporting their health status or job losses. Sensor activity during a time period is then typically aggregated in a “one tweet, one vote” fashion by simply counting. At the same time, researchers readily admit that social media users are not a perfect representation of the actual population. Additionally, users differ in the amount of detail about their personal lives that they reveal. Intuitively, it should be possible to improve now-casting by assigning different weights to different user groups. In this paper, we ask “How does social sensing actually work?” or, more precisely, “Whom should we sense, and whom not, for optimal results?”. We investigate how different sampling strategies affect the performance of now-casting of two common offline indices: flu activity and unemployment rate. We show that now-casting can be improved by (1) applying user filtering techniques and (2) selecting users with complete profiles. We also find that, using the right type of user groups, now-casting performance does not degrade, even when drastically reducing the size of the dataset. More fundamentally, we describe which types of users contribute most to the accuracy by asking if “babblers are better”.
We conclude the paper by providing guidance on how to select better user groups for more accurate now-casting.</p><h2>Other Information</h2> <p> Published in: EPJ Data Science<br> License: <a href="http://creativecommons.org/licenses/by/4.0" target="_blank">http://creativecommons.org/licenses/by/4.0</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1140/epjds/s13688-015-0058-9" target="_blank">https://dx.doi.org/10.1140/epjds/s13688-015-0058-9</a></p>
eu_rights_str_mv openAccess
id Manara2_69031e5976e5307a455bb86cd2a185be
identifier_str_mv 10.1140/epjds/s13688-015-0058-9
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/27045013
publishDate 2015
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
Jisun An (10230800)
Information and computing sciences
Data management and data science
Human-centred computing
Machine learning
nowcasting
sampling
social media
Twitter
prediction
unemployment rate
flu
status_str publishedVersion
title Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
title_full Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
title_fullStr Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
title_full_unstemmed Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
title_short Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
title_sort Whom should we sense in “social sensing” - analyzing which users work best for social media now-casting
topic Information and computing sciences
Data management and data science
Human-centred computing
Machine learning
nowcasting
sampling
social media
Twitter
prediction
unemployment rate
flu