The illusion of data validity: Why numbers about people are likely wrong

This reflection article addresses a difficulty faced by scholars and practitioners working with numbers about people, which is that those who study people want numerical data about these people. Unfortunately, time and time again, this numerical data about people is wrong. Addressing the po...

Bibliographic details
Main author: Bernard J. Jansen (7434779) (author)
Other authors: Joni Salminen (7434770) (author), Soon-gyo Jung (7434773) (author), Hind Almerekhi (7434776) (author)
Published: 2022
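Subjects: Information and computing sciences; Library and information studies; People data; Measurement; Quantitative paradigm; Statistics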
_version_ 1864513519101149184
author Bernard J. Jansen (7434779)
author2 Joni Salminen (7434770)
Soon-gyo Jung (7434773)
Hind Almerekhi (7434776)
author2_role author
author
author
author_facet Bernard J. Jansen (7434779)
Joni Salminen (7434770)
Soon-gyo Jung (7434773)
Hind Almerekhi (7434776)
author_role author
dc.creator.none.fl_str_mv Bernard J. Jansen (7434779)
Joni Salminen (7434770)
Soon-gyo Jung (7434773)
Hind Almerekhi (7434776)
dc.date.none.fl_str_mv 2022-10-01T00:00:00Z
dc.identifier.none.fl_str_mv 10.1016/j.dim.2022.100020
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/The_illusion_of_data_validity_Why_numbers_about_people_are_likely_wrong/25658904
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Library and information studies
People data
Measurement
Quantitative paradigm
Statistics
dc.title.none.fl_str_mv The illusion of data validity: Why numbers about people are likely wrong
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p>This reflection article addresses a difficulty faced by scholars and practitioners working with numbers about people, which is that those who study people want numerical data about these people. Unfortunately, time and time again, this numerical data about people is wrong. Addressing the potential causes of this wrongness, we present examples of analyzing people numbers, i.e., numbers derived from digital data by or about people, and discuss the comforting illusion of data validity. We first lay a foundation by highlighting potential inaccuracies in collecting people data, such as selection bias. Then, we discuss inaccuracies in analyzing people data, such as the flaw of averages, followed by a discussion of errors that are made when trying to make sense of people data through techniques such as posterior labeling. Finally, we discuss a root cause of people data often being wrong – the conceptual conundrum of thinking the numbers are counts when they are actually measures. Practical solutions to address this illusion of data validity are proposed. The implications for theories derived from people data are also highlighted, namely that these people theories are generally wrong as they are often derived from people numbers that are wrong.</p><h2>Other Information</h2> <p> Published in: Data and Information Management<br> License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1016/j.dim.2022.100020" target="_blank">https://dx.doi.org/10.1016/j.dim.2022.100020</a></p>
eu_rights_str_mv openAccess
id Manara2_d2fcf002401900c2ab9d2b2d46332779
identifier_str_mv 10.1016/j.dim.2022.100020
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/25658904
publishDate 2022
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle The illusion of data validity: Why numbers about people are likely wrong
Bernard J. Jansen (7434779)
Information and computing sciences
Library and information studies
People data
Measurement
Quantitative paradigm
Statistics
status_str publishedVersion
title The illusion of data validity: Why numbers about people are likely wrong
title_full The illusion of data validity: Why numbers about people are likely wrong
title_fullStr The illusion of data validity: Why numbers about people are likely wrong
title_full_unstemmed The illusion of data validity: Why numbers about people are likely wrong
title_short The illusion of data validity: Why numbers about people are likely wrong
title_sort The illusion of data validity: Why numbers about people are likely wrong
topic Information and computing sciences
Library and information studies
People data
Measurement
Quantitative paradigm
Statistics