StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic

In this paper, we present StEduCov, an annotated dataset for the analysis of stances toward online education during the COVID-19 pandemic. StEduCov consists of 16,572 tweets gathered over 15 months, from March 2020 to May 2021, using the Twitter API. The tweets were manu...

Bibliographic Details
Main Author: Omama Hamad (21363476) (author)
Other Authors: Ali Hamdi (13432680) (author), Sayed Hamdi (21363479) (author), Khaled Shaban (20074425) (author)
Published: 2022
_version_ 1864513548270436352
author Omama Hamad (21363476)
author2 Ali Hamdi (13432680)
Sayed Hamdi (21363479)
Khaled Shaban (20074425)
author2_role author
author
author
author_facet Omama Hamad (21363476)
Ali Hamdi (13432680)
Sayed Hamdi (21363479)
Khaled Shaban (20074425)
author_role author
dc.creator.none.fl_str_mv Omama Hamad (21363476)
Ali Hamdi (13432680)
Sayed Hamdi (21363479)
Khaled Shaban (20074425)
dc.date.none.fl_str_mv 2022-08-22T09:00:00Z
dc.identifier.none.fl_str_mv 10.3390/bdcc6030088
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/StEduCov_An_Explored_and_Benchmarked_Dataset_on_Stance_Detection_in_Tweets_towards_Online_Education_during_COVID-19_Pandemic/29069558
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Education
Curriculum and pedagogy
Education systems
Information and computing sciences
Human-centred computing
Machine learning
text classification
stance detection
deep learning
transfer learning
COVID-19 pandemic
dc.title.none.fl_str_mv StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description In this paper, we present StEduCov, an annotated dataset for analyzing stances toward online education during the COVID-19 pandemic. StEduCov consists of 16,572 tweets gathered over 15 months, from March 2020 to May 2021, using the Twitter API. The tweets were manually annotated as agree, disagree, or neutral. We benchmarked the dataset using state-of-the-art and traditional machine learning models. Specifically, we trained deep learning models (bidirectional encoder representations from transformers, long short-term memory, convolutional neural networks, attention-based biLSTM, and Naive Bayes SVM) in addition to naive Bayes, logistic regression, support vector machines, decision trees, K-nearest neighbor, and random forest. In 10-fold cross-validation, the average accuracy of these models ranged from 75% to 84.8% for binary stance classification and from 52.6% to 68% for multi-class stance classification. Performance was affected by high vocabulary overlap between classes and by unreliable transfer learning when deep models pre-trained on general text were applied to specific domains such as COVID-19 and distance education.
Other Information
Published in: Big Data and Cognitive Computing
License: https://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://dx.doi.org/10.3390/bdcc6030088
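The description above reports average accuracy over 10-fold cross-validation. The record gives no implementation details, so the following is only a minimal pure-Python sketch of that evaluation loop. The toy tweets, the function names, the contiguous fold split, and the Laplace-smoothed multinomial naive Bayes classifier are all assumptions made for illustration; they are not the authors' data, code, or model settings.

```python
import math
from collections import Counter, defaultdict


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_nb(texts, labels):
    """Count classes and per-class word frequencies for multinomial naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest Laplace-smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(texts, labels, k=10):
    """Average accuracy over k folds; assumes len(texts) >= k."""
    accuracies = []
    for fold in k_fold_indices(len(texts), k):
        held_out = set(fold)
        train_x = [t for i, t in enumerate(texts) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        model = train_nb(train_x, train_y)
        correct = sum(predict_nb(model, texts[i]) == labels[i] for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / len(accuracies)


# Hypothetical toy data (NOT from StEduCov): 10 "agree" and 10 "disagree" tweets,
# interleaved so that every contiguous fold contains one of each class.
texts, labels = [], []
for _ in range(10):
    texts.append("online learning is great and flexible")
    labels.append("agree")
    texts.append("online learning is stressful and unfair")
    labels.append("disagree")

mean_accuracy = cross_validate(texts, labels, k=10)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

On real tweets the folds would normally be shuffled or stratified by class first; the contiguous split here is kept only to make the sketch deterministic.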
eu_rights_str_mv openAccess
id Manara2_751c7048e99950db9c778a794f4d7744
identifier_str_mv 10.3390/bdcc6030088
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/29069558
publishDate 2022
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic
topic Education
Curriculum and pedagogy
Education systems
Information and computing sciences
Human-centred computing
Machine learning
text classification
stance detection
deep learning
transfer learning
COVID-19 pandemic