StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic
In this paper, we present StEduCov, an annotated dataset for the analysis of stances toward online education during the COVID-19 pandemic. StEduCov consists of 16,572 tweets gathered over 15 months, from March 2020 to May 2021, using the Twitter API. The tweets were manu...
| Main Author: | Omama Hamad |
|---|---|
| Other Authors: | Ali Hamdi, Sayed Hamdi, Khaled Shaban |
| Published: | 2022 |
| Subjects: | |
| _version_ | 1864513548270436352 |
|---|---|
| author | Omama Hamad (21363476) |
| author2 | Ali Hamdi (13432680) Sayed Hamdi (21363479) Khaled Shaban (20074425) |
| author2_role | author author author |
| author_facet | Omama Hamad (21363476) Ali Hamdi (13432680) Sayed Hamdi (21363479) Khaled Shaban (20074425) |
| author_role | author |
| dc.creator.none.fl_str_mv | Omama Hamad (21363476) Ali Hamdi (13432680) Sayed Hamdi (21363479) Khaled Shaban (20074425) |
| dc.date.none.fl_str_mv | 2022-08-22T09:00:00Z |
| dc.identifier.none.fl_str_mv | 10.3390/bdcc6030088 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/StEduCov_An_Explored_and_Benchmarked_Dataset_on_Stance_Detection_in_Tweets_towards_Online_Education_during_COVID-19_Pandemic/29069558 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Education Curriculum and pedagogy Education systems Information and computing sciences Human-centred computing Machine learning text classification stance detection deep learning transfer learning COVID-19 pandemic |
| dc.title.none.fl_str_mv | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | <p dir="ltr">In this paper, we present StEduCov, an annotated dataset for the analysis of stances toward online education during the COVID-19 pandemic. StEduCov consists of 16,572 tweets gathered over 15 months, from March 2020 to May 2021, using the Twitter API. The tweets were manually annotated into the classes agree, disagree or neutral. We performed benchmarking on the dataset using state-of-the-art and traditional machine learning models. Specifically, we trained deep learning models (bidirectional encoder representations from transformers, long short-term memory, convolutional neural networks, attention-based biLSTM and Naive Bayes SVM) in addition to naive Bayes, logistic regression, support vector machines, decision trees, K-nearest neighbor and random forest. The average accuracy in the 10-fold cross-validation of these models ranged from 75% to 84.8% and from 52.6% to 68% for binary and multi-class stance classifications, respectively. Performance was affected by high vocabulary overlaps between classes and unreliable transfer learning using deep models pre-trained on general texts in relation to specific domains such as COVID-19 and distance education.</p><h2>Other Information</h2><p dir="ltr">Published in: Big Data and Cognitive Computing<br>License: <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.3390/bdcc6030088" target="_blank">https://dx.doi.org/10.3390/bdcc6030088</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_751c7048e99950db9c778a794f4d7744 |
| identifier_str_mv | 10.3390/bdcc6030088 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/29069558 |
| publishDate | 2022 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| spelling | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic Omama Hamad (21363476) Ali Hamdi (13432680) Sayed Hamdi (21363479) Khaled Shaban (20074425) Education Curriculum and pedagogy Education systems Information and computing sciences Human-centred computing Machine learning text classification stance detection deep learning transfer learning COVID-19 pandemic <p dir="ltr">In this paper, we present StEduCov, an annotated dataset for the analysis of stances toward online education during the COVID-19 pandemic. StEduCov consists of 16,572 tweets gathered over 15 months, from March 2020 to May 2021, using the Twitter API. The tweets were manually annotated into the classes agree, disagree or neutral. We performed benchmarking on the dataset using state-of-the-art and traditional machine learning models. Specifically, we trained deep learning models (bidirectional encoder representations from transformers, long short-term memory, convolutional neural networks, attention-based biLSTM and Naive Bayes SVM) in addition to naive Bayes, logistic regression, support vector machines, decision trees, K-nearest neighbor and random forest. The average accuracy in the 10-fold cross-validation of these models ranged from 75% to 84.8% and from 52.6% to 68% for binary and multi-class stance classifications, respectively. Performance was affected by high vocabulary overlaps between classes and unreliable transfer learning using deep models pre-trained on general texts in relation to specific domains such as COVID-19 and distance education.</p><h2>Other Information</h2><p dir="ltr">Published in: Big Data and Cognitive Computing<br>License: <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.3390/bdcc6030088" target="_blank">https://dx.doi.org/10.3390/bdcc6030088</a></p> 2022-08-22T09:00:00Z Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal 10.3390/bdcc6030088 https://figshare.com/articles/journal_contribution/StEduCov_An_Explored_and_Benchmarked_Dataset_on_Stance_Detection_in_Tweets_towards_Online_Education_during_COVID-19_Pandemic/29069558 CC BY 4.0 info:eu-repo/semantics/openAccess oai:figshare.com:article/29069558 2022-08-22T09:00:00Z |
| spellingShingle | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic Omama Hamad (21363476) Education Curriculum and pedagogy Education systems Information and computing sciences Human-centred computing Machine learning text classification stance detection deep learning transfer learning COVID-19 pandemic |
| status_str | publishedVersion |
| title | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic |
| title_full | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic |
| title_fullStr | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic |
| title_full_unstemmed | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic |
| title_short | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic |
| title_sort | StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic |
| topic | Education Curriculum and pedagogy Education systems Information and computing sciences Human-centred computing Machine learning text classification stance detection deep learning transfer learning COVID-19 pandemic |
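
The description above reports average accuracy over 10-fold cross-validation. As a rough illustration of that evaluation scheme only, the following sketch averages per-fold accuracy for a simplified multinomial naive Bayes classifier. It is not the authors' pipeline: the function names (`k_fold_indices`, `train_nb`, `predict_nb`, `cross_validate`) are hypothetical, and the example sentences are invented placeholders, not tweets from StEduCov.

```python
import math
from collections import Counter, defaultdict


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def train_nb(texts, labels):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(texts, labels, k=10):
    """Average per-fold accuracy; assumes len(texts) >= k so no fold is empty."""
    accuracies = []
    for train, test in k_fold_indices(len(texts), k):
        model = train_nb([texts[i] for i in train], [labels[i] for i in train])
        correct = sum(predict_nb(model, texts[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Invented toy sentences (assumption: NOT taken from the StEduCov dataset).
AGREE = [
    "online learning is great",
    "i love remote classes",
    "distance education works well",
    "virtual school is flexible and great",
    "remote classes save time",
]
DISAGREE = [
    "online learning is awful",
    "i hate remote classes",
    "distance education fails students",
    "virtual school is boring and awful",
    "remote classes waste time",
]

texts, labels = [], []
for a, d in zip(AGREE * 2, DISAGREE * 2):
    texts += [a, d]
    labels += ["agree", "disagree"]

mean_accuracy = cross_validate(texts, labels, k=10)
print(f"mean 10-fold accuracy on toy data: {mean_accuracy:.3f}")
```

The folds here are contiguous and unshuffled to keep the sketch short. A real evaluation would normally shuffle and stratify by class, and would add the neutral class for the multi-class setting the description mentions.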