Numeric features of the dataset.
Through the advancement of the contemporary web and the rapid adoption of social media platforms such as YouTube, Twitter, and Facebook, life has become much easier when dealing with certain highly personal problems. The far-reaching consequences of online harassment...
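| Main Author: | Khandaker Mohammad Mohi Uddin |
|---|---|
| Other Authors: | Hasibul Hamim, Mst. Nishat Tasnim Mim, Arnisha Akhter, Md Ashraf Uddin |
| Published: | 2024 |
| Subjects: | Biological Sciences not elsewhere classified; Information Systems not elsewhere classified |
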
| _version_ | 1852026183691010048 |
|---|---|
| author | Khandaker Mohammad Mohi Uddin (13124356) |
| author2 | Hasibul Hamim (19797121); Mst. Nishat Tasnim Mim (19797124); Arnisha Akhter (16323159); Md Ashraf Uddin (14855902) |
| author2_role | author; author; author; author |
| author_facet | Khandaker Mohammad Mohi Uddin (13124356); Hasibul Hamim (19797121); Mst. Nishat Tasnim Mim (19797124); Arnisha Akhter (16323159); Md Ashraf Uddin (14855902) |
| author_role | author |
| dc.creator.none.fl_str_mv | Khandaker Mohammad Mohi Uddin (13124356); Hasibul Hamim (19797121); Mst. Nishat Tasnim Mim (19797124); Arnisha Akhter (16323159); Md Ashraf Uddin (14855902) |
| dc.date.none.fl_str_mv | 2024-10-03T17:26:54Z |
| dc.identifier.none.fl_str_mv | 10.1371/journal.pone.0308862.t002 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/dataset/Numeric_features_of_the_dataset_/27162637 |
| dc.rights.none.fl_str_mv | CC BY 4.0; info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Biological Sciences not elsewhere classified; Information Systems not elsewhere classified; using mathematical visualization; stochastic gradient descent; social media platforms; safeguard psychological wellness; reverse document frequency; online bullying could; natural language processing; free online environment; eliminate online harassment; combined two sets; become much easier; deep neural networks; binary classification model; categorize bengali comments; bidirectional long short; bengali comments; long short; hybrid model; deep learning; writing aims; term memory; reaching consequences; rapid adoption; nearest neighbors; machine learning; layer perceptron; large number; large amount; label class; f1 score; evaluation accuracy; earlier stage; different points; detected quickly; count vectorizer; contemporary web; based approach |
| dc.title.none.fl_str_mv | Numeric features of the dataset. |
| dc.type.none.fl_str_mv | Dataset; info:eu-repo/semantics/publishedVersion; dataset |
| description | <div><p>Through the advancement of the contemporary web and the rapid adoption of social media platforms such as YouTube, Twitter, and Facebook, life has become much easier when dealing with certain highly personal problems. The far-reaching consequences of online harassment require immediate preventative steps to safeguard psychological wellness and scholarly achievement via detection at an earlier stage. This paper aims to eliminate online harassment and create a criticism-free online environment. In the paper, we have used a variety of attributes to evaluate a large number of Bengali comments. We feed cleansed data to machine learning (ML) methods using natural language processing techniques, namely term frequency-inverse document frequency (TF-IDF) with a count vectorizer. In addition, we used tokenization with padding to feed our deep learning (DL) models. Using mathematical visualization and natural language processing, online bullying could be detected quickly. Multi-layer Perceptron (MLP), K-Nearest Neighbors (K-NN), Extreme Gradient Boosting (XGBoost), Adaptive Boosting Classifier (AdaBoost), Logistic Regression Classifier (LR), Random Forest Classifier (RF), Bagging Classifier, Stochastic Gradient Descent (SGD), Voting Classifier, and Stacking are employed in the research we conducted. We expanded our investigation to include different DL frameworks: Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Convolutional-Long Short-Term Memory (C-LSTM), and Bidirectional Long Short-Term Memory (BiLSTM) are all implemented. A large amount of data is required to precisely recognize harassing behavior. To rapidly recognize written internet harassment, we combined two sets of data, producing 94,000 Bengali comments from different points of view. After evaluating the ML and DL models, we find that a hybrid model (MLP+SGD+LR) performed more effectively than the other models: its evaluation accuracy is 99.34%, precision is 99.34%, recall is 99.33%, and F1 score is 99.34% for multi-label classification. For the binary classification model, we obtained an accuracy of 99.41%.</p></div> |
| eu_rights_str_mv | openAccess |
| id | Manara_523bfd1fcd53f7436cde416378ef2999 |
| identifier_str_mv | 10.1371/journal.pone.0308862.t002 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/27162637 |
| publishDate | 2024 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Numeric features of the dataset. |
| topic | Biological Sciences not elsewhere classified; Information Systems not elsewhere classified; using mathematical visualization; stochastic gradient descent; social media platforms; safeguard psychological wellness; reverse document frequency; online bullying could; natural language processing; free online environment; eliminate online harassment; combined two sets; become much easier; deep neural networks; binary classification model; categorize bengali comments; bidirectional long short; bengali comments; long short; hybrid model; deep learning; writing aims; term memory; reaching consequences; rapid adoption; nearest neighbors; machine learning; layer perceptron; large number; large amount; label class; f1 score; evaluation accuracy; earlier stage; different points; detected quickly; count vectorizer; contemporary web; based approach |
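
The description in this record mentions TF-IDF features for the ML models and tokenization with padding for the DL models. The following is a minimal sketch of both ideas in plain Python, not the authors' pipeline. It assumes whitespace tokenization, the unsmoothed textbook weighting tf × ln(N / df), and a hypothetical toy English corpus. The record does not state which TF-IDF variant, library, vocabulary size, or sequence length was actually used, and the helper names (`tokenize`, `tf_idf`, `encode`, `pad_sequences`) are illustrative only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def tf_idf(corpus):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = ln(N / df), the plain
    textbook form. Library implementations often add smoothing, so their
    numbers will differ from these.
    """
    docs = [tokenize(text) for text in corpus]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def encode(corpus):
    """Map tokens to integer ids starting at 1 (0 is reserved for padding)."""
    vocab = {}
    encoded = []
    for text in corpus:
        ids = [vocab.setdefault(token, len(vocab) + 1) for token in tokenize(text)]
        encoded.append(ids)
    return encoded, vocab


def pad_sequences(sequences, max_len, pad_id=0):
    """Truncate or right-pad each integer sequence to exactly max_len."""
    return [seq[:max_len] + [pad_id] * (max_len - len(seq)) for seq in sequences]


# Hypothetical toy comments, not taken from the dataset.
corpus = ["you are kind", "you are rude", "kind words help"]
weights = tf_idf(corpus)
encoded, vocab = encode(corpus)
padded = pad_sequences(encoded, max_len=4)
```

In this sketch a term that appears in only one comment (such as "rude") receives a larger weight than a term shared by several comments (such as "you"), which is the property that makes TF-IDF useful as a classifier input. Padding gives every encoded comment the same length so the comments can be batched for a neural model.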