Numeric features of the dataset.

Bibliographic Details
Main Author: Khandaker Mohammad Mohi Uddin (13124356) (author)
Other Authors: Hasibul Hamim (19797121) (author), Mst. Nishat Tasnim Mim (19797124) (author), Arnisha Akhter (16323159) (author), Md Ashraf Uddin (14855902) (author)
Published: 2024
_version_ 1852026183691010048
author Khandaker Mohammad Mohi Uddin (13124356)
author2 Hasibul Hamim (19797121)
Mst. Nishat Tasnim Mim (19797124)
Arnisha Akhter (16323159)
Md Ashraf Uddin (14855902)
author2_role author
author
author
author
author_facet Khandaker Mohammad Mohi Uddin (13124356)
Hasibul Hamim (19797121)
Mst. Nishat Tasnim Mim (19797124)
Arnisha Akhter (16323159)
Md Ashraf Uddin (14855902)
author_role author
dc.creator.none.fl_str_mv Khandaker Mohammad Mohi Uddin (13124356)
Hasibul Hamim (19797121)
Mst. Nishat Tasnim Mim (19797124)
Arnisha Akhter (16323159)
Md Ashraf Uddin (14855902)
dc.date.none.fl_str_mv 2024-10-03T17:26:54Z
dc.identifier.none.fl_str_mv 10.1371/journal.pone.0308862.t002
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/Numeric_features_of_the_dataset_/27162637
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Biological Sciences not elsewhere classified
Information Systems not elsewhere classified
using mathematical visualization
stochastic gradient descent
social media platforms
safeguard psychological wellness
reverse document frequency
online bullying could
natural language processing
free online environment
eliminate online harassment
combined two sets
become much easier
cnn ), convolutional
deep neural networks
binary classification model
categorize bengali comments
000 bengali comments
bidirectional long short
34 %, precision
mlp ), k
bengali comments
long short
lstm ),
hybrid model
deep learning
33 %,
writing aims
term memory
reaching consequences
rapid adoption
producing 94
nearest neighbors
machine learning
layer perceptron
large number
large amount
label class
f1 score
evaluation accuracy
earlier stage
different points
detected quickly
count vectorizer
contemporary web
based approach
dc.title.none.fl_str_mv Numeric features of the dataset.
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description <div><p>With the advancement of the contemporary web and the rapid adoption of social media platforms such as YouTube, Twitter, and Facebook, life has become much easier when dealing with certain highly personal problems. The far-reaching consequences of online harassment require immediate preventative steps, through detection at an earlier stage, to safeguard psychological wellness and scholarly achievement. This paper aims to eliminate online harassment and create a criticism-free online environment. We used a variety of attributes to evaluate a large number of Bengali comments. We fed the cleaned data to machine learning (ML) methods and natural language processing techniques after vectorizing it using term frequency-inverse document frequency (TF-IDF) with a count vectorizer. In addition, we used tokenization with padding to feed our deep learning (DL) models. Using mathematical visualization and natural language processing, online bullying can be detected quickly. The ML methods employed in this research are Multi-layer Perceptron (MLP), K-Nearest Neighbors (K-NN), Extreme Gradient Boosting (XGBoost), Adaptive Boosting Classifier (AdaBoost), Logistic Regression Classifier (LR), Random Forest Classifier (RF), Bagging Classifier, Stochastic Gradient Descent (SGD), Voting Classifier, and Stacking. We expanded our investigation to include several DL architectures: Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Convolutional-Long Short-Term Memory (C-LSTM), and Bidirectional Long Short-Term Memory (BiLSTM). A large amount of data is required to recognize harassing behavior precisely. To that end, we combined two sets of data, producing 94,000 Bengali comments from different points of view. After evaluating the ML and DL models, we found that a hybrid model (MLP+SGD+LR) outperformed the other models: on the multi-label classification task, its accuracy is 99.34%, precision is 99.34%, recall is 99.33%, and F1 score is 99.34%. For the binary classification model, we obtained an accuracy of 99.41%.</p></div>
eu_rights_str_mv openAccess
id Manara_523bfd1fcd53f7436cde416378ef2999
identifier_str_mv 10.1371/journal.pone.0308862.t002
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/27162637
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Numeric features of the dataset.
title_full Numeric features of the dataset.
title_fullStr Numeric features of the dataset.
title_full_unstemmed Numeric features of the dataset.
title_short Numeric features of the dataset.
title_sort Numeric features of the dataset.