Text-based framework for spam detection in Twitter. (c2017)

Owing to Twitter's widespread popularity and its ability to carry messages into sparse communities, spammers readily exploit Twitter to spread their commercial messages. Moreover, different spammers behave in different ways. Some of them adopt behavioral approaches; oth...

Bibliographic Details
Main Author:	Halawi, Bahia M. (author)
Format:	masterThesis
Published:	2017
Subjects:	Lebanese American University -- Dissertations; Dissertations, Academic; Spam filtering (Electronic mail); Twitter; Ontologies (Information retrieval); Spam (Electronic mail)
Online Access:	http://hdl.handle.net/10725/6553 https://doi.org/10.26756/th.2017.21 http://libraries.lau.edu.lb/research/laur/terms-of-use/thesis.php

_version_	1864513480222048256
author	Halawi, Bahia M.
author_facet	Halawi, Bahia M.
author_role	author
dc.creator.none.fl_str_mv	Halawi, Bahia M.
dc.date.none.fl_str_mv	2017-11-08T10:20:12Z 2017-11-08T10:20:12Z 2017 2017-11-08 2017-05-11
dc.identifier.none.fl_str_mv	http://hdl.handle.net/10725/6553 https://doi.org/10.26756/th.2017.21 http://libraries.lau.edu.lb/research/laur/terms-of-use/thesis.php
dc.language.none.fl_str_mv	en
dc.publisher.none.fl_str_mv	Lebanese American University
dc.rights.*.fl_str_mv	info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv	Lebanese American University -- Dissertations; Dissertations, Academic; Spam filtering (Electronic mail); Twitter; Ontologies (Information retrieval); Spam (Electronic mail)
dc.title.none.fl_str_mv	Text-based framework for spam detection in Twitter. (c2017)
dc.type.none.fl_str_mv	Thesis info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis
description	Owing to Twitter's widespread popularity and its ability to carry messages into sparse communities, spammers readily exploit Twitter to spread their commercial messages. Moreover, different spammers behave in different ways: some adopt behavioral approaches, others make use of content entropy, and many others explore bait behaviors. Previous related work approaches this problem by studying a tweet along with its metadata, performing various statistical and profiling activities in order to infer whether it is spam. However, these approaches overlook the limitations placed on Twitter's streaming API, which restrict users' ability to extract follower and followee data. Also, many of the approaches violate user privacy by investigating users' personal data without prior consent. This thesis studies the relationship between tweets shared by different users, in particular between content considered spam and content considered legitimate. We overcome the above-mentioned limitations by developing a set of message-to-message analysis approaches. First, we deploy cosine vector similarity, and later the Natural Language Toolkit and a co-occurrence model, to improve detection correctness. However, because spammers are creative in building organic messages that hardly resemble older messages, these models suffer from limitations. We therefore elaborate the use of ontologies for detecting spam on Twitter during events. Our experimental results demonstrate the efficiency of analyzing spam content and semantic relationships on Twitter through ontologies.
eu_rights_str_mv	openAccess
format	masterThesis
id	LAURepo_d36bf4c944f269c135ca55708e89587e
language_invalid_str_mv	en
network_acronym_str	LAURepo
network_name_str	Lebanese American University repository
oai_identifier_str	oai:laur.lau.edu.lb:10725/6553
publishDate	2017
publisher.none.fl_str_mv	Lebanese American University
spelling	1 hard copy: xii, 78 leaves; 30 cm. Available at RNL. Bibliography: leaves 75-78. 2021-03-19T10:03:27Z
status_str	publishedVersion
title	Text-based framework for spam detection in Twitter. (c2017)
topic	Lebanese American University -- Dissertations; Dissertations, Academic; Spam filtering (Electronic mail); Twitter; Ontologies (Information retrieval); Spam (Electronic mail)
url	http://hdl.handle.net/10725/6553 https://doi.org/10.26756/th.2017.21 http://libraries.lau.edu.lb/research/laur/terms-of-use/thesis.php
