Clustering Tweets to Discover Trending Topics about دبي (Dubai)

Bibliographic Details
Main Author: ALYALYALI, SALAMA KHAMIS SALEM KHAMIS (author)
Published: 2018
Subjects: Twitter; Arabic tweets; K-mean clustering; TF-IDF; cosine similarity
Online Access: http://bspace.buid.ac.ae/handle/1234/1196
_version_ 1862980611249012736
author ALYALYALI, SALAMA KHAMIS SALEM KHAMIS
author_facet ALYALYALI, SALAMA KHAMIS SALEM KHAMIS
author_role author
dc.creator.none.fl_str_mv ALYALYALI, SALAMA KHAMIS SALEM KHAMIS
dc.date.none.fl_str_mv 2018-09-11T10:17:03Z
2018-09-11T10:17:03Z
2018-03
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv 2015228094
http://bspace.buid.ac.ae/handle/1234/1196
dc.language.none.fl_str_mv en
dc.publisher.none.fl_str_mv The British University in Dubai (BUiD)
dc.subject.none.fl_str_mv Twitter
Arabic tweets
K-mean clustering
TF-IDF
cosine similarity
dc.title.none.fl_str_mv Clustering Tweets to Discover Trending Topics about دبي (Dubai)
dc.type.none.fl_str_mv Dissertation
description Many people now turn to social networks to learn about trending topics and news amid the huge volume of text posted there daily. One of these networks is Twitter, a microblogging hub and a rich source of data. Scanning tweets online is a hard task, and searching a huge amount of data for an intended topic is time consuming. This paper proposes a solution: collecting tweets that mention دبي (Dubai) using the Zapier website and storing them in a Google Sheet, then building a word vector for the tweets using the TF-IDF method. The results are then fed into the k-means clustering algorithm, with cosine similarity used to measure the similarity between the objects in each cluster. The results demonstrate that internal evaluation techniques failed to evaluate the quality of the clusters. In addition, interesting topics about دبي (Dubai) were found. Moreover, better results were achieved with Filter Tokens (by Region) than without it. The data for the experiment were collected over several periods to ensure that the most trending topics about دبي (Dubai) were captured. All results reported in this paper were tested with real tweets.
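The pipeline named in the description (TF-IDF word vectors, then k-means clustering with cosine similarity) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the token lists are hypothetical English placeholders rather than collected tweets, the smoothed IDF formula and the fixed seed indices are assumptions chosen to keep the example deterministic, and the function names are invented for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one unit-length sparse TF-IDF vector (a dict) per tokenised document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption): log((1 + n) / (1 + df)) + 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        vectors.append(normalise(vec))
    return vectors


def normalise(vec):
    """Scale a sparse vector to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans_cosine(vectors, seeds, iterations=20):
    """K-means on unit vectors, assigning each to its most cosine-similar centroid.

    `seeds` holds the indices of the vectors used as initial centroids.
    """
    centroids = [dict(vectors[i]) for i in seeds]
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            total = Counter()
            for v, label in zip(vectors, labels):
                if label == k:
                    for t, w in v.items():
                        total[t] += w
            if total:  # an empty cluster keeps its previous centroid
                centroids[k] = normalise(total)
    return labels


# Hypothetical, already-tokenised placeholder "tweets".
docs = [
    ["dubai", "expo", "tickets"],
    ["expo", "dubai", "pavilion"],
    ["dubai", "metro", "delay"],
    ["metro", "delay", "station"],
]
vectors = tfidf_vectors(docs)
labels = kmeans_cosine(vectors, seeds=[0, 2])
print(labels)
```

Because every vector is scaled to unit length, the dot product equals the cosine similarity, so assigning each tweet to the centroid with the highest dot product groups tweets by shared weighted vocabulary rather than by length.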
id budr_4b6b620f54225c51d085e1e0bc7c6e41
identifier_str_mv 2015228094
language_invalid_str_mv en
network_acronym_str budr
network_name_str The British University in Dubai repository
oai_identifier_str oai:bspace.buid.ac.ae:1234/1196
publishDate 2018
publisher.none.fl_str_mv The British University in Dubai (BUiD)
title Clustering Tweets to Discover Trending Topics about دبي (Dubai)
topic Twitter
Arabic tweets
K-mean clustering
TF-IDF
cosine similarity
url http://bspace.buid.ac.ae/handle/1234/1196