Emerging Research Topic Detection Using Filtered-LDA

Bibliographic Details
Main Author: Alattar, Fuad (author)
Other Authors: Shaalan, Khaled (author)
Published: 2021
Online Access: https://bspace.buid.ac.ae/handle/1234/2987
https://doi.org/10.3390/ai2040035
Description
Summary: Comparing two sets of documents to identify new topics is useful in many applications, such as discovering trending topics in sets of scientific papers, detecting emerging topics in microblogs, and interpreting sentiment variations on Twitter. In this paper, the main topic-modeling-based approaches to this task are examined to identify their limitations and the enhancements they need. To overcome these limitations, we introduce two separate frameworks that discover emerging topics through a filtered latent Dirichlet allocation (filtered-LDA) model. The model acts as a filter: it identifies old topics in a timestamped set of documents, removes all documents that focus on old topics, and keeps the documents that discuss new topics. Filtered-LDA also reduces the chance that keywords from old topics are used to represent emerging topics. The final stage of the filter uses multiple topic visualization formats to improve human interpretability of the filtered topics and presents the most representative document for each topic.
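
Illustration: the filtering step described in the summary (drop documents that focus on old topics, keep the rest) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes per-document topic proportions are already available from some LDA implementation and that the set of "old" topic indices is known. The function name `filter_new_topic_docs`, the `min_share` threshold, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def filter_new_topic_docs(
    doc_topic: Dict[str, Sequence[float]],
    old_topics: Set[int],
    min_share: float = 0.5,
) -> List[str]:
    """Return ids of documents that do not focus on an old topic.

    doc_topic maps a document id to its topic proportions (for example,
    the per-document topic distribution produced by an LDA model).
    A document is treated as focusing on old topics when the total
    proportion it assigns to them reaches min_share. Both the rule and
    the threshold are assumptions made for this sketch.
    """
    kept = []
    for doc_id, weights in doc_topic.items():
        # Sum the probability mass this document places on old topics.
        old_share = sum(w for k, w in enumerate(weights) if k in old_topics)
        if old_share < min_share:
            kept.append(doc_id)
    return kept


# Toy proportions over 3 topics; topics 0 and 1 are assumed to be "old".
doc_topic = {
    "d1": [0.8, 0.1, 0.1],  # mostly old topic 0: removed
    "d2": [0.1, 0.1, 0.8],  # mostly topic 2: kept
    "d3": [0.3, 0.3, 0.4],  # 0.6 of its mass on old topics: removed
}
kept = filter_new_topic_docs(doc_topic, old_topics={0, 1})
print(kept)
```

In this sketch, the documents that survive the filter would then be the input to a second topic-modeling pass, so that the resulting topics are built only from text that is not dominated by old topics.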