Fast Text Classification using Lean Gradient Descent Feed Forward Neural Network for Category Feature Augmentation

Bibliographic Details
Main Author: Attieh, Joseph (author)
Other Authors: Tekli, Joe (author)
Format: conferenceObject
Published: 2024
Online Access: http://hdl.handle.net/10725/16295
https://doi.org/10.1109/TrustCom60117.2023.00330
http://libraries.lau.edu.lb/research/laur/terms-of-use/articles.php
https://ieeexplore.ieee.org/abstract/document/10538758
Description
Summary: Text classification is a key task in Natural Language Processing (NLP) that aims to assign predefined categories to textual documents. It requires features that effectively represent the content and meaning of those documents. Selecting a suitable term weighting method is of central importance and can improve classification quality. In this paper, we propose a new text classification solution that performs Category-based Feature Augmentation (CFA) on the document representation. First, a term-category feature matrix is derived from a modified version of the supervised Term-Frequency Inverse-Category-Frequency (TF-ICF) weighting model. This is done by embedding the TF-ICF matrix in a one-layer feed-forward neural network, which is trained using gradient descent to iteratively update the term-category matrix until convergence. The model produces category-based feature vector representations that are used to augment the document representations and perform the classification task. Experimental results on four benchmark datasets show that our lean model improves text classification accuracy and is significantly more efficient than its deep model alternatives.
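
The pipeline outlined in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not specify the paper's modified TF-ICF formula, so the weighting below (category term count times log(1 + C / cf)) is an assumption, as are the softmax cross-entropy loss, the learning rate, the concatenation used for augmentation, the toy corpus, and every function name.

```python
import math

def build_vocab(docs):
    # Map each distinct term to a row index (hypothetical helper).
    return {t: i for i, t in enumerate(sorted({t for d in docs for t in d}))}

def tf_icf_matrix(docs, labels, vocab, n_cats):
    # Assumed TF-ICF variant: tf(t, c) * log(1 + C / cf(t)), where cf(t)
    # is the number of categories whose documents contain term t.
    tf = [[0.0] * n_cats for _ in vocab]
    for doc, c in zip(docs, labels):
        for t in doc:
            tf[vocab[t]][c] += 1.0
    matrix = []
    for row in tf:
        cf = sum(1 for v in row if v > 0)
        icf = math.log(1.0 + n_cats / cf)
        matrix.append([v * icf for v in row])
    return matrix

def bow(doc, vocab):
    # Term-frequency vector of a document; unknown terms are ignored.
    x = [0.0] * len(vocab)
    for t in doc:
        if t in vocab:
            x[vocab[t]] += 1.0
    return x

def scores(x, W):
    # Output of the one-layer network: x multiplied by the term-category matrix.
    return [sum(xi * W[i][c] for i, xi in enumerate(x)) for c in range(len(W[0]))]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train(X, labels, W, lr=0.1, epochs=100):
    # Full-batch gradient descent on softmax cross-entropy (assumed loss).
    # A fixed epoch count stands in for a convergence check.
    n_cats = len(W[0])
    for _ in range(epochs):
        grad = [[0.0] * n_cats for _ in W]
        for x, y in zip(X, labels):
            p = softmax(scores(x, W))
            for c in range(n_cats):
                err = p[c] - (1.0 if c == y else 0.0)
                for i, xi in enumerate(x):
                    if xi:
                        grad[i][c] += xi * err
        for i in range(len(W)):
            for c in range(n_cats):
                W[i][c] -= lr * grad[i][c] / len(X)
    return W

def augment(doc, vocab, W):
    # Assumed augmentation: append the category scores to the term vector.
    x = bow(doc, vocab)
    return x + scores(x, W)

def predict(doc, vocab, W):
    s = scores(bow(doc, vocab), W)
    return s.index(max(s))

# Toy corpus (illustrative only): category 0 = sports, category 1 = finance.
docs = [["goal", "match", "team"], ["team", "win", "goal"],
        ["stock", "market", "price"], ["market", "trade", "stock"]]
labels = [0, 0, 1, 1]
vocab = build_vocab(docs)
W = tf_icf_matrix(docs, labels, vocab, n_cats=2)  # initial weights from TF-ICF
W = train([bow(d, vocab) for d in docs], labels, W)
```

Here the TF-ICF matrix serves as the initial weights of the single layer, so training starts from a supervised weighting and then refines it. In this sketch, `augment(doc, vocab, W)` returns the term vector followed by one score per category, which a downstream classifier could consume.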