The confusion matrix.


Bibliographic details
Main author: Ye Tian (220278) (author)
Other authors: Xin Dai (152712) (author), Zhijun Li (475291) (author), Hong Guo (142424) (author), Xiao Mao (10538435) (author)
Published: 2025
Description
Abstract: <div><p>With the widespread adoption of internet technologies and email communication systems, the exponential growth in email usage has precipitated a corresponding surge in spam proliferation. These unsolicited messages not only consume users’ valuable time through information overload but also pose significant cybersecurity threats through malware distribution and phishing schemes, thereby jeopardizing both digital security and user experience. This emerging challenge underscores the critical importance of developing effective spam detection mechanisms as a cornerstone of modern cybersecurity infrastructure. Through empirical analysis of machine learning (ML) performance on publicly available spam datasets, we established that algorithmic ensemble methods consistently outperform individual models in detection accuracy. We propose an optimized stacking ensemble framework that strategically combines predictions from four heterogeneous base models (NBC, k-NN, LR, XGBoost) through meta-learner integration. Our methodology incorporates grid search cross-validation with hyperparameter space optimization, enabling systematic identification of parameter configurations that maximize detection performance. The enhanced model was rigorously evaluated using comprehensive metrics including accuracy (99.79%), precision, recall, and F1-score, demonstrating statistically significant improvements over both baseline models and existing solutions documented in the literature.</p></div>
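The abstract reports accuracy, precision, recall, and F1-score, all of which can be read off a binary confusion matrix. The following is a minimal sketch of that relationship, not the authors' evaluation code. The function names `confusion_counts` and `metrics`, the "spam"/"ham" label strings, and the toy labels are all assumptions made up for illustration; they do not come from the record or its dataset.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="spam"):
    """Return (tp, fp, fn, tn) for a binary task; `positive` is the spam label."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted spam: correct if the message really is spam.
            counts["tp" if truth == positive else "fp"] += 1
        else:
            # Predicted ham: a miss if the message really is spam.
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def metrics(tp, fp, fn, tn):
    """Derive the four metrics named in the abstract from confusion-matrix cells."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration only (not data from the study).
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "ham", "spam", "ham"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
scores = metrics(tp, fp, fn, tn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print({name: round(value, 4) for name, value in scores.items()})
```

On spam data the classes are often imbalanced, so accuracy alone can look high even when many spam messages slip through; precision and recall expose the false-positive and false-negative cells separately, which is presumably why the abstract lists them alongside accuracy.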