The detection performance for different datasets.

Bibliographic Details
Main Author: Shouluan Wu (22601074) (author)
Other Authors: Hui Yang (91136) (author), Liefa Liao (12098633) (author), Chao Song (379006) (author), Yating Fang (5483327) (author), Yang Yang (45629) (author)
Published: 2025
Description
Summary: Detecting defects on steel surfaces is a vital area of computer vision research, and the complexity and variety of these defects make accurate identification difficult. In this context, we introduce a deep learning framework that combines multi-channel random coding with modules for multi-scale feature fusion to tackle the low recognition accuracy and insufficient classification power prevalent in conventional models. Our model combines the self-attention mechanism of the Transformer architecture with the strong feature extraction capabilities of Convolutional Neural Networks (CNNs) to improve performance. First, we enhance the model’s feature extraction by incorporating ResNet18 along with global self-attention. Next, we improve the backbone network by adding a multi-channel shuffled encoding module, which encodes various features through the interactions of different convolutional groups, thereby reducing the number of parameters. Additionally, we introduce a multi-feature fusion module, UPC-SimAM (upsample concatenated Simple Parameter-Free Attention Module), which is free from parameter redundancy, to bolster the model’s capacity to merge multi-scale features. Our experiments on the NEU-DET and GC10-DET datasets demonstrate that our model outperforms existing state-of-the-art techniques in detection efficiency. Specifically, the model achieves a classification accuracy of 91.72%, an mAP@0.5 of 83.03%, and an mAP@0.5:0.95 of 45.55% on the NEU-DET dataset. On the GC10-DET dataset, it achieves a classification precision of 76.73%, an mAP@0.5 of 65.03%, and an mAP@0.5:0.95 of 32.46%. Through detailed ablation studies and visualization experiments, we affirm the considerable potential and benefits of the proposed SH-DETR model for detecting defects on steel surfaces.
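
Note on the reported metrics: the summary does not state the evaluation protocol, so the following is only a minimal sketch under the assumption that mAP@0.5 and mAP@0.5:0.95 follow the common COCO-style convention. Under that convention, a predicted box counts as a match when its intersection over union (IoU) with a ground-truth box reaches a threshold, and mAP@0.5:0.95 averages over ten thresholds from 0.50 to 0.95 in steps of 0.05. The function names and the example boxes are illustrative, not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, true_box):
    """Thresholds at which the prediction would count as a match."""
    score = iou(pred_box, true_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Hypothetical boxes: the prediction is shifted 10 px to the right.
pred = (10, 0, 110, 100)
truth = (0, 0, 100, 100)
print(iou(pred, truth))                 # overlap 9000 / union 11000
print(matched_thresholds(pred, truth))  # passes the looser thresholds only
```

Because a slightly misplaced box passes the 0.5 threshold but fails the stricter ones, averaging over all ten thresholds gives a lower score, which is consistent with the mAP@0.5:0.95 figures above being well below the mAP@0.5 figures.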