The detection performance for different dataset.


Bibliographic Details
Main Author: Shouluan Wu (22601074) (author)
Other Authors: Hui Yang (91136) (author), Liefa Liao (12098633) (author), Chao Song (379006) (author), Yating Fang (5483327) (author), Yang Yang (45629) (author)
Published: 2025
author Shouluan Wu (22601074)
author2 Hui Yang (91136)
Liefa Liao (12098633)
Chao Song (379006)
Yating Fang (5483327)
Yang Yang (45629)
author2_role author
author
author
author
author
author_role author
dc.creator.none.fl_str_mv Shouluan Wu (22601074)
Hui Yang (91136)
Liefa Liao (12098633)
Chao Song (379006)
Yating Fang (5483327)
Yang Yang (45629)
dc.date.none.fl_str_mv 2025-11-11T18:37:14Z
dc.identifier.none.fl_str_mv 10.1371/journal.pone.0334048.t003
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/The_detection_performance_for_different_dataset_/30592498
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Medicine
Neuroscience
Cancer
Space Science
Biological Sciences not elsewhere classified
Information Systems not elsewhere classified
dc.title.none.fl_str_mv The detection performance for different dataset.
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description The detection of defects on steel surfaces constitutes a vital area of research in computer vision, characterized by its complexity and variety, which pose significant difficulties for accurate identification. In this context, we introduce a deep learning framework that combines multi-channel random coding with modules for multi-scale feature fusion to tackle the challenges of low recognition accuracy and insufficient classification power prevalent in conventional models. Our model capitalizes on the self-attention mechanism associated with the Transformer architecture, alongside the strong feature extraction capabilities of Convolutional Neural Networks (CNNs), to facilitate a combined improvement in performance. To start, we enhance the model's feature extraction functionality by incorporating ResNet18 along with global self-attention. Next, we bring forth a novel improvement to the backbone network by adding a multi-channel shuffled encoding module, which effectively encodes various features through the interactions of different convolutional groups, thus minimizing the number of parameters. Additionally, we introduce a multi-feature fusion module UPC-SimAM (upsample concatenated Simple Parameter-Free Attention Module), which is free from parameter redundancy, to bolster the model's capacity to merge multi-scale features. Our experiments on the NEU-DET and GC10-DE datasets demonstrate that our model outperforms existing state-of-the-art techniques regarding detection efficiency. Specifically, the model registers a classification accuracy of 91.72%, an mAP@0.5 of 83.03%, and an mAP@0.5:0.95 of 45.55% on the NEU-DET dataset. On the GC10-DE dataset, it achieves a classification precision of 76.73%, an mAP@0.5 of 65.03%, and an mAP@0.5:0.95 of 32.46%. Through detailed ablation studies and visualization experiments, we affirm the considerable potential and benefits of the proposed SH-DETR model in the field of detecting defects on steel surfaces.
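The record gives only a prose description of the model's components. The PyTorch sketch below illustrates the two mechanisms it names: a ShuffleNet-style channel shuffle across convolutional groups (the basis of the multi-channel shuffled encoding idea) and the parameter-free SimAM attention applied after an upsample-and-concatenate fusion step, consistent with the claim that the fusion module adds no redundant parameters. The `UPCSimAMFusion` class, its channel sizes, and the group count are hypothetical placeholders for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """ShuffleNet-style channel shuffle: interleave channels across conv groups."""
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()  # swap the group and per-group channel dims
    return x.view(b, c, h, w)


class SimAM(nn.Module):
    """Parameter-free attention: scales each activation by a sigmoid of its
    per-channel energy; no learnable weights are introduced."""

    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        n = h * w - 1
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)  # squared deviation per position
        v = d.sum(dim=(2, 3), keepdim=True) / n            # channel-wise variance estimate
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5        # inverse energy per activation
        return x * torch.sigmoid(e_inv)


class UPCSimAMFusion(nn.Module):
    """Hypothetical upsample-concatenate-SimAM fusion of two feature scales."""

    def __init__(self, c_low: int, c_high: int, c_out: int):
        super().__init__()
        self.proj = nn.Conv2d(c_low + c_high, c_out, kernel_size=1, bias=False)
        self.attn = SimAM()

    def forward(self, low_res: torch.Tensor, high_res: torch.Tensor) -> torch.Tensor:
        # Upsample the deeper map to the shallower map's resolution, concatenate, project, attend.
        up = F.interpolate(low_res, size=high_res.shape[-2:], mode="nearest")
        fused = self.proj(torch.cat([up, high_res], dim=1))
        return self.attn(fused)


if __name__ == "__main__":
    p4 = torch.randn(2, 256, 20, 20)  # deeper, lower-resolution feature map
    p3 = torch.randn(2, 128, 40, 40)  # shallower, higher-resolution feature map
    fused = UPCSimAMFusion(c_low=256, c_high=128, c_out=128)(p4, p3)
    shuffled = channel_shuffle(fused, groups=4)
    print(fused.shape, shuffled.shape)  # both torch.Size([2, 128, 40, 40])
```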
eu_rights_str_mv openAccess
id Manara_efa588e99bf862138db6cfe29fbdfd52
identifier_str_mv 10.1371/journal.pone.0334048.t003
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/30592498
publishDate 2025
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title The detection performance for different dataset.
topic Medicine
Neuroscience
Cancer
Space Science
Biological Sciences not elsewhere classified
Information Systems not elsewhere classified