Overall metrics results.

Bibliographic Details
Main Author: Chengzhi Wang (2256133) (author)
Other Authors: Donghong Chen (3601139) (author), Zhen Liu (74646) (author), Yuanhao Li (141505) (author), Yifei Wang (95207) (author), Sanglan Zhao (21696099) (author)
Published: 2025
Subjects:
Description
Summary: To address the loss of positioning accuracy caused by changes in scale, background, and occlusion in port and dock video images, this research proposes an enhanced YOLOv5s-DeepSORT model that integrates target load recognition and trajectory tracking to improve adaptability to dock environments. The findings indicate that incorporating multi-scale convolution into YOLOv5s improved the robustness of multi-scale object detection, resulting in a 0.4% increase in mean Average Precision (mAP). Furthermore, integrating an efficient pyramid segmentation attention (EPSA) network enhanced the accuracy of multi-scale feature fusion representation: the model's mAP@0.5:0.95 increased by 1.2% after EPSA was introduced. Finally, the original classification loss function was enhanced using a distributed sorting loss approach to mitigate the imbalance among loaded objects and the influence of background variations in the dock image sequence. This optimization led to a 3.1% improvement in multiple object tracking accuracy (MOTA). Experimental results on self-constructed datasets demonstrated an average accuracy of 90.9% and a detection accuracy of 92.2%, offering a valuable reference for target recognition and tracking in port and dock environments.
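
For readers unfamiliar with the MOTA figure quoted in the summary, the following is a minimal sketch of the standard definition, MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames. The function name `mota` and the example counts are hypothetical illustrations and are not taken from the record or the underlying study.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Standard MOTA: 1 - (FN + FP + IDSW) / GT.

    Each argument is a total summed over every frame of the sequence.
    num_gt is the total number of ground-truth objects across frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts, chosen only to illustrate the arithmetic.
score = mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=100)
print(f"MOTA = {score:.3f}")
```

Because the error terms are summed before dividing, MOTA can drop below zero when a tracker makes more errors than there are ground-truth objects.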