[Figure: the average cumulative reward of the compared algorithms]
Published: 2025
Summary: <div><p>The H-beam riveting and welding work cell is an automated unit for processing H-beams. By coordinating the gripping and welding robots, the work cell performs processes such as riveting and welding stiffener plates, transforming a plain H-beam into a stiffened H-beam. In the context of intelligent manufacturing, there is still significant potential for improving the productivity of riveting and welding tasks in existing H-beam riveting and welding work cells. For the multi-agent system of the H-beam riveting and welding work cell, a recurrent multi-agent proximal policy optimization algorithm (rMAPPO) is proposed to address the multi-agent scheduling problem in H-beam processing. The algorithm employs recurrent neural networks to capture and process historical information. Action masking filters out invalid states and actions, while a shared reward mechanism balances cooperation efficiency among the agents. In addition, value function normalization and adaptive learning rate strategies are applied to accelerate convergence. This paper first analyzes the H-beam processing flow and appropriately simplifies it, develops a reinforcement learning environment for multi-agent scheduling, and applies the rMAPPO algorithm to make scheduling decisions. The effectiveness of the proposed method is verified on both the physical riveting and welding work cell and its digital twin platform, and it is compared with baseline multi-agent reinforcement learning methods (MAPPO, MADDPG, and MASAC). Experimental results show that, compared with the baselines, the rMAPPO-based scheduling method reduces robot waiting times more effectively, adapts better to different riveting and welding tasks, and significantly improves the manufacturing efficiency of stiffened H-beams.</p></div>
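The abstract does not give implementation details, but the action-masking idea it mentions is commonly realized by suppressing invalid actions before sampling from the policy. A minimal sketch (the function name `masked_softmax` and the example logits/mask values are illustrative assumptions, not from the paper):

```python
import math

def masked_softmax(logits, mask):
    """Turn action logits into probabilities, zeroing out invalid actions.

    logits: raw policy scores, one per candidate action.
    mask:   1 for actions valid in the current state, 0 for invalid ones.
    Invalid actions get a logit of -inf, so exp(-inf) = 0 and they can
    never be sampled by the scheduling agent.
    """
    masked = [l if m else float("-inf") for l, m in zip(logits, mask)]
    mx = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(x - mx) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example: 4 candidate robot actions, of which
# actions 1 and 3 are invalid in the current work-cell state.
logits = [0.5, 2.0, 1.0, -0.3]
mask = [1, 0, 1, 0]
probs = masked_softmax(logits, mask)  # probs[1] and probs[3] are 0.0
```

In an actor-critic method such as rMAPPO, this masking would be applied to the actor's output before both sampling and the log-probability computation, so the policy gradient never reinforces infeasible scheduling choices.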