Time(s) and GFLOPs savings of single-agent tasks.

<div><p>Deep reinforcement learning has achieved significant success in complex decision-making tasks. However, the high computational cost of policies based on deep neural networks restricts their practical application. Specifically, each decision made by an agent requires a complete ne...

وصف كامل

محفوظ في:
التفاصيل البيبلوغرافية
المؤلف الرئيسي: Hongjie Zhang (136127) (author)
مؤلفون آخرون: Zhenyu Chen (2359471) (author), Hourui Deng (20685396) (author), Chaosheng Feng (20685399) (author)
منشور في: 2025
الموضوعات:
الوسوم: إضافة وسم
لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!
الوصف
الملخص:<div><p>Deep reinforcement learning has achieved significant success in complex decision-making tasks. However, the high computational cost of policies based on deep neural networks restricts their practical application. Specifically, each decision made by an agent requires a complete neural network computation, leading to a linear increase in computational cost with the number of interactions and agents. Inspired by human decision-making patterns, which involve reasoning only on critical states in continuous decision-making tasks without considering all states, we introduce the <i>LazyAct</i> algorithm. This algorithm significantly reduces the number of inferences while preserving the quality of the policy. Firstly, we incorporate a state skipping branch into the actor network to bypass states with minimal impact. Subsequently, we establish optimization objectives for single-agent and multi-agents inference, incorporating cost constraints based on the IMPALA and MAPPO frameworks, respectively. Finally, we utilize pre-training and fine-tuning techniques to train the policy network. Extensive experimental results indicate that <i>LazyAct</i> reduces the number of inferences by approximately 80% and 40% in single-agent and multi-agents scenarios, respectively, while sustaining comparable policy performance. The inferences reduction significantly decreases the time and FLOPs required by the <i>LazyAct</i> algorithm to complete tasks. Code is available here <a href="https://www.dropbox.com/scl/fo/wyoqo6q9gyt86zobfgbvx/h?\rlkey=0moyxsnoiisfs9y4h89hsou1l&dl=0" target="_blank">https://www.dropbox.com/scl/fo/wyoqo6q9gyt86zobfgbvx/h?\rlkey=0moyxsnoiisfs9y4h89hsou1l&dl=0</a>.</p></div>