Comparison of PID and RL agents.

Bibliographic Details
Main Author: Hasan Raza Khanzada (22404835) (author)
Other Authors: Adnan Maqsood (22404838) (author), Abdul Basit (174463) (author)
Published: 2025
Description
Summary: Flight controls are experiencing a major shift with the integration of reinforcement learning (RL). Recent studies have demonstrated the potential of RL to deliver robust and precise control across diverse applications, including the flight control of fixed-wing unmanned aerial vehicles (UAVs). However, a critical gap persists in the rigorous evaluation and comparative analysis of leading continuous-space RL algorithms. This paper provides a comparative analysis of RL-driven flight control systems for fixed-wing UAVs in dynamic and uncertain environments. Five prominent RL algorithms, namely Deep Deterministic Policy Gradient (DDPG), Twin Delayed Deep Deterministic Policy Gradient (TD3), Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO), and Soft Actor-Critic (SAC), are evaluated to determine their suitability for complex UAV flight dynamics, while highlighting their relative strengths and limitations. All RL agents are trained in the same high-fidelity simulation environment to control the pitch, roll, and heading of the UAV under varying flight conditions. The results demonstrate that the RL algorithms outperform classical PID controllers in terms of stability, responsiveness, and robustness, especially under environmental disturbances such as wind gusts. The comparative analysis reveals that the SAC algorithm achieves convergence within 400 episodes and maintains a steady-state error below 3%, offering the best trade-off among the evaluated RL algorithms. This analysis aims to provide valuable insight for selecting a suitable RL algorithm and for its practical integration into modern UAV control systems.
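
The record does not include the paper's code. As a rough, hedged illustration of the PID baseline against which the RL agents are compared, the sketch below implements a discrete-time PID pitch-angle controller acting on a toy first-order pitch-rate plant; the plant model, gains, and setpoint are illustrative assumptions, not values from the paper. An RL agent such as SAC would replace the pid.update() call with a learned policy mapping the observed flight state to a control command.

# Minimal illustrative sketch (assumptions, not the paper's controller or plant):
# a discrete-time PID tracking a pitch-angle setpoint on a toy first-order model.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        # Standard PID law: proportional + integral + derivative on the error.
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(seconds=5.0, dt=0.01):
    # Toy pitch dynamics: pitch rate follows the elevator command with a
    # first-order lag (time constant 0.5 s); pitch integrates the pitch rate.
    pid = PID(kp=4.0, ki=0.5, kd=0.8, dt=dt)     # illustrative gains
    pitch, pitch_rate, setpoint = 0.0, 0.0, 0.1  # radians
    for _ in range(int(seconds / dt)):
        elevator = pid.update(setpoint, pitch)   # an RL policy would act here
        pitch_rate += (elevator - pitch_rate) * dt / 0.5
        pitch += pitch_rate * dt
    return pitch


if __name__ == "__main__":
    final_pitch = simulate()
    print(f"final pitch after 5 s: {final_pitch:.4f} rad (target 0.10 rad)")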