Training results for the grid world environment.

a) Evolution of the trajectory length during training, for different scaling parameters ranging from −3 to 3 and different preference distributions: the agent can either learn to complete the task from the start (“task”) or first explore the grid (“explore”). We represent the...

Bibliographic Details
Main Author:	Joséphine Pazem (22184363) (author)
Other Authors:	Marius Krumm (22184366) (author), Alexander Q. Vining (11320591) (author), Lukas J. Fiderer (4865587) (author), Hans J. Briegel (6383642) (author)
Published:	2025
Subjects:	Science Policy; Biological Sciences not elsewhere classified; Information Systems not elsewhere classified; using internal rewards partially observable grid free energy principle expected free energy agents timed response task various reinforcement learning partially observable environments feps agents build interpretability build upon model agents navigation task given task feps feps model understanding aspects term goals target observation results show recent work prediction accuracy observations based multidisciplinary interest mathematical models last decade including elements constraints imposed complex environments behavioral biology appropriately contextualizing active inference

Similar Items