Training results for the grid world environment.

a) Evolution of trajectory length during training, for scaling parameters ranging from −3 to 3 and for different preference distributions: the agent can either learn to complete the task from the start (“task”) or first explore the grid (“explore”). We represent the...

Bibliographic Details
Main Author:	Joséphine Pazem (22184363) (author)
Other Authors:	Marius Krumm (22184366) (author), Alexander Q. Vining (11320591) (author), Lukas J. Fiderer (4865587) (author), Hans J. Briegel (6383642) (author)
Published:	2025
Subjects:	Science Policy Biological Sciences not elsewhere classified Information Systems not elsewhere classified using internal rewards partially observable grid free energy principle expected free energy agents timed response task various reinforcement learning partially observable environments feps agents build interpretability build upon model agents navigation task given task feps feps model understanding aspects term goals target observation results show recent work prediction accuracy observations based multidisciplinary interest mathematical models last decade including elements constraints imposed complex environments behavioral biology appropriately contextualizing active inference