The behaviour of different reinforcement-learning models in a task environment in which unexpected and expected uncertainties were independently manipulated.

Bibliographic details
Main author:	Boluwatife Ikwunne (22238697) (author)
Other authors:	Jolie Parham (22238700) (author), Erdem Pulcu (517414) (author)
Published:	2025
Subjects:	Cell Biology; Science Policy; Mental Health; Biological Sciences not elsewhere classified; readily available datasets; linear; dynamically changing environments; reinforcement; logarithmic; learning rates space; learning rates accumulate; agents perform learning; novel rl model; learning rates; novel experiment; logarithmic function; human reinforcement; temporal predictions; prediction errors; physiological correlates; learners observe; key components; exact nature; coefficient used; based adaptions

Description
Abstract:	<p>All models converge reasonably well on the actual mean of the variable rewards. The learning rate for the Rescorla-Wagner model (η, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1013445#pcbi.1013445.e001" target="_blank">Eq 1</a>) is 0.32. For the hybrid Pearce-Hall model, ω (<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1013445#pcbi.1013445.e002" target="_blank">Eq 2</a>) is 0.48 and λ (<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1013445#pcbi.1013445.e004" target="_blank">Eq 3</a>) is 1.56. For the cubic model, κ is 0.11 (<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1013445#pcbi.1013445.e005" target="_blank">Eqs 4</a>-<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1013445#pcbi.1013445.e006" target="_blank">5</a>). For the exponential-logarithmic model, the parameters δ and λ are 0.83 and 1.45, respectively (<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1013445#pcbi.1013445.e009" target="_blank">Eq 7</a>). Because the models perform so comparably, their differences are illustrated in <b>Fig B in</b> <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1013445#pcbi.1013445.s001" target="_blank">S1 Text</a>, which shows the average prediction error values relative to the simulated outcomes in the task environment. Note that the simulation environment shown was generated only once, covering many combinations of environmental volatility and noise and their interaction, whereas the models were fitted iteratively until the parameters that minimise the average magnitude of the prediction error relative to the actual outcome sequence were identified.</p>
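The fitting procedure described in the abstract can be illustrated with a minimal sketch for the Rescorla-Wagner case only. It assumes the textbook delta rule, V ← V + η(r − V), which may differ in detail from Eq 1 of the article. The block-wise environment, the grid search, and every function name below are hypothetical stand-ins, not the authors' code or task.

```python
import random


def rescorla_wagner(outcomes, eta, v0=0.5):
    """Run a delta-rule learner; return (estimates, prediction_errors).

    Assumes the textbook update V <- V + eta * (r - V); the article's
    Eq 1 may be parameterised differently.
    """
    v = v0
    estimates, errors = [], []
    for r in outcomes:
        delta = r - v              # prediction error before the update
        estimates.append(v)
        errors.append(delta)
        v += eta * delta           # move the estimate towards the outcome
    return estimates, errors


def mean_abs_error(outcomes, eta):
    """Average magnitude of the prediction error for a given learning rate."""
    _, errors = rescorla_wagner(outcomes, eta)
    return sum(abs(e) for e in errors) / len(errors)


def simulate_outcomes(n_blocks=8, block_len=40, noise_sd=0.1, seed=0):
    """Hypothetical environment: the reward mean jumps between blocks
    (unexpected uncertainty) and Gaussian noise is added to each outcome
    (expected uncertainty). Generated once from a fixed seed."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_blocks):
        mean = rng.uniform(0.2, 0.8)
        outcomes.extend(rng.gauss(mean, noise_sd) for _ in range(block_len))
    return outcomes


def fit_eta(outcomes, grid_size=99):
    """Grid search over eta in (0, 1), minimising mean absolute error.
    A stand-in for whatever iterative optimiser the authors used."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda eta: mean_abs_error(outcomes, eta))


outcomes = simulate_outcomes()
best_eta = fit_eta(outcomes)
```

Because this environment is synthetic, `best_eta` need not match the reported value of 0.32; the sketch only shows the shape of the procedure, in which the outcome sequence is generated once and the parameter is then searched to minimise the average prediction-error magnitude.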
