A: The sequence learning setup.

Bibliographic Details
Main Author: William L. Tong (22238845) (author)
Other Authors: Venkatesh N. Murthy (15354261) (author), Gautam Reddy (11927277) (author)
Published: 2025
Description
Summary:<p>In the full task, the student must take a sequence of <i>N</i> correct actions to receive a reward. In intermediate levels of the task, the reward is delivered if the student takes correct actions. <i>ε<sub>i</sub></i> is the innate bias of the student to take the correct action at the <i>i</i>th step, prior to training. We assume <i>ε<sub>i</sub></i> = <i>ε</i> for all <i>i</i> unless otherwise specified. B: The incremental teacher (INC) fails once . C: The <i>q</i> values (in grayscale) for the correct action at each step, shown for (top) and (bottom). The red line shows the assigned task level. Note the striped dynamics in the top row, caused by alternating reinforcement and extinction. In the bottom row, <i>ε</i> is too small, causing learning to stall. D: Time series of <i>q</i> values for actions at the first (solid black) and third (dashed gray) steps for the two examples shown in panel C.</p>
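
To see why a small innate bias can stall learning, the following minimal sketch computes how often an untrained student would earn a reward at a given task level. It is an illustration under stated assumptions, not the authors' model: it treats the innate bias as the probability of taking the correct action at each step, treats steps as independent, and takes an intermediate level to mean "the first `level` actions must be correct". The names `reward_probability`, `biases`, and `level` are hypothetical.

```python
from math import prod


def reward_probability(biases, level):
    """Chance that an untrained student takes the first `level` correct actions.

    Assumptions of this sketch (not stated in the source):
    - biases[i] is the probability of the correct action at step i,
    - steps are independent,
    - a task level rewards the first `level` correct actions.
    """
    if not 0 <= level <= len(biases):
        raise ValueError("level must be between 0 and the sequence length")
    # Independent steps: multiply the per-step probabilities.
    return prod(biases[:level])


if __name__ == "__main__":
    N = 10                # full sequence length (illustrative value)
    eps = 0.1             # same innate bias at every step (illustrative value)
    biases = [eps] * N
    for level in (1, 3, N):
        print(level, reward_probability(biases, level))
```

Under these assumptions the reward probability shrinks geometrically with the level, so a student assigned a level far beyond what it has learned would almost never be reinforced.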