A: The sequence learning setup.
| Published: | 2025 |
|---|---|
| Summary: | <p>In the full task, the student must take a sequence of <i>N</i> correct actions to receive a reward. In intermediate levels of the task, the reward is delivered if the student takes […] correct actions. <i>ε</i><sub><i>i</i></sub> is the innate bias of the student to take the correct action at the <i>i</i>th step, prior to training. We assume <i>ε</i><sub><i>i</i></sub> = <i>ε</i> for all <i>i</i> unless otherwise specified. B: The incremental teacher (INC) fails once […]. C: The <i>q</i> values (in grayscale) for the correct action at each step, shown for […] (top) and […] (bottom). The red line shows the assigned task level. Note the striped dynamics in the top row, caused by alternating reinforcement and extinction. In the bottom row, <i>ε</i> is too small, so learning stalls. D: Time series of <i>q</i> values for actions at the first (solid black) and third (dashed gray) steps for the two examples shown in panel C.</p> |
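
The task structure described in the summary can be illustrated with a minimal sketch. It assumes, purely for illustration, that the innate bias is the probability that an untrained student takes the correct action at a step, that this probability is the same at every step, and that steps are independent. The names `reward_probability`, `run_episode`, `eps`, and `level` are hypothetical and do not come from the record. Under these assumptions, the chance of earning reward at a given task level is `eps ** level`, which shrinks quickly as the level grows.

```python
import random


def reward_probability(eps: float, level: int) -> float:
    """Chance that an untrained student takes `level` correct actions in a row.

    Assumption: each step is correct independently with probability `eps`.
    """
    return eps ** level


def run_episode(eps: float, level: int, rng: random.Random) -> int:
    """Return 1 if all `level` steps are correct (reward delivered), else 0."""
    for _ in range(level):
        if rng.random() >= eps:
            return 0  # a wrong action ends the episode without reward
    return 1


# Compare the closed-form value with a Monte Carlo estimate.
rng = random.Random(0)
eps, level, trials = 0.5, 3, 20000
rate = sum(run_episode(eps, level, rng) for _ in range(trials)) / trials
print(f"analytic: {reward_probability(eps, level):.4f}, simulated: {rate:.4f}")
```

Under these assumptions, a small bias makes reward at higher task levels extremely rare, which is one way to read the summary's remark that learning stalls when <i>ε</i> is too small.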