All Publications


  • Slow-rising and fast-falling dopaminergic dynamics jointly adjust negative prediction error in the ventral striatum. The European Journal of Neuroscience Shikano, Y., Yagishita, S., Tanaka, K. F., Takata, N. 2023

    Abstract

    The greater the reward expectation, the more distinct the brain's physiological response. Although it is well documented that better-than-expected outcomes are encoded quantitatively via midbrain dopaminergic (DA) activity, whether worse-than-expected outcomes are expressed quantitatively as well has received less experimental attention. We show that larger reward expectations upon unexpected reward omissions are associated with a preceding slower rise and a subsequent larger decrease (DA dip) in the DA concentration in the ventral striatum of mice. We set up a lever press task on a fixed ratio (FR) schedule requiring five lever presses as an effort for a food reward (FR5). The mice occasionally checked the food magazine without a reward before completing the task. The percentage of this premature magazine entry (PME) increased as the number of lever presses approached five, showing rising expectations with increasing proximity to task completion, and hence greater reward expectations. Fiber photometry of extracellular DA dynamics in the ventral striatum using a fluorescent protein (genetically encoded GPCR-activation-based DA sensor: GRABDA2m) revealed that the slow increase and fast decrease in DA levels around PMEs were correlated with the PME percentage, demonstrating a monotonic relationship between the DA dip amplitude and the degree of expectation. Computational modeling of the lever press task implementing temporal difference errors and state transitions replicated the observed correlation between the PME frequency and DA dip amplitude in the FR5 task. Taken together, these findings indicate that the DA dip amplitude represents the degree of reward expectation monotonically, which may guide behavioral adjustment.

    View details for DOI 10.1111/ejn.15945

    View details for PubMedID 36843200
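The abstract's central idea — that a temporal-difference (TD) model with state transitions reproduces a reward-omission dip that deepens with proximity to task completion — can be illustrated with a minimal sketch. The state layout, parameter values, and omission rule below are illustrative assumptions, not the authors' fitted model:

```python
# Minimal TD(0) sketch of the reward-omission idea: states track progress
# through the five lever presses, values are learned over rewarded trials,
# and an unexpected omission yields a negative prediction error (a "DA dip")
# whose magnitude grows with the learned expectation.
# All states and parameters here are illustrative, not the published model.

N_STATES = 6    # states 0..5: number of lever presses completed (5 = terminal)
ALPHA = 0.1     # learning rate
GAMMA = 0.9     # temporal discount factor
REWARD = 1.0    # food reward delivered after the 5th press

values = [0.0] * N_STATES

# Training: repeated rewarded trials propagate value back toward early states.
for _ in range(500):
    for s in range(N_STATES - 1):
        r = REWARD if s == N_STATES - 2 else 0.0       # reward on final transition
        next_v = 0.0 if s == N_STATES - 2 else values[s + 1]
        rpe = r + GAMMA * next_v - values[s]           # TD error (dopamine-like)
        values[s] += ALPHA * rpe

# Probe: omit the reward at each level of progress (transition to a no-reward
# terminal state). The dip is the negative TD error, -V(s).
dips = [0.0 - values[s] for s in range(N_STATES - 1)]
```

After training, `values` approximates the discounted reward (about `GAMMA**(4 - s)` for state `s`), so the omission dip deepens monotonically as the animal nears completion — the monotonic relationship between dip amplitude and expectation that the abstract reports.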

  • A reinforcement learning model with choice traces for a progressive ratio schedule. Frontiers in Behavioral Neuroscience Ihara, K., Shikano, Y., Kato, S., Yagishita, S., Tanaka, K. F., Takata, N. 2023; 17: 1302842

    Abstract

    The progressive ratio (PR) lever-press task serves as a benchmark for assessing goal-oriented motivation. However, a well-recognized limitation of the PR task is that only a single data point, known as the breakpoint, is obtained from an entire session as a barometer of motivation. Because the breakpoint is defined as the final ratio of responses achieved in a PR session, variations in choice behavior during the PR task cannot be captured. We addressed this limitation by constructing four reinforcement learning models: a simple Q-learning model, an asymmetric model with two learning rates, a perseverance model with choice traces, and a perseverance model without learning. These models incorporated three behavioral choices — reinforced lever presses, non-reinforced lever presses, and void magazine nosepokes — because we noticed that male mice performed frequent magazine nosepokes during PR tasks. The best model was the perseverance model, which predicted a gradual reduction in the amplitudes of reward prediction errors (RPEs) upon void magazine nosepokes. We confirmed this prediction experimentally with fiber photometry of extracellular dopamine (DA) dynamics in the ventral striatum of male mice using a fluorescent protein (genetically encoded GPCR-activation-based DA sensor: GRABDA2m). We tested the model's applicability with an acute intraperitoneal injection of low-dose methamphetamine (METH) before a PR task, which increased the frequency of magazine nosepokes during the PR session without changing the breakpoint. The perseverance model captured this behavioral modulation as a result of increased initial action values, which are customarily set to zero and disregarded in reinforcement learning analysis. Our findings suggest that the perseverance model reveals the effects of psychoactive drugs on choice behaviors during PR tasks.

    View details for DOI 10.3389/fnbeh.2023.1302842

    View details for PubMedID 38268795

    View details for PubMedCentralID PMC10806202
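The perseverance model described in the abstract combines learned action values with a choice trace (a recency-weighted history of past choices) in the action-selection rule, and starts from nonzero initial action values. A hedged sketch of that model family is below; the action set, update rules, and every parameter value are illustrative assumptions, not the authors' fitted model:

```python
import math
import random

# Sketch of a "perseverance" Q-learning agent with choice traces, after the
# model family in the abstract. Two actions stand in for the task's choices;
# all parameters and the reward rule are illustrative assumptions.

ACTIONS = ["lever_press", "magazine_nosepoke"]
ALPHA = 0.2    # learning rate for action values
TAU = 0.3      # update rate of the choice trace
BETA = 3.0     # inverse temperature on action values
PHI = 1.5      # weight of the choice trace (perseverance strength)

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

# Nonzero initial action values (cf. the abstract's point that initial values
# are customarily set to zero and disregarded).
q = {a: 0.5 for a in ACTIONS}
trace = {a: 0.0 for a in ACTIONS}   # recency-weighted choice history

random.seed(0)
nosepoke_rpes = []
for t in range(200):
    prefs = [BETA * q[a] + PHI * trace[a] for a in ACTIONS]
    probs = softmax(prefs)
    a = random.choices(ACTIONS, weights=probs)[0]
    # Only lever presses are (sparsely) rewarded; nosepokes are always void.
    r = 1.0 if a == "lever_press" and random.random() < 0.2 else 0.0
    rpe = r - q[a]                  # reward prediction error
    q[a] += ALPHA * rpe
    for b in ACTIONS:               # choice trace: decays toward last choice
        trace[b] += TAU * ((1.0 if b == a else 0.0) - trace[b])
    if a == "magazine_nosepoke":
        nosepoke_rpes.append(rpe)
```

Because void nosepokes never pay off, the nosepoke value decays toward zero each time it is chosen, so the magnitude of the nosepoke RPE shrinks across the session — a toy version of the gradual RPE reduction the perseverance model predicts and the photometry data confirmed.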