Goal-directed Learning in the Striatum
Science, 31 Jan 2020: Vol. 367, Issue 6477, pp. 549-555
Local D2- to D1-neuron transmodulation updates goal-directed learning in the striatum
Miriam Matamales, et al.
Decision Neuroscience Laboratory, School of Psychology, University of New South Wales, Sydney, NSW, Australia.
Department of Anatomy and Neuroscience, University of Melbourne, Melbourne, VIC, Australia.
School of Biomedical Sciences, University of Queensland, St Lucia, QLD, Australia.
Department of Women and Children’s Health, Faculty of Life Sciences and Medicine, King’s College London, London SE1 7EH, UK.
[paraphrase]
Extinction learning allows animals to withhold voluntary actions that are no longer related to reward and so provides a major source of behavioral control. Although such learning is thought to depend on dopamine signals in the striatum, the way the circuits that mediate goal-directed control are reorganized during new learning remains unknown. Here, by mapping a dopamine-dependent transcriptional activation marker in large ensembles of spiny projection neurons (SPNs) expressing dopamine receptor type 1 (D1-SPNs) or type 2 (D2-SPNs) in mice, we demonstrate an extensive and dynamic D2- to D1-SPN transmodulation across the striatum that is necessary for updating previous goal-directed learning. Our findings suggest that D2-SPNs suppress the influence of outdated D1-SPN plasticity within functionally relevant striatal territories to reshape volitional action.
In changing environments, it is adaptive for humans and other animals to adjust their actions flexibly to maximize reward. Extinction learning allows individuals to withhold instrumental actions when their consequences change. Rather than erasing such actions from one’s repertoire, current views propose that extinction generates new inhibitory learning that, when incorporated into previously acquired behavior, acts selectively to reduce instrumental performance.
Associative learning theory identifies the negative prediction errors produced by the absence of an anticipated reward as the source of the inhibitory learning underlying instrumental extinction. Such signals are thought to involve pauses in dopamine (DA) activity, and this pattern is well suited to alter plasticity in the posterior dorsomedial striatum (DMS), a key structure encoding the action-outcome associations necessary for goal-directed learning. Nevertheless, the way complex DA signals alter postsynaptic circuits in the DMS to shape goal-directed learning remains unknown.
Within the DMS, the plasticity associated with goal-directed learning involves glutamate release timed to local DA activity to alter intracellular cyclic adenosine monophosphate (cAMP)-dependent pathways in postsynaptic neurons, a process that operates on slow temporal scales and that leads to the gene transcription necessary for learning. This activity is distributed across two major subpopulations of spiny projection neurons (SPNs), the principal targets of DA. These are completely intermixed within the striatum and express distinct DA receptor subtypes that respond to DA in an opposing manner: half express type 1 receptors and trigger powerful cAMP signaling in DA-rich states (D1-SPNs), whereas the other half express type 2 receptors and show robust signaling in DA-lean states (D2-SPNs). Given that positive and negative prediction errors during appetitive learning are known to influence DA release, we hypothesized that prediction errors during reward and extinction learning generate distinctive molecular activation patterns in D1- and D2-SPNs across the striatum, providing a molecular signature that identifies the regions most relevant for plasticity.
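As an illustration of how a negative prediction error could drive new inhibitory learning rather than erase prior learning, the following is a minimal sketch and not a model taken from the paper. It assumes a simple delta-rule learner with separate excitatory and inhibitory strengths; the function name `simulate_extinction`, the learning rate `alpha`, and the trial counts are all hypothetical choices made for this example.

```python
def simulate_extinction(acq_trials=20, ext_trials=20, alpha=0.2):
    """Toy delta-rule learner (illustrative assumption, not the paper's model).

    Positive prediction errors strengthen an excitatory association;
    negative prediction errors strengthen a separate inhibitory one.
    Net performance is excitatory minus inhibitory strength.
    """
    excit = 0.0   # excitatory action-outcome strength
    inhib = 0.0   # inhibitory strength acquired during extinction
    history = []

    # Rewarded trials (reward = 1.0) followed by extinction trials (reward = 0.0).
    schedule = [1.0] * acq_trials + [0.0] * ext_trials
    for reward in schedule:
        net = excit - inhib      # current expectation of reward
        delta = reward - net     # prediction error
        if delta >= 0:
            excit += alpha * delta        # reward better than expected
        else:
            inhib += alpha * (-delta)     # omitted reward: inhibitory learning
        history.append((reward, delta, excit, inhib))
    return history


if __name__ == "__main__":
    for trial, (reward, delta, excit, inhib) in enumerate(simulate_extinction(), 1):
        print(f"trial {trial:2d} reward={reward:.0f} delta={delta:+.3f} "
              f"excit={excit:.3f} inhib={inhib:.3f} net={excit - inhib:.3f}")
```

Under these assumptions the excitatory strength acquired during rewarded trials stays intact through extinction while the inhibitory strength grows, so net performance falls without the original learning being erased.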
One of the most intriguing characteristics of the striatum is the random spatial distribution and high degree of intermingling between its D1 (direct) and D2 (indirect) projection systems, a feature that is actively promoted during development and that has been retained throughout evolution. The result is a highly entropic binary mosaic that extends through an expansive and homogeneous space and that is mostly devoid of histological boundaries. Such organization is unusual in the brain and can be seen as an adaptation to provide an optimal postsynaptic scaffold for the integration of regionally meaningful neuromodulatory signals. In such a plain, borderless environment, the rules established locally by D1- and D2-SPNs are likely to be critical in defining functional territories throughout the striatum, and this, we propose, is the key process shaping striatal-dependent learning.
[end of paraphrase]