Goal-directed Learning in the Striatum
Science, 31 Jan 2020: Vol. 367, Issue 6477, pp. 549-555
Local D2- to D1-neuron transmodulation updates goal-directed learning in
the striatum
Miriam Matamales et al.
Decision Neuroscience Laboratory, School of Psychology, University of New South Wales, Sydney, NSW,
Australia.
Department of Anatomy and Neuroscience, University of Melbourne, Melbourne, VIC, Australia.
School of Biomedical Sciences, University of Queensland, St Lucia, QLD, Australia.
Department of Women and Children’s Health, Faculty of Life Sciences and Medicine, King’s College
London, London SE1 7EH, UK.
[paraphrase]
Extinction learning allows animals to withhold voluntary actions that are no longer
related to reward and so provides a major source of behavioral control. Although such
learning is thought to depend on dopamine signals in the striatum, the way the circuits
that mediate goal-directed control are reorganized during new learning remains
unknown. Here, by mapping a dopamine-dependent transcriptional activation marker in
large ensembles of spiny projection neurons (SPNs) expressing dopamine receptor
type 1 (D1-SPNs) or 2 (D2-SPNs) in mice, we demonstrate an extensive and dynamic
D2- to D1-SPN transmodulation across the striatum that is necessary for updating
previous goal-directed learning. Our findings suggest that D2-SPNs suppress the
influence of outdated D1-SPN plasticity within functionally relevant striatal territories
to reshape volitional action.
In changing environments, it is adaptive for humans and other animals to adjust
their actions flexibly to maximize reward. Extinction learning allows individuals to withhold
instrumental actions when their consequences change. Rather than erasing such
actions from one’s repertoire, current views propose that extinction generates new
inhibitory learning that, when incorporated into previously acquired behavior, acts
selectively to reduce instrumental performance.
Associative learning theory identifies the negative prediction errors produced by the
absence of an anticipated reward as the source of the inhibitory learning underlying
instrumental extinction. Such signals are thought to involve pauses in dopamine
(DA) activity, and this pattern is well suited to alter plasticity in the posterior
dorsomedial striatum (DMS), a key structure encoding the action-outcome
associations necessary for goal-directed learning. Nevertheless, the way complex DA
signals alter postsynaptic circuits in the DMS to shape goal-directed learning remains
unknown.
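
As a purely illustrative sketch of the negative prediction errors described above, the following Python snippet applies a simple delta-rule update (in the style of the Rescorla-Wagner model) to a run of rewarded trials followed by unrewarded ones. The model choice, the learning rate `alpha`, and the function name `simulate_value` are assumptions made for illustration; none of them comes from the paper.

```python
def simulate_value(rewards, alpha=0.2, v0=0.0):
    """Track the expected reward V and the prediction error across trials.

    Assumed toy model: delta = r - V, then V <- V + alpha * delta.
    """
    v = v0
    history = []
    for r in rewards:
        delta = r - v            # negative when an expected reward is omitted
        v = v + alpha * delta    # move the expectation toward the outcome
        history.append((delta, v))
    return history


# Hypothetical schedule: 20 rewarded trials, then 20 unrewarded (extinction) trials
rewards = [1.0] * 20 + [0.0] * 20
history = simulate_value(rewards)

acquisition_errors = [d for d, _ in history[:20]]
extinction_errors = [d for d, _ in history[20:]]
```

In this sketch every acquisition trial yields a positive error and every extinction trial a negative one. Note that a single-value model of this kind lowers the expectation directly, so it does not capture the view stated earlier that extinction adds new inhibitory learning rather than erasing the original association.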
Within the DMS, the plasticity associated with goal-directed learning involves
glutamate release timed to local DA activity to alter intracellular cyclic adenosine
monophosphate (cAMP)–dependent pathways in postsynaptic neurons, a function
that involves slow temporal scales and that leads to gene transcription necessary for
learning. This activity is distributed across two major subpopulations of spiny
projection neurons (SPNs), the principal targets of DA. These are completely
intermixed within the striatum and express distinct DA receptor subtypes that
respond to DA in an opposing manner: Half express type 1 receptors and trigger
powerful cAMP signaling in DA-rich states (D1-SPNs), whereas the other half
express type 2 receptors and show robust signaling in DA-lean states (D2-SPNs).
Given that positive and negative prediction errors during appetitive learning are known
to influence DA release, we hypothesized that prediction errors during reward and
extinction learning generate distinctive molecular activation patterns in D1- and
D2-SPNs across the striatum to provide a molecular signature identifying those
regions most relevant for plasticity.
One of the most intriguing characteristics of the striatum is the random spatial
distribution and high degree of intermingling between its D1 (direct) and D2
(indirect) projection systems, a feature that is actively promoted during development and
that has been retained throughout evolution. The result is a highly entropic binary
mosaic that extends through an expansive and homogeneous space and that is mostly
devoid of histological boundaries. Such organization is unusual in the brain and can
be seen as an adaptation to provide an optimal postsynaptic scaffold for the
integration of regionally meaningful neuromodulatory signals. In such a plain,
borderless environment, the rules established locally by D1- and D2-SPNs are likely
to be critical in defining functional territories throughout the striatum, and this, we
propose, is the key process shaping striatal-dependent learning.
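
One possible way to make the phrase "highly entropic binary mosaic" concrete is Shannon's binary entropy, which reaches its maximum of 1 bit when the two cell types are equally likely at any given position. The sketch below treats each cell's identity as an independent draw and ignores spatial arrangement entirely. That simplification, the function name `binary_entropy`, and the skewed example proportion are all assumptions for illustration, not measurements or methods from the paper.

```python
import math


def binary_entropy(p_d1):
    """Shannon entropy in bits of a two-type mix with D1 fraction p_d1."""
    if p_d1 in (0.0, 1.0):
        return 0.0               # a pure population carries no uncertainty
    p_d2 = 1.0 - p_d1
    return -(p_d1 * math.log2(p_d1) + p_d2 * math.log2(p_d2))


even_mix = binary_entropy(0.5)    # equal halves, as described in the text
skewed_mix = binary_entropy(0.9)  # hypothetical, strongly biased mix
```

Under these assumptions, the equal split of D1- and D2-SPNs described above sits at the entropy maximum, whereas any biased mix gives a lower value.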
[end of paraphrase]