Our behaviors range from mindful, deliberative streams of action to sequences of action that are so nearly automatic that we can perform them almost without thinking. Transitions between these modes of behavior occur as we learn behavioral routines. We have studied these transitions and the neural activity that occurs in corticostriatal loops as they take place. We find that neural activity in these loops is strongly modified during habit learning and that specific corticostriatal circuits can powerfully control value-based decision-making and habits.
As we move about and act in our environment, the brain constantly updates not only our physical position and the moment-to-moment stimuli around us, but also the value of the actions that we perform. How these values are attached to our behaviors is still incompletely understood.
In our laboratory, we have approached this issue by teaching animals to perform simple habits, capitalizing on extensive evidence that candidate habits are at first sensitive to reinforcement but later become nearly independent of whether or not performance of the behavior is reinforced.
We have found that as this behavioral transition occurs, the spike activity and local field potential activity recorded in the prefrontal cortex and striatum are also transformed (Jog et al. 1999; Barnes et al. 2005; Thorn et al. 2010; Smith and Graybiel 2013). In typical experiments, we have taught rodents to run in simple T-mazes, with cues indicating to them whether to turn left or right to receive a food reward. The neural activity in regions known to be necessary for habit formation gradually shifts: early on, the population activity in the sensorimotor part of the striatum is high during the full time of the maze runs, but later during the learning process, the population activity becomes concentrated at the action points of the runs, especially the beginning and end of the runs. As the behavior of the animals becomes fully habitual through extensive training (called ‘over-training’) on the task, this beginning-and-end bracketing pattern becomes nearly fixed within the sensorimotor striatum. A quite similar bracketing pattern later develops in the prefrontal cortex, but it remains sensitive to reinforcement; if rewards are made unpalatable, then the animals cease the habitual runs and the cortical bracketing activity pattern becomes degraded.
We then found that we could block already formed habits and even toggle the habit off and on by optogenetically suppressing this prefrontal cortical activity (Smith et al. 2012). Comparable optogenetic inhibition of the same small prefrontal cortical zone could block the formation of habits altogether when the optogenetic inhibition was applied during the over-training period (Smith and Graybiel 2013).
These experiments raise the possibility that neural circuits involving the medial prefrontal cortex can evaluate whether actions are beneficial and should be allowed to be performed. The fact that this apparent control is effective even for behaviors that seem to be nearly fully automatic suggests that there is on-line, value-related control of behavior.
This potential was vividly seen in other experiments in which we blocked compulsive grooming behavior in a mouse model of obsessive-compulsive disorder by manipulating an orbitofrontal corticostriatal circuit (Burguiere et al. 2013). In these experiments, we could block a conditioned compulsion by intervening either at the level of the cortex or at the level of the medial striatum. Therefore, the control was exerted by a corticostriatal circuit.
In a new set of experiments, we have asked whether we can identify critical corticostriatal circuits that operate in these deliberative or repetitive decisions. We focused on a circuit that is thought to lead from localized zones in the prefrontal cortex to striosomes. These are dispersed zones within the striatum that can access the dopamine-containing neurons of the midbrain (Crittenden and Graybiel 2011; Fujiyama et al. 2011; Watabe-Uchida et al. 2012). We mimicked a situation often faced in everyday life, in which we can acquire something, but only at a cost. In this situation, costs as well as benefits have to be weighed. We used decision-making tasks in which animals were required to choose an action sequence in response to cues indicating that mixtures of rewarding and annoying reinforcers could either be accepted or be rejected. This design meant that the animals could reject an offer, but then they would miss out on the reward coupled to the cost.
This kind of decision-making, given the name ‘approach-avoidance decision-making,’ has been studied extensively in human subjects, particularly in relation to distinguishing between anxiety and depression in affected individuals who face conflicting motivations to approach and to avoid. We thus were attempting to target forms of decision-making that, in humans, involve value-based estimates of the future.
In initial studies, Dr. Ken-ichi Amemori and I focused on the pregenual anterior cingulate cortex in macaque monkeys (Amemori and Graybiel 2012), which earlier work had shown to project preferentially to striosomes in the head of the caudate nucleus (Eblen and Graybiel 1995). There, many neurons increased their activity during the decision period, either when the monkey would subsequently choose an approach response (accepting the good and bad symbolized by cues on a computer screen) or when the monkey would subsequently reject the offer. In one localized pregenual region, the avoidance-related neurons outnumbered the approach-related neurons. At other sites, similar numbers of these two classes were recorded. Microstimulation applied during the decision period had little or no effect on the decisions at most sites, but in the regions matching the sites with predominance of avoidance-related neurons, the microstimulation induced significant increases in avoidance. We found that treatment with the anxiolytic diazepam could block the microstimulation effects. Notably, we found no effects of the microstimulation in a control ‘approach-approach’ task in which both offered options were good.
In subsequent, still-ongoing experiments, Ken-ichi Amemori, Satoko Amemori and I are determining whether, as initial results suggest, the ‘hot-spot’ for pessimistic decision-making preferentially projects to striosomes (Amemori et al. in preparation). If so, these experimental findings would squarely place the corticostriatal system interacting with striosomes as part of the circuitry underpinning decision-making in which conflicting motivations must be handled.
With the technical opportunities presented by work in rodents, we returned to T-maze experiments, but this time introduced costs and benefits at each end-arm of the mazes. In work spearheaded by Alexander Friedman, Daigo Homma, and Leif Gibb, with Ken-ichi Amemori and others, we found striking evidence for a selective functional engagement of a striosome-targeting prefrontal circuit (Friedman et al. 2015). The evidence rests on the use of multiple decision-making tasks, presenting cost-benefit, benefit-benefit, reverse cost-benefit and cost-cost decision-making challenges to the animals. We then used optogenetics to interrupt the cortico-striosomal circuit. Across all of these tasks, it was only in the cost-benefit task that the putative striosome-targeting prefrontal pathway was engaged. By contrast, comparable optogenetic experiments inhibiting a matrix-targeting prefronto-striatal circuit produced effects on decision-making in all of the tasks.
Evidence from our own and other laboratories suggests that striosomes may have privileged access to the dopamine-containing neurons of the substantia nigra pars compacta, either directly or by way of a multi-synaptic pathway via the lateral habenula (Rajakumar et al. 1993; Graybiel 2008; Stephenson-Jones et al. 2013). The details of these pathways remain unknown. It is known, however, that the lateral habenula neurons increase their firing rates to negative reinforcers or their predictors; the dopamine-containing nigral neurons fire in relation to positive or, in some populations, to negative reinforcers and predictors (Hong and Hikosaka 2013). This potential dual downstream circuitry, combined with the experimental evidence summarized here, suggests that striosomes could be nodal sites in mood- and emotion-related corticostriatal networks influencing downstream modulators of motivational states.
References
Barnes T, Kubota Y, Hu D, Jin DZ, Graybiel AM (2005) Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories. Nature 437:1158–1161. [PubMed: 16237445]
Fujiyama F, Sohn J, Nakano T, Furuta T, Nakamura KC, Matsuda W, Kaneko T (2011) Exclusive and common targets of neostriatofugal projections of rat striosome neurons: a single neuron-tracing study using a viral vector. Eur J Neurosci 33:668–677. [PubMed: 21314848]
Graybiel AM (2008) Habits, rituals and the evaluative brain. Annu Rev Neurosci 31:359–387. [PubMed: 18558860]
Hong S, Hikosaka O (2013) Diverse sources of reward value signals in the basal ganglia nuclei transmitted to the lateral habenula in the monkey. Front Hum Neurosci 7:778. [PMC free article: PMC3826593] [PubMed: 24294200]
Jog M, Kubota Y, Connolly CI, Hillegaart V, Graybiel AM (1999) Building neural representations of habits. Science 286:1745–1749. [PubMed: 10576743]
Rajakumar N, Elisevich K, Flumerfelt BA (1993) Compartmental origin of the striato-entopeduncular projection in the rat. J Comp Neurol 331:286–296. [PubMed: 8509503]
Watabe-Uchida M, Zhu L, Ogawa SK, Vamanrao A, Uchida N (2012) Whole-brain mapping of direct inputs to midbrain dopamine neurons. Neuron 74:858–873. [PubMed: 22681690]