Today Sci is going to blog a paper that she has been meaning to blog for a long time. It’s one of those papers that people who do certain kinds of science snuggle with when they go to sleep at night.
(Sci and this paper)
But the real reason that Sci loves this paper is that it’s the neurobiological equivalent of a Rickroll.
And the question behind this paper is: what is the mechanism behind reward prediction?
Schultz, Dayan, and Montague. “A neural substrate of prediction and reward” Science, 1997.
Now at this point you might be asking yourself: what the heck is reward prediction and why does anyone care about it? Reward prediction is in fact an extremely important thing in any organism’s life. If you can’t predict where and when you’re going to get food, shelter, or sex in response to specific stimuli, you’re going to be a very hungry, chilly, and undersexed organism. The ability to predict a reward is especially good because it allows you to gauge your behavioral reactions accordingly. For example, Sci’s reaction to being told she’s going to get a Hershey’s kiss is going to be markedly different than her response to being told she gets an entire Snickers bar.
Rewarding stimuli in particular elicit a very specific series of actions in specific animals. For example, if you’re a rat, a signal for food is going to trigger “approach” behavior, in which the rat is going to head over and get himself a sandwich.
But his experience with rewarding objects is going to be different depending on what he EXPECTS. If, for example, he’s been given to expect a medium sized reward, like half a sandwich, and gets a WHOLE sandwich, he’s going to react more strongly. If he thinks he’s going to get a half sandwich and all he gets is half a tomato, well, that’s just disappointing. These responses have been established through many long years of conditioning experiments, a la Pavlov’s dogs.
Basically (for those who haven’t heard of Pavlov and his pooches), Pavlov gave the dogs meat powder, and every time he did, he rang a bell. Obviously, when dogs taste meat powder, they salivate, and start drooling all over the place as only dogs can. By pairing the meat powder with the bell, pretty soon, every time he just rang the BELL, the dogs started salivating whether or not the meat powder was there, because they had come to expect it.
This phenomenon is called conditioning. The meat powder is called the unconditioned stimulus. The bell is the conditioned stimulus. The dogs’ first salivating response to the unconditioned stimulus is the unconditioned response. When the dogs learn to associate the meat stimulus with the conditioned stimulus of the bell, and start salivating to the bell alone, that salivation is the conditioned response.
For a long time now, scientists have known that the neurotransmitter dopamine is involved in the rewarding aspects of things, including food and drugs. Right now, it is thought that dopamine neuron firing helps to process and construct information about possible rewarding events. But it was this paper that showed, for the first time, that dopamine neurons were really involved in the PREDICTION of reward. And here’s what they did:
First, they took a bunch of monkeys (it has also been done in rats) and implanted electrodes to record neurons in the Ventral Tegmental Area of the brain, an area that contains lots of dopamine neurons. With these electrodes, they could watch the neurons fire. In this case, they gave the monkeys an unexpected reward, fruit juice.
See that spike above the “R”? That spike is a spike in dopamine neuron activity when the monkeys got unexpected fruit juice. The neurological equivalent of “w00t!”.
They then trained the monkeys on a conditioned stimulus paradigm. Basically, they paired a tone or light with a dose of fruit juice for the monkey. This meant that, when the monkey was done learning, it knew that when it got a light, fruit juice was forthcoming. And the neurons in the monkeys’ brains SHOWED the result of the learning. It looked like this:
This is a condition where the monkey was given the tone (or light), and got the reward it expected. You can see that here, the spike in dopamine neuron activity has shifted, this time corresponding to the tone (woohoo! juice is on the way!) rather than to the reward itself.
But then, what happens when the conditioned stimulus of the tone or light is given, and no juice arrives?
The spike at the tone is there, and the monkey is waiting for juice. No juice arrives. And instead of normal firing, when no reward appears, there’s a DECREASE in dopamine neuron firing (the circled portion).
That monkey’s been Rickrolled.
(Does that hurt you like it hurts me?)
But this Rickroll is cool. For one, it was the first time that anyone had shown that learning a conditioned stimulus (in this case, the light with the juice) actually shifted neuron activity to the stimulus, rather than the reward. And it ALSO showed that the timing of the reward was encoded. That monkey knew WHEN the reward was expected, and knew when it had been had. It showed that dopamine neurons are part of a system encoding the expectation of rewards and stimuli. Dopamine neurons don’t just encode reward; rather, they encode the expectation, and respond to whether it happens or doesn’t, with a spike if the reward is better than expected, and a dip if it’s worse.
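For the code-inclined, the spike-then-dip pattern can be sketched with a toy temporal-difference (TD) learner, the kind of prediction-error model the paper uses to explain these recordings. To be clear, this is a minimal illustration and not the authors’ actual model: the number of time steps, the cue and reward times, the learning rate, and names like `run_trial` are all made up for the example, and the “prediction error” here is only a stand-in for dopamine neuron firing relative to baseline.

```python
def run_trial(w, rewards, t_cue, alpha=0.3, gamma=1.0, learn=True):
    """Run one trial and return the TD prediction error at each time step.

    w[t] is the learned reward prediction at time step t. Steps before the
    cue carry no prediction, because the cue itself arrives unannounced.
    All parameter values are illustrative assumptions.
    """
    n_steps = len(rewards)

    def value(t):
        return w[t] if t >= t_cue else 0.0

    delta = [0.0] * n_steps
    for t in range(n_steps - 1):
        # Prediction error: what just arrived (reward plus the new
        # prediction) minus what was predicted one step earlier.
        delta[t + 1] = rewards[t + 1] + gamma * value(t + 1) - value(t)
        if learn and t >= t_cue:
            w[t] += alpha * delta[t + 1]
    return delta


# Hypothetical trial layout: cue (tone/light) at step 2, juice at step 6.
N_STEPS, T_CUE, T_REWARD = 10, 2, 6
juice = [1.0 if t == T_REWARD else 0.0 for t in range(N_STEPS)]
no_juice = [0.0] * N_STEPS

w = [0.0] * N_STEPS
naive = run_trial(w, juice, T_CUE, learn=False)       # before any training
for _ in range(500):                                   # cue-juice pairings
    run_trial(w, juice, T_CUE)
trained = run_trial(w, juice, T_CUE, learn=False)      # juice arrives on time
omitted = run_trial(w, no_juice, T_CUE, learn=False)   # juice withheld

for name, d in [("naive", naive), ("trained", trained), ("omitted", omitted)]:
    print(f"{name:8s} cue: {d[T_CUE]:.2f}  reward time: {d[T_REWARD]:.2f}")
```

Under these assumptions, the untrained learner shows its positive error at the juice, the trained learner shows it at the cue with roughly nothing at the juice, and the withheld-juice trial shows a negative error right when the juice should have arrived. That last one is the Rickroll.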
This paper is pretty old in scientific terms (1997! Come on, now), but it remains the basis for a lot of reward expectation studies today. And why not? A neurobiological Rickroll is the stuff of which great science is made!
Schultz, W., Dayan, P., & Montague, P. R. (1997). A Neural Substrate of Prediction and Reward. Science, 275 (5306), 1593-1599. DOI: 10.1126/science.275.5306.1593