The study, led by Vijay Mohan K. Namboodiri, PhD, an associate professor of Neurology at UCSF, challenges a longstanding view that associative learning is primarily a trial-and-error process in which repeated pairings gradually strengthen expectations. Instead, the researchers propose that the interval between successive cue-reward experiences controls how much each new instance updates the brain's internal model, with longer gaps leading the brain to extract more information from each event.
Traditional theories hold that when an animal first encounters a cue followed by a reward, dopamine neurons fire at the time of the reward, and that with enough repetitions, dopamine release shifts to the cue as the brain learns to predict the outcome. In this framework, each reward delivery causes a small adjustment in the prediction, increasing it when the reward arrives as expected and decreasing it when it does not. The new UCSF work reinterprets this process by emphasizing how the passage of time between learning episodes scales the brain's learning rate.
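The traditional account described above can be sketched as a simple prediction-error update in the style of the Rescorla-Wagner model. This is an illustrative toy, not the study's model; the learning rate, reward value, and trial count below are arbitrary choices.

```python
# Toy sketch of the classical prediction-error account: on each cue-reward
# pairing, the cue's predicted value moves a small step toward the outcome,
# and the reward-time error (thought to be reflected in dopamine) shrinks.
def rescorla_wagner(n_trials, alpha=0.1, reward=1.0):
    """Return the cue's predicted value after each cue-reward pairing."""
    value = 0.0            # prediction attached to the cue before training
    history = []
    for _ in range(n_trials):
        error = reward - value    # dopamine-like prediction error at reward time
        value += alpha * error    # small adjustment toward the actual outcome
        history.append(value)
    return history

values = rescorla_wagner(20)
# The prediction climbs toward the reward value, so with enough repetitions
# the reward itself becomes fully expected and the reward-time error fades.
```

Under this view, every pairing contributes the same fixed-fraction update, which is why learning speed would be expected to track the total number of pairings.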
To test their ideas, Namboodiri and postdoctoral scholar Dennis Burke, PhD, trained mice to associate a brief sound with access to sugar-sweetened water, while systematically varying the time between sound-and-reward trials. Some mice experienced trials 30 to 60 seconds apart, whereas others encountered the same cue-reward sequence only once every five to ten minutes or longer. As a result, the shorter-interval group received many more total rewards than the longer-interval group over the same training period.
If learning depended mainly on the total number of cue-reward pairings, mice with more frequent trials should have acquired the association more quickly. Instead, the researchers found that mice receiving far fewer rewards learned just as much as animals that experienced roughly twenty times more trials in the same amount of time. This parity in learning across very different repetition counts points to the importance of temporal spacing in determining how strongly each reward influences future expectations.
According to Burke, the findings indicate that associative learning follows a principle closer to "timing is everything" than to "practice makes perfect," because events that are spaced farther apart deliver a larger increment of learning per occurrence. When the UCSF team monitored dopamine activity in the brains of the mice, they observed that longer spacing between rewards allowed dopamine responses to shift from the reward to the cue after fewer repetitions, consistent with a higher learning rate under sparse conditions.
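The timing-scaled alternative suggested by these results can be contrasted with the fixed-rate account in a small sketch. The scaling rule and all constants below are illustrative assumptions for this article, not the paper's fitted model; the point is only that a learning rate that grows with the gap between trials lets a sparse schedule reach the same level of learning in far fewer pairings.

```python
# Hypothetical sketch: the per-event learning rate scales with the interval
# since the previous trial, so widely spaced rewards each produce a larger
# update. Constants (base_rate, criterion) are illustrative, not from the study.
def trials_to_criterion(interval_s, criterion=0.8, base_rate=0.002, reward=1.0):
    """Count cue-reward pairings needed before the cue's value reaches criterion."""
    alpha = min(1.0, base_rate * interval_s)  # longer gaps -> larger updates
    value, trials = 0.0, 0
    while value < criterion:
        value += alpha * (reward - value)     # same error-driven update as before
        trials += 1
    return trials

dense = trials_to_criterion(interval_s=45)    # trials roughly 45 s apart
sparse = trials_to_criterion(interval_s=450)  # trials roughly 7.5 min apart
# The sparse schedule needs many fewer pairings than the dense one, so both
# groups can end up equally well trained after similar wall-clock time.
```

In this toy version, the roughly twenty-fold difference in trial counts between the two schedules is absorbed almost entirely by the interval-dependent learning rate.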
In a further experiment, the team dissociated cue frequency from reward frequency by playing the sound every 60 seconds but delivering sugar water only about 10 percent of the time. Under this intermittent reward schedule, mice began releasing dopamine in response to the sound after relatively few actual rewards, with dopamine responses appearing even on cue presentations that were not followed by sugar water. This pattern reinforces the idea that rare but informative reward events drive rapid updating of expectations, even when most cues go unrewarded.
The results have implications for how scientists understand learning in everyday life and in conditions such as addiction. Behaviors like smoking often involve irregular, intermittent experiences of nicotine tied to strong environmental cues, such as the sight or smell of cigarettes, which can powerfully trigger the urge to smoke. Continuous-delivery treatments like nicotine patches may work in part by breaking the tight coupling between discrete nicotine intake events and dopamine surges, thereby weakening the learned association and reducing craving.
Namboodiri now plans to explore how this timing-based learning framework could inform the design of faster and more efficient artificial intelligence systems. Many current AI algorithms adjust their internal models by making small updates after nearly every interaction across enormous data sets, a process that can be computationally expensive and slow. A learning model that, like the mouse brain in these experiments, draws more information from sparse but strategically spaced experiences could enable AI to converge on accurate predictions with far fewer training examples.
The researchers note that, for now, biological brains still far outpace machines in their ability to extract structure from limited and irregular data streams. By clarifying how the brain uses the spacing of rewards and cues to control how strongly each event shapes future expectations, the UCSF study offers a new perspective on why natural learning can be so efficient and how engineered systems might be redesigned to emulate that efficiency.
Research Report: Duration between rewards controls the rate of behavioral and dopaminergic learning
Related Links
University of California - San Francisco