Stanford Marshmallow Test Experiment

Angel E. Navidad

Philosophy Expert

B.A. Philosophy, Harvard University

Angel Navidad is an undergraduate at Harvard University, concentrating in Philosophy. He will graduate in May of 2025, and thereon pursue graduate study in history, or enter the civil service.


Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Take-home Messages

  • The marshmallow test is an experimental design that measures a child’s ability to delay gratification. The child is given the option of waiting a bit to get their favorite treat, or if not waiting for it, receiving a less-desired treat. The minutes or seconds a child waits measures their ability to delay gratification.
  • The original marshmallow test showed that preschoolers’ delay times were significantly affected by the experimental conditions, like the physical presence/absence of expected treats.
  • The original test sample was not representative of the preschooler population, thereby limiting the study’s predictive ability. (Preschool participants were all recruited from Stanford University’s Bing Nursery School, which was then largely patronized by children of Stanford faculty and alumni.)
  • A 2018 study on a large, representative sample of preschoolers sought to replicate the statistically significant correlations between early-age delay times and later-age life outcomes, like SAT scores, which had been previously found using data from the original marshmallow test. The replication study found only weak statistically significant correlations, which disappeared after controlling for socio-economic factors.
  • However, the 2018 study did find statistically significant differences in both early-age delay times and later-age life outcomes between children from high-SES families and children from low-SES families, implying that socio-economic factors play a more significant role than early-age self-control in important life outcomes.
In a 1970 paper, Walter Mischel, a professor of psychology at Stanford University, and his graduate student, Ebbe Ebbesen, found that preschoolers asked to wait 15 minutes for their preferred treat (a pretzel or a marshmallow) waited much less time when either treat was within sight than when neither treat was in view.


Children with treats present waited 3.09 ± 5.59 minutes; children with neither treat present waited 8.90 ± 5.26 minutes.

The study suggested that gratification delays in children involved suppressing rather than enhancing attention to expected rewards. For instance, some children who waited with both treats in sight would stare at a mirror, cover their eyes, or talk to themselves, rather than fixate on the pretzel or marshmallow.

Mischel, Ebbesen, and Antonette Zeiss, a visiting faculty member at the time, set out to investigate whether attending to rewards cognitively made it more difficult for children to delay gratification.

The Stanford Marshmallow Experiments

Mischel, Ebbesen and Zeiss (1972) designed three experiments to investigate, respectively, the effect of overt activities, cognitive activities, and the lack of either, in the preschoolers’ gratification delay times.

Experiment 1

Fifty-six children from the Bing Nursery School at Stanford University were recruited. To build rapport with the preschoolers, two experimenters spent a few days playing with them at the nursery.

Children were randomly assigned to one of five groups (A – E).

The children were individually escorted to a room where the test would take place. Each child was taught to ring a bell to signal for the experimenter to return to the room if the experimenter ever stepped out.

Treat vs. No Treats Condition

Children in groups A, B, and C were shown two treats (a marshmallow and a pretzel) and asked to choose their favorite.

They were then told that the experimenter would soon have to leave for a while but that they’d get their preferred treat if they waited for the experimenter to come back without signaling for them to do so.

They were also explicitly allowed to signal for the experimenter to come back at any point in time but told that if they did, they’d only get the treat they hadn’t chosen as their favorite. Both treats were left in plain view in the room.

Children in groups D and E were given no such choice or instructions.

Children in groups A, B, or C who waited the full 15 minutes were allowed to eat their favored treat. Those in groups A, B, or C who didn’t wait 15 minutes were allowed to have only their non-favored treat.

Children in groups D and E weren’t given treats. All children got to play with toys with the experimenters after waiting the full 15 minutes or after signaling.

Distraction vs. No Entertainment Condition

Children in groups A and D were given a slinky and were told they had permission to play with it.

Children in groups B and E were asked to “think of anything that’s fun to think of” and were told that some fun things to think of included singing songs and playing with toys.

Each child’s comprehension of the instructions was tested. Six children didn’t seem to comprehend and were excluded from the test. The remaining 50 children were included.

All 50 were told that whether or not they rang the bell, the experimenter would return, and when he did, they would play with toys.

Waiting time was scored from the moment the experimenter shut the door. The experimenter returned either as soon as the child signaled or after 15 minutes if the child did not signal.

[Figure: marshmallow test results, treat vs. no-treat condition]

The results suggested that children were much more willing to wait longer when they were offered a reward for waiting (groups A, B, C) than when they weren’t (groups D, E)

The results also showed that children waited much longer when they were given tasks that distracted or entertained them during their waiting period (playing with a slinky for group A, thinking of fun things for group B) than when they weren’t distracted (group C).

Experiment 2

This test differed from the first only in the following ways:

  • Thirty-eight children were recruited, with six lost due to incomplete comprehension of instructions.
  • Thirty-two children were randomly assigned to three groups (A, B, C).
  • All children were given a choice of treats, and told they could wait without signalling to have their favourite treat, or simply signal to have the other treat but forfeit their favoured one.
  • In all cases, both treats were left in plain view.
  • Children in group A were asked to think of fun things, as before.
  • Those in group B were asked to think of sad things, and likewise given examples of such things.
  • Those in group C were asked to think of the treats.

[Figure: marshmallow test results, distracted vs. not-distracted condition (Experiment 2)]

The results suggested that children who were given distracting tasks that were also fun (thinking of fun things for group A) waited much longer for their treats than children who were given tasks that either didn’t distract them from the treats (group C, asked to think of the treats) or didn’t entertain them (group B, asked to think of sad things).

Experiment 3

  • Sixteen children were recruited, and none excluded.
  • Children were randomly assigned to three groups (A, B, C).
  • In all cases, both treats were obscured from the children with a tin cake cover (which children were told would keep the treats fresh).
  • Children in group A were asked to think about the treats.
  • Those in group B were asked to think of fun things, as before.
  • Those in group C were given no task at all.

[Figure: marshmallow test results, distracted vs. not-distracted condition (Experiment 3)]

The results suggested that when treats were obscured (by a cake tin, in this case), children who were given no distracting or fun task (group C) waited just as long for their treats as those who were given a distracting and fun task (group B, asked to think of fun things).

On the other hand, when the children were given a task that didn’t distract them from the treats (group A, asked to think of the treats), having the treats obscured did not increase their delay time as opposed to having them unobscured (as in the second test).

Final Conclusions

The studies convinced Mischel, Ebbesen, and Zeiss that children’s successful delay of gratification significantly depended on their cognitive avoidance or suppression of the expected treats during the waiting period, e.g., by not having the treats within sight or by thinking of fun things.

Children, they reasoned, could wait a relatively long time if they:

  • Believed they really would get their favoured treat if they waited (e.g., by trusting the experimenter, by having the treats remain in the room, whether obscured or in plain view);
  • Shifted their attention away from the treats; and
  • Occupied themselves with non-frustrating or pleasant internal or external stimuli (e.g., thinking of fun things, playing with toys).

Critical Evaluation

  • Sample size determination was not disclosed.
  • The study population (Stanford’s Bing Nursery School) was not characterized and so may differ in relevant respects from the general human population or even the general preschooler population. (In fact, the school was mostly attended by middle-class children of faculty and alumni of Stanford.)
  • The findings might also not extend to voluntary delay of gratification (where the option of having either treat immediately is available, in addition to the studied option of having only the non-favored treat immediately).

Longitudinal Studies Using Stanford Data

Delayed Gratification and SAT Scores

In 1990, Yuichi Shoda, a graduate student at Columbia University, Walter Mischel, now a professor at Columbia University, and Philip Peake, a graduate student at Smith College, examined the relationship between preschoolers’ delay of gratification and their later SAT scores.

Six hundred and fifty-three preschoolers at the Bing School at Stanford University participated at least once in a series of gratification delay studies between 1968 and 1974.

Four hundred and four of their parents received follow-up questionnaires. One hundred and eighty-five responded. Ninety-four parents supplied their children’s SAT scores.

Children were divided into four groups depending on whether a cognitive activity (e.g., thinking of fun things) had been suggested before the delay period or not and on whether the expected treats had remained within sight throughout the delay period or not.

The difference in the mean waiting time of the children of parents who responded and that of the children of parents who didn’t respond was not statistically significant (p = 0.09, n = 653).

[Figure: marshmallow test results, delayed gratification and later SAT scores]

Preschoolers’ delay times correlated positively and significantly with their later SAT scores when no cognitive task had been suggested and the expected treats had remained in plain sight.

Other correlations were not significant.

Limitations

Shoda, Mischel, and Peake (1990) urged caution in extrapolating their findings since their samples were uncomfortably small.

Delayed Gratification and Positive Functioning

In a 2000 paper, Ozlem Ayduk, at the time a postdoctoral researcher at Columbia, and colleagues, explored the role that preschoolers’ ability to delay gratification played in their later self-worth, self-esteem, and ability to cope with stress.

Five hundred and fifty preschoolers’ ability to delay gratification in Prof. Mischel’s Stanford studies between 1968 and 1974 was scored.

Each preschooler’s delay score was taken as the difference between the mean delay time of the experimental group the child had been assigned to and the child’s individual score in that group.
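The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `delay_scores`, the record layout, and the sample values are hypothetical, and the sign convention (child's time minus group mean, so that a positive score means a longer-than-average wait) is an assumption, since the description above does not fix the direction of the difference.

```python
from collections import defaultdict

def delay_scores(records):
    """Return each child's delay time relative to their group's mean.

    records: list of (child_id, group, delay_minutes) tuples.
    Assumed sign convention: score = child's time - group mean.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, group, minutes in records:
        totals[group] += minutes
        counts[group] += 1
    group_means = {g: totals[g] / counts[g] for g in totals}
    return {child: minutes - group_means[group]
            for child, group, minutes in records}

# Hypothetical delay times in minutes, not data from the studies.
records = [("c1", "A", 2.0), ("c2", "A", 6.0),
           ("c3", "B", 10.0), ("c4", "B", 14.0)]
scores = delay_scores(records)
```

Scoring relative to the group mean lets children from different experimental conditions be compared on one scale, even though some conditions made waiting much easier than others.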

Between 1993 and 1995, 444 parents of the original preschoolers were mailed questionnaires for themselves and their now adult-aged children. A hundred and eighty-seven parents and 152 children returned them.

The questionnaires measured, through nine-point Likert-scale items, the children’s self-worth, self-esteem, and ability to cope with stress. The scores on these items were standardized to derive a positive functioning composite.
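As a rough sketch of how such a composite could be built, the Python below standardizes each item to z-scores and averages them per respondent. Averaging the standardized items, using the population standard deviation, and the item names and ratings shown are all assumptions made for illustration; the description above says only that the item scores were standardized.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of ratings to mean 0 and SD 1 (population SD)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize ratings with zero variance")
    return [(v - m) / s for v in values]

def positive_functioning_composite(items):
    """items: dict mapping item name -> list of ratings, one per respondent.

    Returns one composite per respondent: the mean of that respondent's
    standardized item scores (an assumed way of combining the items).
    """
    standardized = [zscores(ratings) for ratings in items.values()]
    return [mean(per_person) for per_person in zip(*standardized)]

# Hypothetical nine-point ratings for three respondents.
ratings = {
    "self_worth": [3, 5, 7],
    "self_esteem": [2, 5, 8],
    "coping_with_stress": [4, 5, 6],
}
composite = positive_functioning_composite(ratings)
```

Standardizing first keeps an item with a wide spread of ratings from dominating the composite.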

The positive functioning composite, derived either from self-ratings or parental ratings, was found to correlate positively with delay of gratification scores.

Preschoolers who were better able to delay gratification were more likely to exhibit higher self-worth, higher self-esteem, and a greater ability to cope with stress during adulthood than preschoolers who were less able to delay gratification.

Delayed Gratification and Body Mass Index

In a 2013 paper, Tanya Schlam, a doctoral student at the University of Wisconsin, and colleagues, explored a possible association between preschoolers’ ability to delay gratification and their later Body Mass Index.

Prof. Mischel’s data were again used. Of the 653 children who had participated in his studies as preschoolers, the researchers sent mailers to all those for whom they had valid addresses (n = 306) in December 2002 / January 2003 and again in May 2004.

Of these, 146 individuals responded with their weight and height. Individual delay scores were derived as in the 2000 study.

Preschoolers’ ability to delay gratification accounted for a significant portion of the variance seen in the sample (p < 0.01, n = 146).

Specifically, each additional minute a preschooler delayed gratification predicted a 0.2-point reduction in BMI in adulthood.
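To make the size of this association concrete, here is a small Python sketch that applies the reported slope of 0.2 BMI points per minute. The function name and example delay times are hypothetical, and the linear form is a simplification. The result is a predicted average difference between groups, not a forecast for any individual child.

```python
# Reported association: each extra minute of delay predicts a BMI about
# 0.2 points lower in adulthood.
BMI_CHANGE_PER_MINUTE = -0.2

def predicted_bmi_difference(minutes_a, minutes_b):
    """Predicted adult BMI of child A minus child B, given preschool delays.

    Assumes the association is linear over the range of delay times.
    """
    return BMI_CHANGE_PER_MINUTE * (minutes_a - minutes_b)

# Hypothetical example: one child waited 10 minutes, another 5 minutes.
difference = predicted_bmi_difference(10.0, 5.0)
```

Under these assumptions, a five-minute difference in delay corresponds to a predicted BMI about one point lower for the child who waited longer.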

Marshmallow Test Replication Study

In a 2018 paper, Tyler Watts, an assistant professor and postdoctoral researcher at New York University, Greg Duncan, a professor at UC Irvine, and Haonan Quan, a doctoral student at UC Irvine, set out to replicate longitudinal studies based on Prof. Mischel’s data.

Data on 918 individuals from a longitudinal, multi-center study on children by the National Institute of Child Health and Human Development (an institute in the NIH) were used for the study.

The sample was split into two groups –

  • Data on children of mothers who had not completed college by the time their child was one month old (n = 552);
  • Data on children of mothers who had completed college by that time (n = 366).

The first group (children of mothers without degrees) was more comparable to a nationally representative sample (from the Early Childhood Longitudinal Survey—Kindergarten by the National Center for Education Statistics). Even so, Hispanic children were underrepresented in the sample.

A variant of the marshmallow test was administered to children when they were 4.5 years old. An interviewer presented each child with treats based on the child’s own preferences.

Children were then told they would play the following game with the interviewer –
  • The interviewer would leave the child alone with the treat;
  • If the child waited 7 minutes, the interviewer would return, and the child would then be able to eat the treat plus an additional portion as a reward for waiting;
  • If the child did not want to wait, they could ring a bell to signal the interviewer to return early, and the child would then be able to eat the treat without an additional portion.

Delay of gratification was recorded as the number of minutes the child waited.

Academic achievement was measured at grade 1 and age 15. Measures included mathematical problem solving, word recognition and vocabulary (only in grade 1), and textual passage comprehension (only at age 15). Scores were normalized to have a mean of 100 points and a standard deviation of 15 points.
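Assuming the achievement scale has a mean of 100 and a standard deviation of 15, the rescaling can be sketched in Python as below. The function name and raw scores are hypothetical. Note one loud assumption: this sketch rescales against the sample it is given, whereas standardized achievement tests are typically normed against an external reference population.

```python
from statistics import mean, pstdev

def to_standard_scores(raw, target_mean=100.0, target_sd=15.0):
    """Linearly rescale raw scores to the target mean and SD.

    Assumption: the sample's own mean and population SD are the reference.
    """
    m = mean(raw)
    s = pstdev(raw)
    if s == 0:
        raise ValueError("cannot rescale scores with zero variance")
    return [target_mean + target_sd * (x - m) / s for x in raw]

# Hypothetical raw test scores for three children.
scaled = to_standard_scores([10, 20, 30])
```

On this scale, a score of 115 sits one standard deviation above the mean, which makes results from different tests directly comparable.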

Behavioral functioning was measured at age 4.5, grade 1, and age 15. Mothers were asked to score their child’s depressive and anti-social behaviors on 3-point Likert-scale items.

For intra-group regression analyses, the following socio-economic variables, measured at or before age 4.5, were controlled for –

  • Demographic characteristics like gender, race, birth weight, mother’s age at child’s birth, mother’s level of education, family income, mother’s score in a measure-of-intelligence test;
  • Cognitive functioning characteristics like sensory-perceptual abilities, memory, problem-solving, verbal communication skills; and
  • Home environment characteristics known to support positive cognitive, emotional, and behavioral functioning (the HOME Inventory by Caldwell & Bradley, 1984).

[Figure: marshmallow test replication results]

  • Watts, Duncan, and Quan (2018) did find statistically significant correlations between early-stage ability to delay gratification and later-stage academic achievement, but the association was weaker than that found by researchers using Prof. Mischel’s data.
  • In addition, the significance of these bivariate associations disappeared after controlling for socio-economic and cognitive variables.
  • There were no statistically significant associations, even without controlling for confounding variables, between early gratification delay and later behavioral functioning at age 15.

Conclusions

These results further complicated the relationship between early delay ability and later life outcomes.

Prof. Mischel’s findings, from a small, non-representative cohort of mostly middle-class preschoolers at Stanford’s Bing Nursery School, were not replicated in a larger, more representative sample of preschool-aged children.

Increasing Delayed Gratification

The following factor has been found to increase a child’s gratification delay time –

Trust in rewarders:

Children who trust that they will be rewarded for waiting are significantly more likely to wait than those who don’t. Kidd, Palmeri, and Aslin, 2013, replicating Prof. Mischel’s marshmallow study, tested 28 four-year-olds twice.

In the first test, half of the children didn’t receive the treat they’d been promised. In the second test, the children who’d been tricked before were significantly less likely to delay gratification than those who hadn’t been tricked.

The following factors may increase an adult’s gratification delay time –

Knowledge of time-to-reward:

Individuals who know how long they must wait for an expected reward are more likely to continue waiting for said reward than those who don’t.

McGuire and Kable (2012) tested 40 adult participants. One group was given known reward times, while the other was not. The first group was significantly more likely to delay gratification.

Probability of the expected reward materializing:

This factor applies when the individuals delaying their gratification are also the ones setting their reward.

For example, among people dieting to reach a desired weight, those who set realistic rewards are more likely to keep waiting for their reward than those who set unrealistic or improbable rewards.

Gelinas et al. (2013) studied the association between unrealistic weight loss expectations and weight gain before a weight-loss surgery in 219 adult participants.

The correlation coefficient r = 0.377 was statistically significant at p < 0.008 for male (n = 53) but not female (n = 166) participants.

What is the marshmallow test?

The Marshmallow Test is a psychological experiment conducted by Walter Mischel in the 1960s. In this study, a child was offered a choice between one small reward (like a marshmallow) immediately or two small rewards if they waited for a short period, usually 15 minutes, during which the tester left the room.

What does the marshmallow test measure?

The marshmallow test measures a child’s ability to delay gratification by offering them a choice between eating a marshmallow immediately and waiting a short period to receive an additional marshmallow.

It assesses self-control, impulse control, and the capacity to delay instant gratification, which is connected to future success and self-regulation skills.

Ayduk, O., Mendoza-Denton, R., Mischel, W., Downey, G., Peake, P. K., & Rodriguez, M. (2000). Regulating the interpersonal self: Strategic self-regulation for coping with rejection sensitivity. Journal of Personality and Social Psychology, 79 (5), 776.

Bradley, R. H., & Caldwell, B. M. (1984). The HOME Inventory and family demographics. Developmental Psychology, 20 (2), 315.

Gelinas, B. L., Delparte, C. A., Hart, R., & Wright, K. D. (2013). Unrealistic weight loss goals and expectations among bariatric surgery candidates: The impact on pre- and postsurgical weight outcomes. Bariatric Surgical Patient Care, 8 (1), 12-17.

Kidd, C., Palmeri, H., & Aslin, R. N. (2013). Rational snacking: Young children’s decision-making on the marshmallow task is moderated by beliefs about environmental reliability. Cognition, 126 (1), 109-114.

McGuire, J. T., & Kable, J. W. (2012). Decision makers calibrate behavioral persistence on the basis of time-interval experience. Cognition, 124 (2), 216-226.

Mischel, W., & Ebbesen, E. B. (1970). Attention in delay of gratification. Journal of Personality and Social Psychology, 16 (2), 329.

Mischel, W., Ebbesen, E. B., & Raskoff Zeiss, A. (1972). Cognitive and attentional mechanisms in delay of gratification. Journal of Personality and Social Psychology, 21 (2), 204.

Schlam, T. R., Wilson, N. L., Shoda, Y., Mischel, W., & Ayduk, O. (2013). Preschoolers’ delay of gratification predicts their body mass 30 years later. The Journal of Pediatrics, 162 (1), 90-93.

Shoda, Y., Mischel, W., & Peake, P. K. (1990). Predicting adolescent cognitive and self-regulatory competencies from preschool delay of gratification: Identifying diagnostic conditions. Developmental Psychology, 26 (6), 978.

Watts, T. W., Duncan, G. J., & Quan, H. (2018). Revisiting the marshmallow test: A conceptual replication investigating links between early delay of gratification and later outcomes. Psychological Science, 29 (7), 1159-1177.

Keep Learning

  • Cohort Effects in Children’s Delay of Gratification
  • Predicting adolescent cognitive and self-regulatory competencies from preschool delay of gratification: Identifying diagnostic conditions
  • Delay of Gratification as Reputation Management


40 Years of Stanford Research Found That People With This One Quality Are More Likely to Succeed

In the 1960s, a Stanford professor named Walter Mischel began conducting a series of important psychological studies.

During his experiments, Mischel and his team tested hundreds of children — most of them around the ages of 4 and 5 years old — and revealed what is now believed to be one of the most important characteristics for success in health, work, and life.

Let’s talk about what happened and, more importantly, how you can use it.

The Marshmallow Experiment

The experiment began by bringing each child into a private room, sitting them down in a chair, and placing a marshmallow on the table in front of them.

At this point, the researcher offered a deal to the child.

The researcher told the child that he was going to leave the room and that if the child did not eat the marshmallow while he was away, then they would be rewarded with a second marshmallow. However, if the child decided to eat the first one before the researcher came back, then they would not get a second marshmallow.

So the choice was simple: one treat right now or two treats later.

The researcher left the room for 15 minutes.

As you can imagine, the footage of the children waiting alone in the room was rather entertaining. Some kids jumped up and ate the first marshmallow as soon as the researcher closed the door. Others wiggled and bounced and scooted in their chairs as they tried to restrain themselves, but eventually gave in to temptation a few minutes later. And finally, a few of the children did manage to wait the entire time.

Published in 1972, this popular study became known as The Marshmallow Experiment, but it wasn’t the treat that made it famous. The interesting part came years later.

The Power of Delayed Gratification

As the years rolled on and the children grew up, the researchers conducted follow up studies and tracked each child’s progress in a number of areas. What they found was surprising.

The children who were willing to delay gratification and waited to receive the second marshmallow ended up having higher SAT scores, lower levels of substance abuse, lower likelihood of obesity, better responses to stress, better social skills as reported by their parents, and generally better scores in a range of other life measures.

The researchers followed each child for more than 40 years and over and over again, the group who waited patiently for the second marshmallow succeeded in whatever capacity they were measuring. In other words, this series of experiments proved that the ability to delay gratification was critical for success in life.

And if you look around, you’ll see this playing out everywhere…

  • If you delay the gratification of watching television and get your homework done now, then you’ll learn more and get better grades.
  • If you delay the gratification of buying desserts and chips at the store, then you’ll eat healthier when you get home.
  • If you delay the gratification of finishing your workout early and put in a few more reps, then you’ll be stronger.

… and countless other examples.

Success usually comes down to choosing the pain of discipline over the ease of distraction. And that’s exactly what delayed gratification is all about.

This brings us to an interesting question: Did some children naturally have more self-control, and thus were destined for success? Or can you learn to develop this important trait?

What Determines Your Ability to Delay Gratification?

Researchers at the University of Rochester decided to replicate the marshmallow experiment, but with an important twist.

Before offering the child the marshmallow, the researchers split the children into two groups.

The first group was exposed to a series of unreliable experiences. For example, the researcher gave the child a small box of crayons and promised to bring a bigger one, but never did. Then the researcher gave the child a small sticker and promised to bring a better selection of stickers, but never did.

Meanwhile, the second group had very reliable experiences. They were promised better crayons and got them. They were told about the better stickers and then they received them.

You can imagine the impact these experiences had on the marshmallow test. The children in the unreliable group had no reason to trust that the researchers would bring a second marshmallow and thus they didn’t wait very long to eat the first one.

Meanwhile, the children in the second group were training their brains to see delayed gratification as a positive. Every time the researcher made a promise and then delivered on it, the child’s brain registered two things: 1) waiting for gratification is worth it and 2) I have the capability to wait. As a result, the second group waited an average of four times longer than the first group.

In other words, the child’s ability to delay gratification and display self-control was not a predetermined trait, but rather was impacted by the experiences and environment that surrounded them. In fact, the effects of the environment were almost instantaneous. Just a few minutes of reliable or unreliable experiences were enough to push the actions of each child in one direction or another.

What can you and I learn from all of this?

How to Become Better at Delaying Gratification

Before we go further, let’s clear one thing up: for one reason or another, the Marshmallow Experiment has become particularly popular. You’ll find it mentioned in nearly every major media outlet. But these studies are just one piece of data, a small insight into the story of success. Human behavior (and life in general) is a lot more complex than that, so let’s not pretend that one choice a four-year-old makes will determine the rest of his or her life.

The studies above do make one thing clear: if you want to succeed at something, at some point you will need to find the ability to be disciplined and take action instead of becoming distracted and doing what’s easy. Success in nearly every field requires you to ignore doing something easier (delaying gratification) in favor of doing something harder (doing the work and putting in your reps).

But the key takeaway here is that even if you don’t feel like you’re good at delaying gratification now, you can train yourself to become better simply by making a few small improvements. In the case of the children in the study, this meant being exposed to a reliable environment where the researcher promised something and then delivered it.

You and I can do the same thing. We can train our ability to delay gratification, just like we can train our muscles in the gym. And you can do it in the same way as the child and the researcher: by promising something small and then delivering. Over and over again until your brain says, 1) yes, it’s worth it to wait and 2) yes, I have the capability to do this.

Here are 4 simple ways to do exactly that:

  • Start incredibly small. Make your new habit “so easy you can’t say no.” (Hat tip to Leo Babauta.)
  • Improve one thing, by one percent. Do it again tomorrow.
  • Use the “Seinfeld Strategy” to maintain consistency.
  • Find a way to get started in less than 2 minutes.


James Clear writes about habits, decision making, and continuous improvement. He is the author of the #1 New York Times bestseller, Atomic Habits. The book has sold over 20 million copies worldwide and has been translated into more than 60 languages.


The Marshmallow Test: Delayed Gratification in Children


The marshmallow test, which was created by psychologist Walter Mischel, is one of the most famous psychological experiments ever conducted. The test lets young children decide between an immediate reward, or, if they delay gratification, a larger reward. Studies by Mischel and colleagues found that children’s ability to delay gratification when they were young was correlated with positive future outcomes. More recent research has shed further light on these findings and provided a more nuanced understanding of the future benefits of self-control in childhood.

Key Takeaways: The Marshmallow Test

  • The marshmallow test was created by Walter Mischel. He and his colleagues used it to test young children’s ability to delay gratification.
  • In the test, a child is presented with the opportunity to receive an immediate reward or to wait to receive a better reward.
  • A relationship was found between children’s ability to delay gratification during the marshmallow test and their academic achievement as adolescents.
  • More recent research has added nuance to these findings showing that environmental factors, such as the reliability of the environment, play a role in whether or not children delay gratification.
  • Contrary to expectations, children’s ability to delay gratification during the marshmallow test has increased over time.

The Original Marshmallow Test

The original version of the marshmallow test used in studies by Mischel and colleagues consisted of a simple scenario. A child was brought into a room and presented with a reward, usually a marshmallow or some other desirable treat. The child was told that the researcher had to leave the room but if they could wait until the researcher returned, the child would get two marshmallows instead of just the one they were presented with. If they couldn’t wait, they wouldn’t get the more desirable reward. The researcher would then leave the room for a specific amount of time (typically 15 minutes but sometimes as long as 20 minutes) or until the child could no longer resist eating the single marshmallow in front of them.

Over six years in the late 1960s and early 1970s, Mischel and colleagues repeated the marshmallow test with hundreds of children who attended the preschool on the Stanford University campus. The children were between 3 and 5 years old when they participated in the experiments. Variations on the marshmallow test used by the researchers included different ways to help the children delay gratification, such as obscuring the treat in front of the child or giving the child instructions to think about something else in order to get their mind off the treat they were waiting for.

Years later, Mischel and colleagues followed up with some of their original marshmallow test participants. They discovered something surprising. Those individuals who were able to delay gratification during the marshmallow test as young children rated significantly higher on cognitive ability and the ability to cope with stress and frustration in adolescence. They also earned higher SAT scores.

These results led many to conclude that the ability to pass the marshmallow test and delay gratification was the key to a successful future. However, Mischel and his colleagues were always more cautious about their findings. They suggested that the link between delayed gratification in the marshmallow test and future academic success might weaken if a larger number of participants were studied. They also observed that factors like the child’s home environment could be more influential on future achievement than their research could show.

Recent Findings

The relationship Mischel and colleagues found between delayed gratification in childhood and future academic achievement garnered a great deal of attention. As a result, the marshmallow test became one of the most well-known psychological experiments in history. Yet, recent studies have used the basic paradigm of the marshmallow test to determine how Mischel’s findings hold up in different circumstances.

Delayed Gratification and Environmental Reliability

In 2013, Celeste Kidd, Holly Palmeri, and Richard Aslin published a study that added a new wrinkle to the idea that delayed gratification was the result of a child’s level of self-control. In the study, each child was primed to believe the environment was either reliable or unreliable. In both conditions, before doing the marshmallow test, the child participant was given an art project to do. In the unreliable condition, the child was provided with a set of used crayons and told that if they waited, the researcher would get them a bigger, newer set. The researcher would leave and return empty-handed after two and a half minutes. The researcher would then repeat this sequence of events with a set of stickers. The children in the reliable condition experienced the same set up, but in this case the researcher came back with the promised art supplies.

The children were then given the marshmallow test. Researchers found that those in the unreliable condition waited only about three minutes on average to eat the marshmallow, while those in the reliable condition managed to wait for an average of 12 minutes—substantially longer. The findings suggest that children’s ability to delay gratification isn’t solely the result of self-control. It’s also a rational response to what they know about the stability of their environment.

Thus, the results show that nature and nurture play a role in the marshmallow test. A child’s capacity for self-control combined with their knowledge of their environment leads to their decision about whether or not to delay gratification.

Marshmallow Test Replication Study

In 2018, another group of researchers, Tyler Watts, Greg Duncan, and Haonan Quan, performed a conceptual replication of the marshmallow test. The study wasn’t a direct replication because it didn’t recreate Mischel and his colleagues’ exact methods. The researchers still evaluated the relationship between delayed gratification in childhood and future success, but their approach was different. Watts and his colleagues utilized longitudinal data from the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development, a diverse sample of over 900 children.

In particular, the researchers focused their analysis on children whose mothers hadn’t completed college when they were born—a subsample of the data that better represented the racial and economic composition of children in America (although Hispanics were still underrepresented). Each additional minute a child delayed gratification predicted small gains in academic achievement in adolescence, but the increases were much smaller than those reported in Mischel’s studies. Plus, when factors like family background, early cognitive ability, and home environment were controlled for, the association virtually disappeared.
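To make the idea of "controlling for" a background factor concrete, here is a minimal sketch on purely synthetic data. It is not the replication study's data or its actual statistical model. The variable names and the coefficients (0.7, 0.05) are assumptions chosen only so that a hypothetical background factor drives both delay time and later achievement. The raw slope of achievement on delay looks sizable, but it shrinks sharply once the background factor is partialled out of both variables.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def residuals(x, y):
    """The part of y that is not linearly explained by x."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]


random.seed(0)
n = 5000

# Hypothetical background factor (e.g. a family-background index); synthetic.
background = [random.gauss(0, 1) for _ in range(n)]
# Delay time depends partly on background (assumed coefficient 0.7).
delay = [0.7 * b + random.gauss(0, 1) for b in background]
# Achievement depends strongly on background and only weakly on delay.
achievement = [
    0.7 * b + 0.05 * d + random.gauss(0, 1)
    for b, d in zip(background, delay)
]

# Raw association: regress achievement on delay alone.
raw = slope(delay, achievement)

# Adjusted association: remove the background factor from both variables,
# then regress the leftover achievement on the leftover delay.
adjusted = slope(
    residuals(background, delay),
    residuals(background, achievement),
)

print(f"raw slope:      {raw:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
```

In this toy setup, most of the raw association comes from the shared background factor rather than from delay itself, which is the general pattern the article describes.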

The results of the replication study have led many outlets reporting the news to claim that Mischel’s conclusions had been debunked. However, things aren’t quite so black and white. The new study demonstrated what psychologists already knew: that factors like affluence and poverty will impact one’s ability to delay gratification. The researchers themselves were measured in their interpretation of the results. Lead researcher Watts cautioned, “…these new findings should not be interpreted to suggest that gratification delay is completely unimportant, but rather that focusing only on teaching young children to delay gratification is unlikely to make much of a difference.” Instead, Watts suggested that interventions that focus on the broad cognitive and behavioral capabilities that help a child develop the ability to delay gratification would be more useful in the long term than interventions that only help a child learn to delay gratification.

Cohort Effects in Delayed Gratification

With mobile phones, streaming video, and on-demand everything today, it's a common belief that children's ability to delay gratification is deteriorating. In order to investigate this hypothesis, a group of researchers, including Mischel, conducted an analysis comparing American children who took the marshmallow test in the 1960s, 1980s, or 2000s. The children all came from similar socioeconomic backgrounds and were all 3 to 5 years old when they took the test.

Contrary to popular expectations, children’s ability to delay gratification increased in each birth cohort. The children who took the test in the 2000s delayed gratification for an average of 2 minutes longer than the children who took the test in the 1960s and 1 minute longer than the children who took the test in the 1980s.

The researchers suggested that the results can be explained by increases in IQ scores over the past several decades, which are linked to changes in technology, the increase in globalization, and changes in the economy. They also noted that the use of digital technology has been associated with an increased ability to think abstractly, which could lead to better executive function skills, such as the self-control associated with delayed gratification. Increased preschool attendance could also help account for the results.

Nonetheless, the researchers cautioned that their study wasn’t conclusive. Future research with more diverse participants is needed to see if the findings hold up with different populations as well as what might be driving the results.

  • American Psychological Association. "Can Kids Wait? Today's Youngsters May Be Able to Delay Gratification Longer Than Those of the 1960's." 25 June, 2018. https://www.apa.org/news/press/releases/2018/06/delay-gratification
  • Association for Psychological Science. "A New Approach to the Marshmallow Test Yields Complicated Findings." 5 June, 2018. https://www.psychologicalscience.org/publications/observer/obsonline/a-new-approach-to-the-marshmallow-test-yields-complex-findings.html
  • Carlson, Stephanie M., Yuichi Shoda, Ozlem Ayduk, Lawrence Aber, Catherine Schaefer, Anita Sethi, Nicole Wilson, Philip K. Peake, and Walter Mischel. "Cohort Effects in Children's Delay of Gratification." Developmental Psychology, vol. 54, no. 8, 2018, pp. 1395-1407. http://dx.doi.org/10.1037/dev0000533
  • Kidd, Celeste, Holly Palmeri, and Richard N. Aslin. "Rational Snacking: Young Children's Decision-Making on the Marshmallow Task is Moderated By Beliefs About Environmental Reliability." Cognition, vol. 126, no. 1, 2013, pp. 109-114. https://doi.org/10.1016/j.cognition.2012.08.004
  • New York University. "Professor Replicates Famous Marshmallow Test, Makes New Observations." ScienceDaily, 25 May, 2018. https://www.sciencedaily.com/releases/2018/05/180525095226.htm
  • Shoda, Yuichi, Walter Mischel, and Philip K. Peake. "Predicting Adolescent Cognitive and Self-Regulatory Competencies from Preschool Delay of Gratification: Identifying Diagnostic Conditions." Developmental Psychology, vol. 26, no. 6, 1990, pp. 978-986. http://dx.doi.org/10.1037/0012-1649.26.6.978
  • University of Rochester. "The Marshmallow Study Revisited." 11 October, 2012. https://www.rochester.edu/news/show.php?id=4622
  • Watts, Tyler W., Greg J. Duncan, and Haonan Quan. "Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes." Psychological Science, vol. 29, no. 7, 2018, pp. 1159-1177. https://doi.org/10.1177/0956797618761661

Effectiviology

The Stanford Marshmallow Experiment: How Self-Control Affects Success in Life

The Marshmallow Experiment

The Stanford marshmallow experiment was a psychological study conducted in the late 1960s to early 1970s, in which children were placed in a room with some tasty snack, such as a marshmallow, and told that if they could wait for a short while before eating it then they would get an extra snack as a reward. Follow-up studies on the experiment found that children’s ability to exercise self-control in this situation, by waiting before eating the snack, was correlated with a large range of positive outcomes later in life, such as academic success and physical health.

This experiment received much attention in popular media, and was used to demonstrate the importance of self-control, a concept which was supported by other studies on the topic.

However, later studies criticized the Stanford marshmallow experiment for various issues with its methodology. Furthermore, the results of a large replication study cast doubt on the predictive abilities of the marshmallow test, especially when controlling for relevant background factors such as family background and home environment.

Nevertheless, despite these criticisms, the Stanford marshmallow experiment remains of interest, due to the notable influence it had on psychological research of self-control and on people’s perception of the topic.

As such, in the following article you will learn more about the Stanford marshmallow experiment and about related research on the importance of self-control, see the main criticisms of this study, and learn how you can use a few simple techniques in order to strengthen your own self-control when necessary.

The procedure and results of the Stanford marshmallow experiment

The initial data collection for the Stanford marshmallow experiment took place between 1968 and 1972, using toddlers and preschoolers around the age of 4, who attended Stanford University’s Bing Nursery School.

The main procedure for the experiment was as follows:

  • First, a child was taken into a room and allowed to pick a snack that they would like to eat, such as a marshmallow, a pretzel, or a cookie.
  • Then, the child was told that the researcher had to leave the room for a few minutes, and that if they could wait until the researcher came back without eating the snack, then they would get another snack of their choice as a reward.

The children’s ability to delay gratification was measured by seeing whether they were able to wait until the researcher returned without eating the snack, and if not, then by seeing how long it took before they ate the snack or called the researcher back into the room.

Even though the experiment was short and simple, the researchers found that the children’s performance on this test at an early age predicted their long-term success in various ways. Specifically, kids who were able to wait longer before eating the snack were:

  • More likely to be rated by their parents as academically and socially competent, verbally fluent, attentive, and rational, when they were older.
  • Better able to deal with frustration and stress as adolescents.
  • More likely to have higher SAT scores as adolescents.
  • Less likely to be overweight 30 years later.

Note: the main researcher associated with the Stanford marshmallow experiment is psychologist Walter Mischel, who, together with his colleagues, published the initial studies on the experiment in 1970 and 1972, as well as the later follow-up studies. Two other notable researchers associated with this experiment are Ebbe B. Ebbesen, who was involved with the initial studies, and Yuichi Shoda, who was involved with the follow-up studies.

Other studies using the marshmallow test

Several studies used the marshmallow test in order to examine the factors that affect children’s performance on it.

For example, studies found that trust plays a significant role in children’s decision to wait on the marshmallow task. This was the case both when it came to specific trust in the person conducting the experiment, who promised the reward to the children if they could wait, as well as when it came to children’s generalized trust in unfamiliar people. Furthermore, similar results regarding the influence of social trust were also found in delayed-gratification tests conducted on adults.

In addition, one study found that children delayed gratification for longer if they believed that members of their ingroup, which is the social group that they identify as being a part of, also waited, while members of their outgroup did not, compared to if they believed that the opposite was true.

Finally, another study compared children’s performance on the marshmallow test when it came to three birth cohorts, from the late 1960s, 1980s, and 2000s, and found that, contrary to people’s expectations, children’s ability to delay gratification has been increasing over time, a finding that has been replicated in other studies.

Other research on the importance of self-control

“People who have better control of their attention, emotions, and actions are better off almost any way you look at it. They are happier and healthier. Their relationships are more satisfying and last longer. They make more money and go further in their careers. They are better able to manage stress, deal with conflict, and overcome adversity. They even live longer. When pit against other virtues, willpower comes out on top. Self-control is a better predictor of academic success than intelligence… a stronger determinant of effective leadership than charisma…. and more important for marital bliss than empathy…” — Kelly McGonigal in “ The Willpower Instinct: How Self-Control Works, Why It Matters, and What You Can Do to Get More of It “

Other research on the topic of self-control, which used different methods than the Stanford marshmallow experiment, supports the idea that self-control, as measured early in life, is associated with a range of positive outcomes later on.

For example, one study found that childhood self-control predicts employment rates at adulthood, with individuals who are low in self-control being more likely to be unemployed.

Similarly, another study found that self-control at childhood predicts factors such as financial status, physical health, substance dependence, and criminal offending at adulthood, with higher levels of self-control leading to better outcomes. This remained the case even when the researchers controlled for background factors such as intelligence and familial socioeconomic status, though these factors did play a crucial role in children’s development. A later study replicated these findings, though its results emphasized, to a greater degree, the role that relevant background factors play in children’s development.

Furthermore, research on self-control found that this factor also plays an important role in predicting people’s success when measured directly during adulthood.

For example, a study conducted on people participating in a weight-loss program found that higher levels of self-control were associated with increased weight loss during the program, as a result of eating less and exercising more.

Similarly, a study conducted on university students showed that higher levels of self-control are correlated with “a higher grade point average, better adjustment (fewer reports of psychopathology, higher self-esteem), less binge eating and alcohol abuse, better relationships and interpersonal skills, secure attachment, and more optimal emotional responses”.

Overall, these studies, together with other studies on the topic, demonstrate that self-control, measured both during childhood and at later stages of life, is associated with a range of positive outcomes, which suggests that it’s an important ability to have.

Related concepts and terms

The marshmallow experiment focused on people’s ability to delay gratification, a facet of self-control that’s sometimes referred to as “patience”. However, the experiment has been found to be a good predictor of self-control in general, meaning that it can be used to predict people’s ability to exercise control in other ways, such as by bringing themselves to do something that they feel anxious about.

In general, self-control is crucial to people’s ability to self-regulate their behavior in pursuit of their goals. This ability is also affected by their executive functions, which are the cognitive processes and abilities, such as task-switching and behavioral inhibition, that are used to control one’s behavior.

A notable, related concept in psychology is conscientiousness, which is the trait of being disciplined, achievement-oriented, organized, and focused, since this trait is one of the strongest predictors of people’s ability to delay gratification.

Note: the term ‘willpower’ is sometimes used in place of the term ‘self-control’, though it’s also possible to view willpower as something that people use while they’re exercising self-control.

Criticism and replications of the Stanford marshmallow experiment

Though the Stanford marshmallow experiment gained much positive attention in the research community and the press, it has also been heavily criticized by various groups. The main criticisms of the Stanford marshmallow experiment include the following:

  • The initial sample for the experiment was highly selective, as it consisted of children from the Stanford University community.
  • The samples used in the longitudinal studies on the experiment were small and even more selective than the initial sample, since they contained only the children examined in the original experiment that the researchers were able to reach later.
  • The analyses of the data didn’t always account for potential confounding factors, such as family socioeconomic status and general cognitive abilities.

In light of these criticisms, a large replication study was conducted to assess the validity of the findings from the Stanford marshmallow experiment. This replication examined how well preschoolers’ ability to delay gratification on the marshmallow test predicted a variety of academic and behavioral outcomes at age 15.

The researchers considered their study to be “a conceptual, rather than traditional, replication of Mischel and Shoda’s seminal work”, since there were some notable differences between their replication and the original work on the topic. These differences included a larger sample, a focus on children born to mothers who had not completed college, and the use of a modified version of the original marshmallow experiment.

The replication did find that the ability to delay gratification at the age of 4 predicted increased achievement at the age of 15. However, the effect size of this association was only half as big as in the original studies, and was reduced by two thirds when the researchers controlled for relevant factors, such as family background, home environment, and early cognitive ability.
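As a rough back-of-the-envelope reading of these two figures, the sketch below normalizes the original effect to 1 and applies the reductions in sequence. It assumes that "reduced by two thirds" is relative to the replication's own uncontrolled estimate, and it treats "half" and "two thirds" as the rounded descriptions given here rather than exact values from the study.

```python
from fractions import Fraction

# Original effect size, normalized to 1 for illustration (an assumption).
original = Fraction(1)

# "Only half as big as in the original studies".
replication = original * Fraction(1, 2)

# "Reduced by two thirds" once background factors are controlled for,
# read here as a reduction of the replication's uncontrolled estimate.
with_controls = replication * (1 - Fraction(2, 3))

print(with_controls)  # 1/6
```

Under that reading, the association that remains after controls is roughly one sixth the size of the one reported in the original studies.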

Furthermore, the researchers found that most of the achievement boost from the early ability to delay gratification came from the ability to wait for only 20 seconds. This calls into question the hypothesis proposed by the original researchers, that the relationship between the ability to delay gratification and later academic achievement is driven primarily by the ability to utilize relevant metacognitive strategies, since such strategies are unlikely to have played a significant role in children’s ability to wait only 20 seconds.

The findings of this replication were supported by another replication, which found that the ability to delay gratification at age 4.5 did not predict children’s academic achievement at age 15, once relevant background variables were controlled for.

In addition, a different replication of the original study, which followed the original protocol more closely but used a smaller sample, found that the ability to delay gratification at the age of 4 did not predict children’s performance, more than a decade later, at a task requiring cognitive control. However, the children’s ability to direct their attention away from the rewarding stimuli was associated with increased efficiency at the task, in terms of being able to perform it at greater speed without reduced accuracy.

Moreover, a follow-up study on the original sample from the Stanford marshmallow experiment found that there is no significant relationship between people’s delay of gratification at preschool age and their economic outcomes in their late 40s. Nevertheless, the study did find that there is an association between more comprehensive measures of self-regulation at later ages and people’s economic outcomes in their 40s.

Finally, however, it’s important to note that some of the research criticizing the Stanford marshmallow experiment has itself been criticized. For example, the main replication on the topic has been criticized for various reasons, as evident, for instance, in a paper on the topic, which argues that “many of the variables in their models should not have been included as confounds because they likely captured factors that measure fundamental processes supporting delay of gratification”.

Overall, the criticisms and replications of the Stanford marshmallow experiment cast doubt on its validity. Nevertheless, given the large body of supporting evidence on the topic, research suggests that self-control does play an important role when it comes to success in life, both when measured during childhood and when measured during adulthood. This suggests that the main issue with the marshmallow experiment is its methodology, which is simple and appealing, but not sufficiently robust.

The cognitive mechanisms of self-control

Based on the findings of the Stanford marshmallow experiment, researchers suggest that we engage two cognitive systems when faced with a situation that requires self-control:

  • Hot system. The hot system is our impulsive, emotional system. Hot behaviors, which rely on this system, include things such as fixating on rewards (e.g. imagining what a marshmallow will taste like). These behaviors undermine our self-control, and make it more difficult for us to resist temptation.
  • Cool system. The cool system is our rational, emotionally-neutral system. Cool strategies, which rely on this system, include things such as successful self-distraction (e.g. playing a game which is unrelated to potential temptations). These strategies help us exercise self-control, and successfully delay gratification.

Based on these mechanisms, we can say that our self-control is affected by our ability to inhibit the occurrence of hot behaviors, by utilizing cool strategies.

Lessons from the marshmallow experiment on exercising self-control

Though the marshmallow test is primarily known for illustrating the importance of self-control, it also provides several insights into how people can learn to better exercise their self-control.

For example, one of the original studies on the Stanford marshmallow experiment describes several factors that affected the children’s ability to exercise self-control during the test:

  • Children who were told to distract themselves by playing with a toy or by thinking about playing with one were able to delay gratification for longer.
  • Children who were told to think about “fun things” were able to wait for significantly longer than those who were told to think “sad thoughts”.
  • Children who were told to spend their time thinking about the rewards of the test generally struggled to delay gratification.

Furthermore, the studies on the topic also demonstrate how the children coped with temptation, even when they weren’t instructed how to do so by the researchers. As the first study on the topic states:

“One of the most striking delay strategies used by some subjects was exceedingly simple and effective. These children seemed to facilitate their waiting by converting the aversive waiting situation into a more pleasant nonwaiting one. They devised elaborate self-distraction techniques through which they spent their time psychologically doing something (almost anything) other than waiting. Instead of focusing prolonged attention on the objects for which they were waiting, they avoided looking at them. Some children covered their eyes with their hands, rested their heads on their arms, and found other similar techniques for averting their eyes from the reward objects. Many seemed to try to reduce the frustration of delay of reward by generating their own diversions: they talked to themselves, sang, invented games with their hands and feet, and even tried to fall asleep while waiting—as one child successfully did… These observations, while obviously inconclusive, suggest that diverting one’s attention away from the delayed reward (while maintaining behavior directed toward its ultimate attainment) may be a key step in bridging temporal delay of reward. That is, learning not to think about what one is awaiting may enhance delay of gratification, much more than does ideating about the outcomes.”

This means that, in order to help yourself exercise self-control in the face of temptation, you want to avoid obsessing about the potential reward that you’re tempted by or fixating on the difficulty of resisting it. Instead, as soon as you recognize yourself starting to fall into one of these negative thought patterns, you need to mentally “exit” it as quickly as possible.

You can do this by distracting yourself and taking part in unrelated positive experiences, such as reading a book, playing a game, or talking to a friend. The more positive the experience, and the more it can distract you from the potential reward, the more it will help you exercise restraint and self-control.

This may sound difficult to accomplish, but studies show that self-control training can be beneficial in the long term, and that you can strengthen your self-control through the regular practice of small acts of self-control. As the main book on the topic states:

“…the ability to delay immediate gratification for the sake of future consequences is an acquirable cognitive skill.” — Walter Mischel in “ The Marshmallow Test: Why Self-Control Is the Engine of Success “

This is important, since it means that doing something such as reducing your snacking behavior can later help you exercise self-control in unrelated areas, such as pushing yourself at the gym or fighting against your procrastination tendencies when it comes to doing work.

Note: the book written about the marshmallow test discusses other techniques that you can use to improve your self-control, such as increasing your connection to your future self and creating if-then implementation plans.

Summary and conclusions

  • The Stanford marshmallow experiment was a psychological study conducted in the late 1960s to early 1970s, in which children were placed in a room with some tasty snack, such as a marshmallow, and told that if they could wait for a short while before eating it then they would get an extra snack as a reward.
  • Follow-up studies on the experiment found that children’s ability to exercise self-control in this situation, by waiting before eating the snack, was correlated with a large range of positive outcomes later in life, such as academic success and physical health.
  • The validity of the marshmallow experiment has been questioned by a number of studies, but also supported by related research on the topic, and overall, it appears that while the marshmallow test is flawed in some ways, self-control nevertheless plays an important role in people’s development.
  • The researchers who conducted the Stanford marshmallow experiment suggested that the ability to delay gratification depends primarily on the ability to engage our cool, rational cognitive system, in order to inhibit our hot, impulsive system.
  • Therefore, to improve your ability to exercise self-control, you can focus on using relevant cool strategies, such as distracting yourself from tempting rewards, in order to inhibit hot behaviors, such as obsessing about the difficulty of resisting a certain temptation.

If you found this concept interesting and you want to learn more about it, read the main book on the topic, which was written by the primary researcher involved with the study: “The Marshmallow Test: Why Self-Control Is the Engine of Success”.

Science Behind Kids and Marshmallows

German and Cameroonian kids were part of an experiment based on the classic "marshmallow test": Put a single treat before a child but tell the child if he or she waits, say, 10 minutes, a second treat will be given.

by Michaeleen Doucleff, NPR

In the 1960s, a Stanford psychologist ran an experiment to study children's self-control.

It's called the marshmallow test. And it's super simple.

Kids ages 3 to 5 choose a treat: an Oreo cookie, a pretzel stick or a marshmallow. Then researchers give the child brief instructions: You can eat the treat now, but if you can wait for me to return, you'll get two treats.

The researchers leave the room. And the child just has to sit there staring at a marshmallow, deciding whether to exert self-control or to dig in.

Psychologists have performed the experiment many times. In general, fewer than half the kids "pass" the test. Most kids can't delay gratification: They gobble up the marshmallow.

The researchers followed these kids for decades and found that those who did wait were more likely to have better SAT scores and better jobs later on in life, the researchers reported.

Now for the first time, there's a study reporting on what happens when psychologists give the marshmallow test to kids outside Western culture, specifically 4-year-old children from the ethnic group Nso in Cameroon.

"The Nso are a community who live off subsistence farming, mainly corn and beans," says Bettina Lamm, a psychologist at the Universitaet Osnabrueck, who led the study. "Most of the children live in mud brick houses without water and electricity. They have to work a lot to take care of younger siblings and help their parents on the farm."

Guess what? These kids rocked the marshmallow test.

"The difference was huge," Lamm says. "The Cameroonian kids really behave very differently, and they were able to wait much better."

Lamm and her colleagues ran the experiment on nearly 200 Cameroonian and German kids. The Cameroonian kids were offered a puff-puff, a little doughnut popular there.

Compared to German children in the experiment, the Cameroonian kids waited, on average, twice as long for the second treat. And way more Cameroonian kids (nearly 70 percent) waited the full 10 minutes to snag the second marshmallow. Only about 30 percent of the German kids could hold out, Lamm and her team reported in the journal Child Development in early June.

Lamm and colleagues watched the kids closely during the experiment and tracked their behavior. The German kids showed way more emotions during the waiting period, she says, especially negative emotions.

They whined more, cried more and squirmed more in their seats.

"They were really fighting the waiting period: playing with their fingers, talking to themselves, trying any distraction they can to manage to wait," she says.

"In contrast, many of the Cameroonian kids simply sat quietly and waited," Lamm says. "Ten percent of them even fell asleep."

It's almost like the kids were meditating.

Lamm says they don't know exactly why the Cameroonian kids were so good at the marshmallow test. Kid behaviors are complicated and sophisticated. But one reason may be the Nso parenting style, which is completely different than Western parenting.

"Nso children are required very early to control their emotions, especially negative emotions," Lamm says. "Moms tell their children that they don't expect them to cry and that they really want them to learn to control their emotions."

This parenting style starts very early, when children are newborns.

"The moms breast-feed their babies before they start to cry so they don't need to express any negative emotion," Lamm says. "This emotion is already regulated before it's expressed."

Western moms spend a lot of time looking at their babies for signals to figure out what their babies need. The Nso moms don't do this.

"They believe they, the moms, know what is good for a baby, and they do what is good for a baby," Lamm says. "They don't need to look for signals from the baby."

As the children get older, this parenting style continues.

"Kids are really expected to learn to control their needs and not ask for their desires or wishes," she says.

These moms expect obedience and respect. And the kids are good at adapting to situations where they don't get what they might want, where they have to wait.

"So they learn self-control," Lamm says. "Maybe it's a different type of self-control than Western kids learn. But it's very effective."

But self-control might not be the whole picture, says Celeste Kidd, a neuroscientist at the University of Rochester.

A few years ago, she and her colleagues found evidence that the marshmallow test doesn't involve just self-control. It also measures how much a child trusts his environment: how much the child trusts that the researcher is really going to return with the second marshmallow.

"What also matters is kids' expectations about whether waiting will be worth it or not," Kidd says.

Because if you think about it, if the child doesn't believe that second marshmallow is actually going to arrive, then it makes sense to eat the first one.

"We have evidence that kids take under consideration the statistical nature of what has happened in the past," Kidd says. "So for example, if a child is living in an environment where there's a lot of uncertainty and instability, then they may think that waiting isn't likely to pay off even though they have the ability to delay gratification."

Nso parents are quite strict and consistent with their discipline. So, Kidd says, perhaps this consistent parenting boosts the kids' trust in adults: that when the person in charge says they're going to bring a second treat, they likely will.

What the Marshmallow Test Really Teaches About Self-Control

One of the most influential modern psychologists, Walter Mischel, addresses misconceptions about his study, and discusses how both adults and kids can master willpower.


The image is iconic: A little kid sits at a table, his face contorted in concentration, staring down a marshmallow. Over the last 50 years, the “Marshmallow Test” has become synonymous with temptation, willpower, and grit. Walter Mischel’s work permeates popular culture. There are “Don’t Eat the Marshmallow!” t-shirts and Sesame Street episodes where Cookie Monster learns delayed gratification so he can join the Cookie Connoisseurs Club. Investment companies have used the Marshmallow Test to encourage retirement planning. And when I mentioned to friends that I was interviewing the Marshmallow Man about his new book, The Marshmallow Test: Mastering Self-Control, nobody missed the reference.

It began in the early 1960s at Stanford University’s Bing Nursery School, where Mischel and his graduate students gave children the choice between one reward (like a marshmallow, pretzel, or mint) they could eat immediately, and a larger reward (two marshmallows) for which they would have to wait alone, for up to 20 minutes. Years later, Mischel and his team followed up with the Bing preschoolers and found that children who had waited for the second marshmallow generally fared better in life. For example, studies showed that a child’s ability to delay eating the first treat predicted higher SAT scores and a lower body mass index (BMI) 30 years after their initial Marshmallow Test. Researchers discovered that parents of “high delayers” even reported that they were more competent than “instant gratifiers”—without ever knowing whether their child had gobbled the first marshmallow.

But there’s been criticism of Mischel’s findings too—that his samples are too small or homogenous to support sweeping scientific conclusions and that the Marshmallow Test actually measures trust in authority, not what he says his grandmother called sitzfleisch, the ability to sit in a seat and reach a goal, despite obstacles. I met with Mischel in his Upper West Side home, where we discussed what the Marshmallow Test really captures, how schools can use his work to help problem students, why men like Tiger Woods and President Bill Clinton may have suffered “willpower fatigue”—and whether I should be concerned that my five-year old devoured “the marshmallow” (in his case, a small chocolate cupcake) in 30 seconds.

Jacoba Urist: I have to tell you right off, my son is in kindergarten and he flunked the Marshmallow Test last night.

Walter Mischel: First, it’s important that I say “the test” in quotes, because it didn’t start out as a “test” but a situation where we were studying the kinds of things that kids did naturally to make self-control easier or harder for them. Four-year-olds can be brilliantly imaginative about distracting themselves, turning their toes into piano keyboards, singing little songs, exploring their nasal orifices.

Urist: The problem is, I think he has no motivation for food. In our house, dessert isn’t a big deal. Could the kids who wait for the marshmallow just not care that much about treats? Maybe their families didn’t use food as a reward system so they didn’t respond to it as a motivator?

Mischel: You have to understand, in the studies we did, the marshmallows are not the ones presented in the media and on YouTube or on the cover of my book. They were these teeny, weeny pathetic miniature marshmallows or the difference between one tiny, little pretzel stick and two little pretzel sticks, less than an inch tall. It’s really not about candy. Many of the kids would bag their little treats to say, “Look what I did and how proud mom is going to be.” The studies are about achievement situations and what influences a child to reach his or her choice. In some cases, we even used two colored poker chips versus one.

Urist: How important is trust then? Some critics claim that a 2012 University of Rochester study calls the Marshmallow Test into question. Children in a reliable environment (where they could trust that the delayed reward would materialize) waited four times longer than children in the unreliable group. Were the kids in your test simply making a rational choice and assessing reliability? And wouldn’t that factor be outside the scope of the original Marshmallow Tests?

Mischel: This is another thing the media regularly misses. Before the marshmallow experiments, I researched trust in decision-making for adults and children. Trust is a tremendous issue. Therefore, in the Marshmallow Tests, the first thing we do is make sure the researcher is someone who is extremely familiar to the child and plays with them in the playroom before the test. It’s also important to realize, it’s not a matter of if somebody will come back with the two little marshmallows. They are all right there on the tray. It’s all out in the open, so there’s no trust issue about whether the marshmallows are real.

Urist: When it comes to correlations between the Marshmallow Test and indicators of success later in life, some people say the marshmallow tests are based on too small a sample to draw meaningful conclusions, that you originally studied over 500 children, but you only tracked down 94 of the participants’ SAT scores?

Mischel: We didn’t want parental reports of SAT scores. We actually wanted to be able to contact the organization that administered the SAT at the time and therefore had to use a subset of the children. But the correlations were sufficiently strong that the smaller sample size isn’t relevant. To me, the real problem was that we were dealing with an incredibly homogenous sample, either children of Stanford faculty or Stanford graduate students, and we still saw strong correlation. But it was an unbelievably elitist subset of the human race, which was one of the concerns that motivated me to study children in the South Bronx (kids in high-stress, poverty conditions), and yet we saw many of the same phenomena as the marshmallow studies were revealing.

Urist: Are some children who delay responding to authority? Could waiting be a sign of wanting to please an adult and not a proxy for innate willpower? Presumably, even little kids can glean what the researchers want from them.

Mischel: Maybe. They might be responding to anything under the sun. But it’s how they respond. The most interesting thing, I think, about the studies is not the correlations that the press picks up, but that the marshmallow studies became the basis for testing all kinds of adults and how adults deal with difficult emotions that are very hard to distance yourself from, like heartbreak or grief.

[Ed. note: Mischel’s book draws on the marshmallow studies to explore how adults can master the same cognitive skills that kids use to distract themselves from the treat, when they encounter challenges in everyday life, from quitting smoking to overcoming a difficult breakup.]

Urist: I have to ask you about President Clinton and Tiger Woods, both mentioned in the book. I’ve heard of “decision fatigue”; are their respective media scandals both examples of adults who suffered from “willpower fatigue”? Men who could exercise enormous self-discipline on the golf course or in the Oval Office but less so personally?

Mischel: No question. People experience willpower fatigue and plain old fatigue and exhaustion. What we do when we get tired is heavily influenced by the self-standards we develop and that in turn is strongly influenced by the models we have. Bill Clinton simply may have a different sense of entitlement: I worked hard all day, now I’m entitled to X, Y, or Z. Confusion about these kinds of behaviors [tremendous willpower in one situation, but not another] is erased when you realize self-control involves cognitive skills. You can have the skills and not use them. If your kid waits for the marshmallow, [then you know] she is able to do it. But if she doesn’t, you don’t know why. She may have decided she doesn’t want to.

Urist: So for adults and kids, self-control or the ability to delay gratification is like a muscle? You can choose to flex it or not?

Mischel: Yes, absolutely. That’s a perfectly reasonable analogy.

Urist: In the book, you advise parents if their child doesn’t pass the Marshmallow Test, ask them why they didn’t wait. What should I be trying to elicit from my son about why he grabbed the first little cupcake? When I asked, he just shrugged and said, “I don’t know.”

Mischel: It sounds like your son is very comfortable with cupcakes and not having any cupcake panics and I wish him a hearty appetite. Whether the information is relevant in a school setting depends on how the child is doing in the classroom. If he or she is doing well, who cares? But if the child is distracted or has problems regulating his own negative emotions, is constantly getting into trouble with others, and spoiling things for classmates, what you can take from my work and my book is to use all the strategies I discuss, namely making “if-then” plans and practicing them. Having a whole set of procedures in place can help a child regulate what he is feeling or doing more carefully.

Urist: One last question. After all these years, why a book now?

Mischel: Well, there are two reasons. First, so much research has exploded on executive function and there have been so many breakthroughs in neuroscience on how the brain works to make it harder or easier to exercise self-control. It’s an enormously exciting time within science for understanding in a much deeper way the relationships between mind, brain, and behavior and to ask the important questions: How can you regulate yourself and control yourself in ways that make your life better? Second, there have been so many misunderstandings about what the Marshmallow Test does and doesn’t do, what the lessons are to take from it, that I thought I might as well write about this rather than have arguments in the newspapers.


Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes

Tyler W. Watts

1 Steinhardt School of Culture, Education, and Human Development, New York University

Greg J. Duncan

2 School of Education, University of California, Irvine

Haonan Quan

Associated data.

Marshmallow Data Set Up, 1._Marshmallow_Data_Set_Up for Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes by Tyler W. Watts, Greg J. Duncan, and Haonan Quan in Psychological Science

Marshmallow Analysis, 2._Marshmallow_Analysis for Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes by Tyler W. Watts, Greg J. Duncan, and Haonan Quan in Psychological Science

Table 1 Set Up, 3._Table_1_Set_Up for Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes by Tyler W. Watts, Greg J. Duncan, and Haonan Quan in Psychological Science

Marshmallow Analysis README, MarshmallowAnalysisREADMEFinal for Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes by Tyler W. Watts, Greg J. Duncan, and Haonan Quan in Psychological Science

Open Practices Disclosure, WattsOpenPracticesDisclosure for Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes by Tyler W. Watts, Greg J. Duncan, and Haonan Quan in Psychological Science

Supplemental Material, WattsSupplementalMaterial for Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes by Tyler W. Watts, Greg J. Duncan, and Haonan Quan in Psychological Science

We replicated and extended Shoda, Mischel, and Peake’s (1990) famous marshmallow study, which showed strong bivariate correlations between a child’s ability to delay gratification just before entering school and both adolescent achievement and socioemotional behaviors. Concentrating on children whose mothers had not completed college, we found that an additional minute waited at age 4 predicted a gain of approximately one tenth of a standard deviation in achievement at age 15. But this bivariate correlation was only half the size of those reported in the original studies and was reduced by two thirds in the presence of controls for family background, early cognitive ability, and the home environment. Most of the variation in adolescent achievement came from being able to wait at least 20 s. Associations between delay time and measures of behavioral outcomes at age 15 were much smaller and rarely statistically significant.

In a series of studies based on children who attended a preschool on the Stanford University campus, Mischel, Shoda, and colleagues showed that under certain conditions, a child’s success in delaying the gratification of eating marshmallows or a similar treat was related to later cognitive and social development, health, and even brain structure (Casey et al., 2011; Mischel et al., 2010; Shoda, Mischel, & Peake, 1990). Although only part of a larger research program investigating how children develop self-control, Mischel and Shoda’s delay-time–later-outcome correlations and the preschooler videos accompanying them have become some of the most memorable findings from developmental research. Gratification delay is now viewed by many to be a fundamental “noncognitive” skill that, if developed early, can provide a lifetime of benefits (see Mischel et al., 2010, for a review).

Since the publication of Mischel and Shoda’s seminal studies (e.g., Mischel, Shoda, & Peake, 1988; Mischel, Shoda, & Rodriguez, 1989; Shoda et al., 1990), other researchers have examined the processes underlying the ability to delay gratification. Some have modified the marshmallow test to illuminate the factors that affect a child’s ability to delay gratification (e.g., Imuta, Hayne, & Scarf, 2014; Kidd, Palmeri, & Aslin, 2013; Michaelson & Munakata, 2016; Rodriguez, Mischel, & Shoda, 1989; Shimoni, Asbe, Eyal, & Berger, 2016); others have investigated the cognitive and socioemotional correlates of gratification delay (e.g., Bembenutty & Karabenick, 2004; Duckworth, Tsukayama, & Kirby, 2013; Romer, Duckworth, Sznitman, & Park, 2010). These studies have added to a growing body of literature on self-control suggesting that gratification delay may constitute a critical early capacity. For example, Moffitt and Caspi demonstrated that self-control (typically understood to be an umbrella construct that includes gratification delay but also impulsivity, conscientiousness, self-regulation, and executive function), averaged across early and middle childhood, predicted outcomes across a host of adult domains (Moffitt et al., 2011). Duckworth and colleagues (2013) showed that the relation between early gratification delay and later outcomes was partially mediated by a composite measure of self-control, which has further fueled interventions designed to promote skills that fall under the “self-control” umbrella (e.g., Diamond & Lee, 2011). However, despite the proliferation of work on gratification delay, and the related construct of self-control, Mischel and Shoda’s longitudinal studies still stand as the foundational examinations of the long-run correlates of the ability to delay gratification in early childhood.

Revisiting these studies reveals several limiting factors that warrant further investigation. First, Mischel and Shoda’s reported longitudinal associations were based on very small and highly selective samples of children from the Stanford University community (ns = 35–89; Mischel et al., 1988; Mischel et al., 1989; Shoda et al., 1990). Although Mischel’s original work included over 600 preschool-age children (Shoda et al., 1990), follow-up investigations focused on much smaller samples (e.g., for their investigation of SAT and behavioral outcomes, Shoda and colleagues were able to contact only 185 of the original 653 children). Moreover, these children originally underwent variations of the gratification-delay assessment; Mischel experimented with trials in which the treat was obscured from a child’s vision, and some of the children were supplied with coping strategies to help them delay longer. They found positive associations between gratification delay and later outcomes only for children participating in trials in which no strategy was coached and the treat was clearly visible, a circumstance they called the “diagnostic condition.”

For the 35 to 48 children who were tested in the diagnostic condition, and for whom adolescent follow-up data were available, Shoda and colleagues (1990) observed large correlations between delay time and SAT scores, r(35) = .57 for math, r(35) = .42 for verbal, and between delay time and parent-reported behaviors, for example, “[my child] is attentive and able to concentrate,” r(48) = .39. These bivariate correlations were not adjusted for potential confounding factors that could affect both early delay ability and later outcomes. Because these findings have been cited as motivation both for interventions designed to boost gratification delay specifically (e.g., Kumst & Scarf, 2015; Murray, Theakston, & Wells, 2016; Rybanska, McKay, Jong, & Whitehouse, 2017) and for interventions seeking to promote self-control more generally (e.g., Diamond & Lee, 2011; Flook, Goldberg, Pinger, & Davidson, 2015; Rueda, Checa, & Cómbita, 2012), it is important to consider possible confounding factors that might lead bivariate correlations to be a poor projection of likely intervention effects.

In the current study, we pursued a conceptual replication of Mischel and Shoda’s original longitudinal work. Specifically, we examined associations between performance on a modified version of the marshmallow test and later outcomes in a larger and more diverse sample of children, and we employed empirical methods that adjusted for confounding factors inherent in Mischel and Shoda’s bivariate correlations. Several considerations motivated our effort. First, replication is a staple of sound science (Campbell, 1986; Duncan, Engel, Claessens, & Dowsett, 2014). Second, Mischel and Shoda’s highly selective sample of children limits the generalizability of their results. Finally, if researchers are to extend Mischel and Shoda’s work to develop interventions, a more sophisticated examination of the long-run correlates of early gratification delay is needed. Interventions that successfully boost early delay ability might have no effect on later life outcomes if associations between gratification delay and later outcomes are driven by factors unlikely to be altered by child-focused programs (e.g., socioeconomic status [SES], home parenting environment).

Current Study

We used data from the National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development (SECCYD) to explore associations between preschoolers’ ability to delay gratification and academic and behavioral outcomes at age 15. We focused most of our analysis on a sample of children born to mothers who had not completed college, for two reasons. First, it allowed us to investigate whether Mischel and Shoda’s longitudinal findings extend to populations of greater interest to researchers and policymakers concerned with developing interventions (e.g., Mischel, 2014). Second, empirical concerns over the extent of truncation in our key gratification-delay measure in the college-educated sample limited our ability to reliably assess the correlation between gratification delay and later abilities. Because of these differences, we consider our study to be a conceptual, rather than traditional, replication of Mischel and Shoda’s seminal work (Robins, 1978).

More complete information regarding the study data and measures can be found in the Supplemental Material available online. Here, we provide a brief overview of key study components.

Data for the current study were drawn from the NICHD SECCYD, a widely used data set in developmental psychology (NICHD Early Child Care Research Network, 2002). Participants were recruited at birth from 10 U.S. sites across the country, providing a geographically diverse, although not nationally representative, sample of children and mothers. Participants have been followed across childhood and adolescence, with the last full round of data collection occurring when children were 15 years old.

The current study relied on data collected when children were 54 months of age, and our outcome variables were measured during the assessments at Grade 1 and age 15. Our analysis sample was limited to children who had a valid measure of delay of gratification at age 54 months, as well as nonmissing achievement and behavioral data at age 15 (n = 918). For conceptual and analytic reasons (detailed below), we then split our sample on the basis of mother’s education, and we focused much of our analyses on children whose mothers did not report having completed college when the child was 1 month old (n = 552, a sample that is 10 times larger than the sample size in the Shoda et al., 1990, study).

In Table 1, we present selected demographic characteristics for children included in our analytic sample, split by whether the child’s mother did or did not receive a bachelor’s degree. For purposes of comparison, we also present the same set of characteristics for a nationally representative sample of kindergarteners collected 2 to 3 years after our sample’s 54-month wave of data collection. These nationally representative data were drawn from the publicly available Early Childhood Longitudinal Survey, Kindergarten Cohort, 1998–1999 (https://nces.ed.gov/ecls/dataproducts.asp; more information regarding this data set can be found in the Supplemental Material).

Table 1. Demographic Comparisons Between the Analytic Samples and a Nationally Representative Sample of Kindergarten Children (ECLS-K, 1998)

| Variable | NICHD SECCYD: children of nondegreed mothers | NICHD SECCYD: children of degreed mothers | ECLS-K, 1998: nationally representative sample |
|---|---|---|---|
| Proportion male | .49 | .46 | .51 |
| Proportion Black | .16 | .02 | .16 |
| Proportion Hispanic | .07 | .03 | .19 |
| Proportion White | .73 | .91 | .57 |
| Mean age of mother (in years) at child’s birth | 26.84 (5.61) | 31.67 (4.01) | 27.28 (6.61) |
| Mother’s education (proportions) | | | |
| Did not complete high school | .14 | .00 | .14 |
| Graduated from high school | .32 | .00 | .29 |
| Some college | .54 | .00 | .33 |
| Bachelor’s degree or higher | .00 | 1.00 | .23 |
| Income-to-needs ratio | | | |
| ≤ 1 | 0.18 | 0 | 0.17 |
| > 1 to ≤ 2 | 0.27 | 0.05 | 0.26 |
| > 2 to ≤ 3 | 0.25 | 0.19 | 0.16 |
| > 3 to ≤ 4 | 0.15 | 0.21 | 0.16 |
| > 4 | 0.15 | 0.55 | 0.24 |
| Proportion of mothers unemployed | .29 | .23 | .32 |
| Mean number of children in home | 2.32 (1.03) | 2.16 (0.83) | 2.49 (1.16) |
| Proportion of mothers married | .67 | .93 | .70 |
| Number of observations | 552 | 366 | 21,242 |

Note: Standard deviations are given in parentheses. The Early Childhood Longitudinal Survey, Kindergarten (ECLS-K) estimates were derived from data made publicly available by the National Center for Education Statistics (https://nces.ed.gov/ecls/dataproducts.asp). All ECLS-K measures shown were collected during the fall of kindergarten (i.e., 1998), and National Institute of Child Health and Human Development (NICHD) Study of Early Child Care and Youth Development (SECCYD) measures were collected during the 54-month interview (i.e., preschool; 1995–1996), except for mother’s education and mother’s age at child’s birth, which were both collected at the 1-month interview. The ECLS-K variables were weighted using the C1CW0 weight to generate nationally representative estimates.

The children of college-completing mothers were largely White (91%), with 55% of them reporting family income that was at least 4 times above the poverty line (i.e., income-to-needs ratio over 4.0) and none of them reporting income at or below the poverty line (i.e., income-to-needs ratio at or below 1.0). The subsample of children with mothers without a college degree was more comparable with the nationally representative sample. In both samples, about 16% of children were Black, mother’s age at birth was approximately 27 years, 14% of mothers did not complete high school, and between 17% and 18% of families were living at or below the poverty line. However, Hispanic children were still underrepresented in this sample, underscoring the fact that although diverse, our data were not nationally representative.

Delay of gratification

A variant of Mischel’s (1974) self-imposed waiting task (i.e., the “marshmallow test”) was administered to children when they were 54 months old. An interviewer would present children with an appealing edible treat based on the child’s own stated preferences (e.g., marshmallows, M&M’s, animal crackers). Children were then told that they would engage in a game in which the interviewer would leave the child alone in a room with the treat. If the child waited for 7 min, the interviewer would return, and the child could eat the treat and receive an additional portion as a reward for waiting. Children who chose not to wait could ring a bell to signal the experimenter to return early, and they would then receive only the amount of candy originally presented. The measure of delay of gratification was then recorded as the number of seconds the child waited, with 7 min being the ceiling.

The measure of gratification delay used here differed from the one employed by Mischel (1974) in several noteworthy ways. First, the 7-min cap was much shorter than Mischel’s maximum assessment length; the children in Mischel’s sample were asked to wait between 15 and 20 min, depending on the study, before the assessment ended. In our sample, approximately 55% of children hit the 7-min ceiling on the measure, presenting a potential analytic challenge to our models. However, we found that the ceiling was much more problematic for higher- than lower-SES children. Children whose mothers obtained college degrees hit the ceiling at a rate of 68%, compared with 45% for children whose mothers did not complete college ( p < .001; see Table 2 ).
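The ceiling rates quoted above can be cross-checked against the group sizes given in the column headers of Table 3 (251 of 552 children of nondegreed mothers and 250 of 366 children of degreed mothers waited the full 7 min). The following is a minimal Python sketch of that arithmetic; the variable and function names are illustrative and do not come from the study.

```python
# Counts taken from the column headers of Table 3.
waited_full = {"nondegreed": 251, "degreed": 250}
did_not_wait = {"nondegreed": 301, "degreed": 116}

def ceiling_rate(group: str) -> float:
    """Share of a group that reached the 7-min ceiling."""
    total = waited_full[group] + did_not_wait[group]
    return waited_full[group] / total

# Pooled rate across both maternal-education groups.
overall = sum(waited_full.values()) / (
    sum(waited_full.values()) + sum(did_not_wait.values())
)

for group in waited_full:
    print(f"{group}: {ceiling_rate(group):.0%}")
print(f"overall: {overall:.0%}")
```

Rounded to whole percentages, these counts give the 45%, 68%, and roughly 55% figures cited in the text.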

Table 2. Descriptive Characteristics of Key Analysis Variables

| Variable | Children of nondegreed mothers (n = 552) | Children of degreed mothers (n = 366) | β | p value for difference |
|---|---|---|---|---|
| Delay of gratification (minutes waited) | 3.99 (3.08) | 5.38 (2.62) | 0.45 | .001 |
| Delay of gratification (categories) | | | | |
| 7 min | .45 | .68 | 0.21 | .001 |
| 2–7 min | .16 | .12 | −0.02 | .324 |
| 0.333–2 min | .16 | .10 | −0.06 | .012 |
| < 0.333 min | .23 | .10 | −0.13 | .001 |
| Outcome measures: Grade 1 | | | | |
| Achievement composite | 108.42 (13.71) | 117.29 (13.47) | 0.63 | .001 |
| Behavior composite | 49.15 (8.43) | 47.40 (7.87) | −0.18 | .008 |
| Outcome measures: age 15 | | | | |
| Achievement composite | 101.23 (11.63) | 112.72 (13.19) | 0.82 | .001 |
| Behavior composite | 47.12 (9.37) | 44.50 (8.66) | −0.27 | .001 |

Note: In the columns for children with degreed and nondegreed mothers, the table reports the proportion of students falling within each delay-of-gratification category; all other values in these columns are means (with standard deviations in parentheses). The sample was split on the basis of mother’s education, and p values were derived from a series of regressions in which each characteristic was regressed on a dummy for whether mother graduated from college and a series of site fixed effects. Beta values represent effect sizes measuring the standardized differences between the two groups.

We adopted several approaches to dealing with this truncation problem, principally exploring possible nonlinearities in the associations between time waited and outcome measures by dividing the distribution of waiting times into discrete intervals. We also focused much of our analyses on the children of mothers who did not complete college, as far fewer of the children in this sample hit the ceiling on the minutes-waited measure, and as explained above, this group of children complements the sample of children included in the Mischel and Shoda studies. But because the subsample of children with college-educated mothers allows for a more direct replication of Mischel and Shoda’s famous work (e.g., Shoda et al., 1990 ), we also present results for them, bearing in mind the limitations imposed by the substantial delay truncation.

Finally, it should also be noted that children in the NICHD study were given only the version of the task that Shoda and colleagues (1990) called the diagnostic condition (i.e., the children were not offered strategies and were able to see the treat as they waited).

Academic achievement

Academic achievement was measured using the Woodcock-Johnson Psycho-Educational Battery Revised (WJ-R) test ( Woodcock, McGrew, & Mather, 2001 ), a commonly used measure of cognitive ability and achievement (e.g., Watts, Duncan, Siegler, & Davis-Kean, 2014 ). For math achievement at Grade 1 and age 15, we used the Applied Problems subtest, which measured children’s mathematical problem solving. At Grade 1, reading achievement was measured using the Letter-Word Identification task, a measure of word recognition and vocabulary, and at age 15, reading ability was measured using the Passage Comprehension test. The Passage Comprehension test asked students to read various pieces of text silently and then answer questions about their content.

For all the WJ-R tests, we used the standard scores, which were normed to have a mean of 100 and a standard deviation of 15 in each respective wave. We took the average of the Grade 1 math and reading measures and the age-15 math and reading measures, respectively, to create composite measures of academic achievement.
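As a small illustration of the composite described above, the sketch below averages one child's math and reading standard scores within a wave. The function name and the example scores are hypothetical and are shown only to make the arithmetic concrete.

```python
from statistics import mean

def achievement_composite(math_score: float, reading_score: float) -> float:
    """Average of the math and reading WJ-R standard scores for one wave."""
    return mean([math_score, reading_score])

# Hypothetical child: these scores are made up for illustration.
grade1 = achievement_composite(math_score=110.0, reading_score=104.0)
print(grade1)  # 107.0
```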

Behavioral problems

Following Shoda et al. (1990) , we relied primarily on mothers’ reports of child behavior. Mother-reported internalizing and externalizing behavioral problems were assessed using the Child Behavior Checklist (CBCL; Achenbach, 1991 ) at age 54 months, Grade 1, and age 15. The CBCL is a widely used measure of behavioral problems, and it includes approximately 100 items rated on 3-point scales that capture aspects of internalizing (i.e., depressive) and externalizing (i.e., antisocial) behavior. As with academic achievement, at Grade 1 and age 15, we averaged together the externalizing and internalizing measures to create a behavioral composite score that, before standardization, ranged from 32 to 83, with higher scores indicating higher levels of behavioral problems. We also tested models that used a host of alternative behavioral measures taken from youth reports and direct assessments at age 15; these measures and models are described in the Supplemental Material .

Additional covariates

All covariates included in our models are listed in Table 3, and we grouped the covariates into two distinct sets of control variables: (a) child background and Home Observation for Measurement of the Environment (HOME) controls and (b) concurrent 54-month controls.

Table 3. Descriptive Characteristics of All Control Variables

| Variable | Nondegreed mothers: waited 7 min (n = 251) | Nondegreed mothers: did not wait 7 min (n = 301) | β | p value for difference | Degreed mothers: waited 7 min (n = 250) | Degreed mothers: did not wait 7 min (n = 116) | β | p value for difference |
|---|---|---|---|---|---|---|---|---|
| Child background and HOME controls | | | | | | | | |
| Child background | | | | | | | | |
| Proportion male | .47 | .51 | −0.04 | .338 | .45 | .50 | −0.05 | .409 |
| Proportion White | .82 | .64 | 0.18 | .001 | .94 | .85 | 0.10 | .007 |
| Proportion Black | .07 | .24 | −0.15 | .001 | .00 | .05 | −0.05 | .024 |
| Proportion Hispanic | .06 | .07 | −0.01 | .545 | .03 | .03 | −0.00 | .962 |
| Proportion other race/ethnicity | .04 | .05 | −0.01 | .530 | .03 | .07 | −0.05 | .058 |
| Child’s age at delay measure (months) | 56.11 (1.11) | 56.01 (1.14) | 0.13 | .105 | 55.99 (1.13) | 55.99 (1.15) | 0.07 | .519 |
| Birth weight (g) | 3490.23 (478.56) | 3449.02 (540.26) | 0.09 | .320 | 3516.63 (520.52) | 3572.53 (527.17) | −0.13 | .268 |
| BBCS standard score (36 months) | 9.06 (2.56) | 7.67 (2.86) | 0.47 | .001 | 10.67 (2.20) | 10.14 (2.35) | 0.19 | .043 |
| Bayley MDI (24 months) | 93.89 (12.40) | 85.91 (14.40) | 0.53 | .001 | 100.88 (11.78) | 95.21 (14.10) | 0.41 | .001 |
| Child temperament (6 months) | 3.18 (0.42) | 3.25 (0.38) | −0.17 | .053 | 3.13 (0.37) | 3.09 (0.43) | 0.07 | .531 |
| Log of family income (1–54 months) | 0.89 (0.61) | 0.57 (0.73) | 0.38 | .001 | 1.54 (0.51) | 1.42 (0.56) | 0.14 | .057 |
| Mother’s age at birth (years) | 27.75 (5.66) | 26.07 (5.46) | 0.29 | .001 | 31.58 (4.05) | 31.87 (3.91) | −0.06 | .438 |
| Mother’s education (years) | 13.00 (1.41) | 12.68 (1.50) | 0.12 | .017 | 17.02 (1.31) | 16.82 (1.26) | 0.07 | .234 |
| Mother’s PPVT score | 96.43 (13.38) | 90.47 (17.03) | 0.30 | .001 | 114.10 (15.62) | 105.63 (16.51) | 0.44 | .001 |
| HOME score (36 months) | | | | | | | | |
| Learning Materials | 7.20 (2.36) | 5.86 (2.51) | 0.53 | .001 | 8.64 (1.59) | 8.41 (2.20) | 0.12 | .168 |
| Language Stimulation | 6.13 (1.04) | 5.67 (1.24) | 0.46 | .001 | 6.38 (0.84) | 6.17 (1.13) | 0.21 | .046 |
| Physical Environment | 6.16 (1.04) | 5.64 (1.54) | 0.40 | .001 | 6.35 (0.83) | 6.33 (0.91) | 0.07 | .372 |
| Responsivity | 5.67 (1.28) | 5.17 (1.52) | 0.31 | .001 | 6.09 (0.99) | 5.81 (1.30) | 0.21 | .033 |
| Academic Stimulation | 3.43 (1.21) | 2.97 (1.29) | 0.38 | .001 | 3.74 (0.97) | 3.57 (1.29) | 0.17 | .112 |
| Modeling | 3.13 (1.10) | 2.82 (1.14) | 0.29 | .001 | 3.64 (0.93) | 3.51 (1.04) | 0.11 | .285 |
| Variety | 6.80 (1.34) | 6.14 (1.50) | 0.45 | .001 | 7.54 (1.17) | 7.29 (1.36) | 0.17 | .088 |
| Acceptance | 3.39 (0.85) | 3.22 (1.04) | 0.18 | .038 | 3.70 (0.59) | 3.57 (0.82) | 0.13 | .162 |
| Responsivity-Empirical Scale | 5.54 (0.91) | 5.14 (1.29) | 0.37 | .001 | 5.77 (0.52) | 5.55 (0.91) | 0.21 | .026 |
| Concurrent 54-month controls | | | | | | | | |
| 54-month WJ-R score | | | | | | | | |
| Letter-Word Identification | 99.03 (11.98) | 93.22 (12.63) | 0.42 | .001 | 105.93 (12.19) | 102.31 (11.94) | 0.26 | .011 |
| Applied Problems | 104.80 (12.88) | 95.67 (15.72) | 0.57 | .001 | 112.36 (12.13) | 106.06 (12.31) | 0.40 | .001 |
| Picture Vocabulary | 100.54 (13.07) | 93.74 (13.80) | 0.43 | .001 | 109.11 (13.45) | 103.47 (13.58) | 0.36 | .001 |
| Memory for Sentences | 93.21 (15.59) | 85.43 (17.67) | 0.43 | .001 | 100.99 (18.73) | 92.34 (17.45) | 0.49 | .001 |
| Incomplete Words | 98.08 (12.91) | 92.72 (13.52) | 0.41 | .001 | 102.18 (11.69) | 98.05 (11.98) | 0.35 | .001 |
| 54-month Child Behavior Checklist | | | | | | | | |
| Internalizing | 47.36 (9.11) | 47.94 (8.51) | −0.06 | .477 | 46.55 (8.84) | 46.81 (8.17) | −0.01 | .988 |
| Externalizing | 51.14 (9.34) | 53.09 (9.84) | −0.21 | .020 | 50.44 (9.11) | 50.99 (8.53) | −0.06 | .604 |

Note: In the columns for children who did and did not wait 7 min, the table reports proportions for race/ethnicity; all other values in these columns are means (with standard deviations in parentheses). The p value column compares children who successfully completed the task and waited 7 min with children who did not, and the betas represent effect sizes measuring the standardized differences between the two groups. A series of regressions in which each variable was regressed on a dummy indicating whether the child completed the marshmallow test was used to generate p values, and a series of site dummy variables was also included to adjust for site differences (p values below .001 have been rounded to .001). BBCS = Bracken Basic Concept Scale; HOME = Home Observation for Measurement of the Environment; MDI = Mental Development Index; PPVT = Peabody Picture Vocabulary Test; WJ-R = Woodcock-Johnson Psycho-Educational Battery Revised.

Child background and HOME controls

Child demographic characteristics (i.e., gender and race), birth weight, mother’s age at the child’s birth, and mother’s level of education were collected from study mothers at the 1-month interview. Family income was collected from study mothers at the 1-, 6-, 15-, 24-, 36-, and 54-month interviews. We took the average of all nonmissing income data over this span and then log-transformed average family income to restrict the influence of outliers. Mother’s Peabody Picture Vocabulary Test (PPVT) score was assessed in a lab visit when the focal child was 36 months old. The PPVT is a commonly used measure of intelligence (e.g., see meta-analysis by Protzko, 2015).
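The income variable described above (average of all nonmissing waves, then a log transform) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the text does not specify the base of the logarithm or the units of income, so the natural log and the example values are assumptions, and the function name is hypothetical.

```python
import math
from typing import Optional, Sequence

def log_average_income(waves: Sequence[Optional[float]]) -> Optional[float]:
    """Natural log of the mean of the nonmissing income values.

    `None` marks a missing interview wave. Returns None when every wave
    is missing, since there is nothing to average. Values are assumed
    to be positive, because the log of zero or less is undefined.
    """
    observed = [value for value in waves if value is not None]
    if not observed:
        return None
    return math.log(sum(observed) / len(observed))

# Hypothetical values for the 1-, 6-, 15-, 24-, 36-, and 54-month waves.
example = [2.0, None, 2.5, 3.0, None, 2.5]
print(round(log_average_income(example), 3))
```

Averaging before taking the log means that a child with some missing waves still contributes a value, and the log compresses very large incomes so that a few extreme families carry less weight.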

We also included early indicators of child cognitive functioning, as measured at age 24 months by the Bayley Mental Development Index (MDI; Bayley, 1991 ) and at age 36 months by the Bracken Basic Concept Scale (BBCS; Bracken, 1984 ). The MDI measured children’s sensory-perceptual abilities, as well as their memory, problem solving, and verbal communication skills. The BBCS was an early measure of school readiness skills, and it required students to identify basic letters and numbers.

Child temperament was measured at age 6 months using the Early Infant Temperament Questionnaire (Medoff-Cooper, Carey, & McDevitt, 1993), a 38-item survey to which mothers responded. This questionnaire asked mothers to rate their child on a 6-point Likert scale, with items focused on the child’s mood, adaptability, and intensity. We took the average score across these items as our measurement of temperament, with higher scores indicating more agreeable dispositions.

Finally, the set of controls measured prior to age 54 months also included indicators of the quality of the home environment, as measured by an observational assessment called the HOME inventory ( Caldwell & Bradley, 1984 ). The HOME was assessed when the focal child was approximately 36 months old, and it was designed to capture aspects of the home environment known to support positive cognitive, emotional, and behavioral functioning. We used nine subscales of the HOME in our models: The first eight subscales are commonly used with the HOME measure (Learning Materials, Language Stimulation, Physical Environment, Responsivity, Academic Stimulation, Modeling, Variety, and Acceptance), and the ninth subscale, called the Responsivity-Empirical Scale, was derived by the NICHD SECCYD study from factor analyses of the HOME items. This final scale was distinct from the traditional Responsivity scale, as it included items from the Language Stimulation scale that also measured mother responsivity and sensitivity to the child.

Concurrent 54-month controls

For models that included controls for concurrent cognitive and behavioral skills, we also included subscales taken from the age 54-month WJ-R test. As our measure of early reading, we included the Letter-Word Identification task, which tested children’s ability to sound out simple words, and the Applied Problems test at age 54 months was our measure of early math skills. For preschool children, the Applied Problems test requires them to count and solve simple addition problems. We also used the Memory for Sentences and Incomplete Words subtests as measures of cognitive ability. The Incomplete Words test measured auditory closure and processing, and children listened to an audio recording where words missing a phoneme were listed. They were then asked to name the complete word. Finally, the Picture Vocabulary test was a measure of verbal comprehension and crystallized intelligence. In this task, children were asked to name pictured objects. All of these tasks have been widely used as measures of children’s early cognitive skills and their measurement properties have been widely reported (e.g., Watts et al., 2014 ).

Finally, we also included the mother’s report of children’s externalizing and internalizing problems from the Child Behavior Checklist at age 54 months. Much like the measure used for age-15 behavioral problems, the 54-month survey included a battery of items designed to assess children’s antisocial and disruptive behavior (i.e., externalizing) and depressive symptoms (i.e., internalizing).

Our primary goal was to estimate the association between early gratification delay and long-run measures of academic achievement and behavioral functioning. Like the work of Shoda and colleagues (1990) , our study did not include a measure of gratification delay in which between-child differences were generated from some exogenous intervention, so we do not claim that the associations we estimated reflect causal impacts. Instead, our goal was to assess how much bias might be contained in longitudinal bivariate correlations between gratification delay and later outcomes as a result of failure to control for characteristics of children and their environments. Regression-adjusted correlations should provide better guidance regarding whether interventions boosting gratification delay might also improve later achievement and behavior.

To accomplish our analytic goals, we modeled later academic achievement and behavior (measured at both Grade 1 and age 15) as a function of a measure of gratification delay at age 54 months. We then tested models that added controls for background characteristics and measures of the home environment before moving to models that also included measures of cognitive and behavioral skills assessed at age 54 months (see Table 3 ).

These two approaches reflect different assumptions regarding how variation in gratification-delay ability might arise. Models with controls measured between birth and age 36 months still allow for variation in age 54-months gratification delay caused by the differential development of general cognitive or behavioral skills (e.g., executive function, self-control) between 36 and 54 months. Put another way, these models contain controls only for factors that even ambitious preschool-child-focused interventions are unlikely to alter (e.g., birth weight, temperament at 6 months of age, early home environment).

In contrast, the models with concurrent-54-months covariates controlled for variation in a range of cognitive capacities and behavioral problems developed by age 54 months. They helped to isolate the possible effects of an intervention that targets only the narrow set of skills involved with gratification delay (e.g., a program that merely provided children with strategies to help them delay longer; see Mischel, 2014 , p. 40) but not concurrent general cognitive ability or socioemotional behaviors.

Although it is impossible to know exactly how individual differences in gratification delay emerge (e.g., changes in parenting, development of cognitive skills), by controlling for factors unlikely to be altered by interventions (e.g., ethnicity, parental background), we can purge our estimates of bias due to observable characteristics that are correlated with gratification delay and later outcomes. If remaining unobserved factors also contribute to gratification delay and later outcomes (e.g., changes in parenting), and if these unobserved factors are unlikely to be altered by a particular intervention, then bias in our estimates may still remain. Yet our estimates should serve as an improvement over the unadjusted correlations reported previously (e.g., Shoda et al., 1990 ).

In all models shown, continuous variables were standardized so that coefficients could be read as effect sizes, and all models with control variables included a set of dummy variables for each site to adjust for any between-site differences. In order to account for missing data on control variables, we used structural equation modeling with full information maximum likelihood in Stata Version 15.0 (StataCorp, 2017) to estimate all analytic models. Finally, we report all estimated p values to three decimal places (with p values below .001 displayed as < .001), and we describe any estimate corresponding to a p value less than .05 as statistically significant. Though we recognize the arbitrariness of focusing only on results with a p value less than .05, we selected this alpha level because it was the minimum threshold for statistical significance used in the studies we attempted to replicate and extend (i.e., Mischel et al., 1988; Mischel et al., 1989; Shoda et al., 1990). Consequently, any differences in conclusions reached between our studies and those reported in the previous literature should be attributed to design and sample differences rather than alpha-level choices.
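To illustrate why standardizing lets a coefficient be read as an effect size, the sketch below fits a plain bivariate least-squares slope on raw and on standardized variables. It is not the structural equation model with full information maximum likelihood used in the study; the data are made up, and the helper names are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Made-up data for illustration only (minutes waited, achievement score).
minutes = [0.2, 1.0, 3.5, 7.0, 7.0, 0.1, 5.0, 2.0]
scores = [95.0, 102.0, 108.0, 115.0, 109.0, 99.0, 104.0, 101.0]

raw_slope = ols_slope(minutes, scores)  # score points per minute waited
std_slope = ols_slope(standardize(minutes), standardize(scores))  # SD per SD
print(round(raw_slope, 3), round(std_slope, 3))
```

The standardized slope equals the raw slope multiplied by the ratio of the predictor's standard deviation to the outcome's, so it expresses the association in standard deviation units regardless of the original scales.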

Descriptive findings

Table 2 provides descriptive results for key analysis variables, including the 54-months delay-of-gratification measure, split by mother’s education level. In the sample of children with nondegreed mothers, children waited an average of 3.99 min ( SD = 3.08) before ending the task. We also present the proportion of children falling within certain ranges on the measure, with the 7-min category representing children who successfully completed the trial. In the lower-SES sample, 45% of children waited the maximum of 7 min, and 23% waited less than 20 s (i.e., 0.33 min). In the higher-SES sample, only 10% of children waited less than 20 s, and the average time waited was 5.38 min (statistically significantly longer than the lower-SES group, p < .001).

Because the 7-min ceiling presented a potential analytic challenge for both samples, we estimated models that substituted the four dummy categories shown in Table 2 for the continuous minutes-waited variable as a way to assess nonlinearities in the relationship between delay time and academic and socioemotional outcomes. Importantly, these models also provide information on how much our analysis might be compromised by the 7-min truncation.
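The four-category recoding can be sketched as below, using the cut points shown in Table 2 (20 s, 2 min, and the 7-min ceiling). How exact boundary values were assigned is not stated in the text, so the inclusive lower bounds here are an assumption, and the function names are hypothetical.

```python
def delay_category(seconds_waited: float) -> str:
    """Map seconds waited (capped at 420 s) to one of four categories."""
    minutes = min(seconds_waited, 420.0) / 60.0
    if minutes >= 7.0:
        return "7 min"
    if minutes >= 2.0:
        return "2-7 min"
    if minutes >= 20.0 / 60.0:  # assumption: exactly 20 s falls in this band
        return "0.333-2 min"
    return "< 0.333 min"

def delay_dummies(seconds_waited: float) -> dict:
    """Three 0/1 indicators; '< 0.333 min' is the omitted reference group."""
    category = delay_category(seconds_waited)
    return {name: int(category == name)
            for name in ("0.333-2 min", "2-7 min", "7 min")}

print(delay_category(95), delay_dummies(95))
```

Entering the three indicators in place of the continuous minutes-waited variable lets each band have its own coefficient relative to the shortest-delay group, which is what allows a nonlinear or threshold pattern to show up.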

Table 3 presents descriptive information for the various control measures used in the analysis, and means are presented separately for children who did and did not complete the delay task. In both the higher- and lower-SES samples, performance on the delay-of-gratification task was highly correlated with differences on most observable characteristics considered. For example, among children of nondegreed mothers, those who completed the delay-of-gratification task were from higher income families (p < .001) than noncompleters, had mothers with higher PPVT scores (p < .001), and had higher scores on dimensions of the HOME observational assessment (p values from .04 to < .001). Null or smaller differences were generally observed for the children of degreed mothers, perhaps owing to the lack of heterogeneity in this subsample.

Regression results

Results for children of nondegreed mothers.

Table 4 presents coefficients and standard errors from models that estimated the association between delay of gratification at 54 months and our Grade 1 and age-15 achievement and behavioral composites for the sample of children of nondegreed mothers. The first row of Table 4 displays the results for a standardized continuous measure of gratification delay (i.e., the number of minutes waited during the marshmallow test). As Column 1 reflects, the bivariate association between minutes waited and Grade 1 academic achievement was 0.28 (SE = 0.04, p < .001), considerably less than the .57 correlation Shoda and colleagues found for SAT math scores and the .42 correlation they found for verbal scores. These linear results suggest that children’s Grade 1 achievement would improve by approximately one tenth of a standard deviation for every additional minute waited at age 4. When the controls measured prior to age 54 months were added to the model (Column 2 of Table 4), the standardized association fell to 0.10 (SE = 0.03, p = .002), and when concurrent 54-months controls were added (Column 3 of Table 4), the association fell to a statistically nonsignificant 0.05 (SE = 0.03, p = .114).
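The "one tenth of a standard deviation per minute" reading follows from rescaling the standardized coefficient by the spread of the delay measure. The sketch below shows that conversion, assuming the delay variable was standardized using the nondegreed sample's standard deviation from Table 2 (3.08 min); that assumption and the variable names are not from the original text.

```python
beta_standardized = 0.279  # SD of achievement per SD of delay (Table 4, Column 1)
sd_delay_minutes = 3.08    # SD of minutes waited, nondegreed sample (Table 2)

# Dividing by the SD of the predictor converts "per SD of delay"
# into "per minute of delay".
beta_per_minute = beta_standardized / sd_delay_minutes
print(round(beta_per_minute, 3))  # 0.091
```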

Table 4. Associations Between Delay of Gratification at Age 54 Months and Later Measures of Academic Achievement and Behavior for Children of Mothers Without College Degrees

Columns 1–3: achievement composite, Grade 1. Columns 4–6: achievement composite, age 15. Columns 7–9: behavior composite, Grade 1. Columns 10–12: behavior composite, age 15.

| Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Delay minutes (continuous) | 0.279 (0.038) | 0.102 (0.033) | 0.047 (0.030) | 0.236 (0.037) | 0.081 (0.034) | 0.050 (0.032) | −0.060 (0.043) | −0.015 (0.044) | 0.023 (0.044) | −0.062 (0.046) | −0.026 (0.047) | 0.003 (0.042) |
| Delay minutes (categorical) | | | | | | | | | | | | |
| < 0.333 min | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref |
| 0.333–2 min | 0.298 (0.126) | 0.189 (0.105) | 0.127 (0.093) | 0.353 (0.122) | 0.230 (0.103) | 0.178 (0.098) | 0.055 (0.144) | 0.090 (0.138) | 0.079 (0.105) | −0.140 (0.152) | −0.071 (0.148) | −0.106 (0.132) |
| 2–7 min | 0.424 (0.126) | 0.206 (0.104) | 0.041 (0.093) | 0.457 (0.123) | 0.300 (0.103) | 0.235 (0.099) | −0.088 (0.144) | −0.020 (0.137) | 0.039 (0.106) | −0.182 (0.151) | −0.109 (0.145) | −0.053 (0.131) |
| 7 min | 0.720 (0.098) | 0.284 (0.086) | 0.141 (0.078) | 0.646 (0.098) | 0.234 (0.088) | 0.150 (0.084) | −0.121 (0.112) | −0.007 (0.114) | 0.072 (0.087) | −0.193 (0.120) | −0.095 (0.123) | −0.048 (0.111) |
| p value for test of equality of all categories | .001 | .012 | .247 | .001 | .015 | .093 | .477 | .866 | .837 | .428 | .861 | .885 |
| p value for test of equality of second, third, and fourth categories | .001 | .563 | .475 | .015 | .752 | .630 | .382 | .700 | .923 | .927 | .969 | .882 |
| Control variables included | | | | | | | | | | | | |
| Child background and HOME | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
| Concurrent 54 month | No | No | Yes | No | No | Yes | No | No | Yes | No | No | Yes |

Note: n = 552. For the continuous and categorical measures of delay minutes, the table gives standardized coefficients (with standard errors in parentheses). For the categorical measure, < 0.333 min was the reference category. Because outcome variables were standardized, coefficients can be interpreted as effect sizes. Estimates shown in the first column of each set (i.e., Columns 1, 4, 7, and 10) contained only the measure of delay of gratification and a given outcome measure. Estimates shown in the second column of each set (i.e., Columns 2, 5, 8, and 11) added child background characteristics, Home Observation for Measurement of the Environment (HOME) scores, and site dummy variables. Estimates shown in the third column of each set (i.e., Columns 3, 6, 9, and 12) added other behavioral and cognitive measures also measured at age 54 months. Post hoc chi-square tests were used to generate p values in order to assess whether respective sets of variables were different from one another ( p s below .001 have been rounded to .001).

Columns 4 through 6 show analogous models for the measure of achievement at age 15. The magnitudes of the age-15 correlations were remarkably similar to the Grade 1 correlations. The age-15 achievement correlation in the absence of other controls was of moderate size and statistically significant, β = 0.24, SE = 0.04, p < .001; but fell substantially when controls for earlier characteristics were added, β = 0.08, SE = 0.03, p = .016; and became nonsignificant when 54-months controls were added, β = 0.05, SE = 0.03, p = .140. Given that Shoda and colleagues found almost as strong correlations with later behavior as with later achievement, we were surprised to find virtually no relationship—even in the absence of controls—between delay of gratification and the composite score of mother-reported internalizing and externalizing at either Grade 1 or age 15 (right half of Table 4 ).

Children who waited less than 20 s (i.e., the lowest category) served as the comparison group for our models that represented delay times in a set of dummy variables (see Table 2 for the proportion of students in each category). As shown in Table 4, models of outcomes at both Grade 1 and age 15 that lack control variables show a strong gradient between gratification delay and later achievement. Relative to children who waited less than 20 s, children who waited between 20 s and 2 min scored about one third of a standard deviation higher on the achievement measure at Grade 1 and age 15, and this difference grew to nearly three fourths of a standard deviation for the group that waited the entire 7 min. The entry for Model 1 in the row labeled “p value for test of equality of second, third, and fourth categories” shows that the coefficients produced by the three groups of children who waited longer than 20 s differed significantly from one another (p < .001), as did coefficient differences across all four categorical variables (the p value that is shown in the row labeled “p value for test of equality of all categories”).

At both Grade 1 and age 15, when controls for early child and family characteristics were added to the model (Column 2 for Grade 1; Column 5 for age 15), the coefficients estimated for all three delay-time groups fell by roughly 50%. Surprisingly, the addition of the background controls also flattened out the gradient of the prediction across the gratification-delay distribution. Relative to the less-than-20-s reference group, achievement differences for children who waited more than 20 s but not the full 7 min were strikingly similar to the difference for children who waited the full 7 min. At age 15, the threshold nature of the relationship was most apparent; the coefficients produced by the three groups that waited longer than 20 s all fell between 0.23 and 0.30, and were not close to being statistically significantly different from one another ( p = .752).
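The size of the drop can be checked directly from the categorical coefficients in Table 4. The sketch below computes the percentage reduction for each delay group when the background and HOME controls are added; the dictionary layout and function name are illustrative.

```python
# Categorical-delay coefficients from Table 4 (nondegreed sample).
# Values are (no controls, with child background and HOME controls).
coefficients = {
    "Grade 1": {"0.333-2 min": (0.298, 0.189),
                "2-7 min": (0.424, 0.206),
                "7 min": (0.720, 0.284)},
    "Age 15": {"0.333-2 min": (0.353, 0.230),
               "2-7 min": (0.457, 0.300),
               "7 min": (0.646, 0.234)},
}

def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a coefficient shrinks once controls are added."""
    return 100.0 * (before - after) / before

for wave, groups in coefficients.items():
    for group, (before, after) in groups.items():
        print(f"{wave}, {group}: {percent_reduction(before, after):.0f}% smaller")
```

With these table values, the individual reductions run from about one third to nearly two thirds, so "roughly 50%" is best read as an approximate average across the groups.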

When concurrent 54-months controls were added, coefficients fell even further. At age 15, only the coefficient produced by the group describing children who waited 2 to 7 min retained statistical significance (β = 0.24, SE = 0.10, p = .018), though once again the set of coefficients on the included categories of delay time did not differ from one another ( p = .630). As with the models shown for delay minutes in the achievement-composite columns in Table 4 , we found no statistically significant relationships between gratification delay and the first-grade and age-15 behavioral composites.

In our focal case of age-15 achievement, the return for delaying gratification appeared to be driven by differences between children who managed to wait at least 20 s and those who did not. Figure 1 illustrates this threshold effect with three lines showing the coefficients produced by our delay-of-gratification categories in the age-15 achievement models (i.e., the “Delay minutes (categorical)” section of Table 4 ). The solid line shows coefficients drawn from the no-control model (i.e., Column 4 of Table 4 ), the dashed line shows coefficients from the model with early controls (i.e., Column 5 of Table 4 ), and the dotted line shows coefficients produced by models with the 54-months controls (i.e., Column 6 of Table 4 ).


Predicted achievement score by minutes of delay for children of mothers with no college degree. Error bars represent 95% confidence intervals. Values are shown separately for each of the four delay-of-gratification groups (< 0.333 min, 0.333–2 min, 2–7 min, 7 min); the x -axis shows the deviation in achievement composite scores from the reference group (delay < 0.333 min) against the within-group average amount of time waited. The average wait times for the models with no controls and with child background and Home Observation for Measurement of the Environment (HOME) controls only are displaced by ±.025 to distinguish the sets of error bars. The high-delay group’s coefficients are plotted at 7 min, although the 7-min truncation prevents us from knowing what the mean value of minutes waited would have been for this group in the absence of this limit.

The uncontrolled line has a steep initial jump, followed by a more gradual increase for wait times longer than 20 s. Both lines for the models with controls decrease after 4 min. Using 7 min to anchor the more-than-7-min group is probably an underestimate, but it is clear from the downward trajectory that no assumptions about the distribution of wait times above 7 min would produce a strong positive slope for the last segment of the line. Thus, in the case of children with mothers who lack college degrees, the truncation of delay time at 7 min does not affect the conclusion that children with the highest delay times show similar achievement levels at age 15 as other children who are able to delay for at least 20 s.

Results for children from mothers with college degrees

In Table 5 , we present key results for children of mothers with college degrees. As in Table 4 , we again present results for the continuous measure of delay of gratification and the categorical measures split along parts of the delay-of-gratification distribution. For the continuous measure, we again found evidence of positive unadjusted associations between delay of gratification and later achievement at both Grade 1 (β = 0.18, SE = 0.06, p = .001) and age 15 (β = 0.17, SE = 0.06, p = .007), and the categorical results suggested that much of this association was somewhat linear through the distribution. For the age-15 models, these relations became statistically indistinguishable from zero once controls were added, and the point estimate for the more-than-7-min category was surprisingly small and negative (β = −0.04, SE = 0.15, p = .816). As with the models shown in Table 4 , we again found no evidence of associations between delay of gratification and the behavioral measures at first grade or age 15 in the high-SES sample.

Table 5. Associations Between Delay of Gratification at Age 54 Months and Later Measures of Academic Achievement and Behavior for Children of Mothers With College Degrees

Columns 1–3: achievement composite, Grade 1. Columns 4–6: achievement composite, age 15. Columns 7–9: behavior composite, Grade 1. Columns 10–12: behavior composite, age 15.

| Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Delay minutes (continuous) | 0.178 (0.056) | 0.120 (0.053) | 0.048 (0.045) | 0.167 (0.062) | 0.062 (0.059) | 0.007 (0.054) | −0.049 (0.057) | −0.059 (0.061) | −0.050 (0.046) | 0.031 (0.059) | 0.038 (0.063) | 0.043 (0.055) |
| Delay minutes (categorical) | | | | | | | | | | | | |
| < 0.333 min | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref |
| 0.333–2 min | 0.327 (0.220) | 0.039 (0.198) | 0.148 (0.168) | 0.079 (0.245) | −0.131 (0.216) | −0.085 (0.197) | −0.069 (0.227) | −0.088 (0.228) | −0.184 (0.173) | −0.065 (0.231) | 0.027 (0.232) | −0.083 (0.200) |
| 2–7 min | 0.397 (0.206) | 0.147 (0.184) | 0.134 (0.155) | 0.216 (0.227) | 0.028 (0.199) | −0.032 (0.182) | −0.277 (0.210) | −0.240 (0.209) | −0.265 (0.157) | −0.318 (0.218) | −0.217 (0.216) | −0.227 (0.185) |
| 7 min | 0.562 (0.166) | 0.301 (0.154) | 0.193 (0.131) | 0.404 (0.183) | 0.077 (0.166) | −0.036 (0.152) | −0.194 (0.168) | −0.208 (0.174) | −0.214 (0.131) | −0.007 (0.174) | 0.068 (0.180) | 0.052 (0.155) |
| p value for test of equality of all categories | .005 | .100 | .521 | .059 | .674 | .979 | .515 | .584 | .350 | .267 | .367 | .227 |
| p value for test of equality of second, third, and fourth categories | .238 | .153 | .843 | .149 | .477 | .948 | .629 | .753 | .867 | .147 | .206 | .115 |
| Control variables included | | | | | | | | | | | | |
| Child background and HOME | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
| Concurrent 54 month | No | No | Yes | No | No | Yes | No | No | Yes | No | No | Yes |

Note: n = 366. For the continuous and categorical measures of delay minutes, the table gives standardized coefficients (with standard errors in parentheses). For the categorical measure, < 0.333 min was the reference category. Because outcome variables were standardized, coefficients can be interpreted as effect sizes. Estimates shown in the first column of each set (i.e., Columns 1, 4, 7, and 10) contained only the measure of delay of gratification and a given outcome measure. Estimates shown in the second column of each set (i.e., Columns 2, 5, 8, and 11) added child background characteristics, Home Observation for Measurement of the Environment (HOME) scores, and site dummy variables. Estimates shown in the third column of each set (i.e., Columns 3, 6, 9, and 12) added other behavioral and cognitive measures also measured at age 54 months. Post hoc chi-square tests were used to generate p values in order to assess whether respective sets of variables were different from one another (ps below .001 have been rounded to .001). Estimates in this table can be directly compared with estimates from Table 4. The sample was limited to children whose mothers had completed at least 16 years of education (i.e., completed college).

Despite statistically nonsignificant results, point estimates were sometimes positive and substantial (e.g., the 2–7 min group coefficient shown in Column 1; β = 0.40, SE = 0.21, p = .054), but the standard errors were nearly double those estimated for children of nondegreed mothers (Table 4). This is due in part to the somewhat smaller sample size for the higher-SES sample but also to the lack of variation in the delay-of-gratification measure for this sample. Thus, although we found even less evidence of associations between delay of gratification and measures of later achievement when considering only the children of mothers with college degrees, it is difficult to draw strong conclusions from these models given the imprecise nature of their coefficient estimates.
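
The imprecision described here can be made concrete with a small calculation. The sketch below is illustrative only: it assumes a normal approximation to the test statistic (the article does not state the reference distribution used, so results may differ slightly from the reported values), and the function names are hypothetical rather than taken from the authors' Stata files. The inputs are the 2–7 min coefficient and standard error from Column 1 of Table 5.

```python
import math


def two_sided_p(beta, se):
    """Two-sided p value for H0: beta = 0, using a normal approximation (an assumption)."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))


def conf_interval_95(beta, se):
    """Approximate 95% confidence interval: beta +/- 1.96 * SE."""
    half_width = 1.959964 * se
    return beta - half_width, beta + half_width


# Table 5, Column 1, 2-7 min group: coefficient 0.397, standard error 0.206.
beta, se = 0.397, 0.206
p = two_sided_p(beta, se)
low, high = conf_interval_95(beta, se)
print(f"p = {p:.3f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

Under this approximation the p value lands close to the reported .054, and the interval runs from just below zero to roughly 0.8 standard deviations, which is why a sizeable point estimate still cannot be distinguished from no association.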

Additional results and sensitivity checks

Heterogeneity.

Because we found little evidence supporting associations between early delay ability and later outcomes for the higher-SES sample, we next tested whether the different pattern of results observed between the higher- and lower-SES samples constituted a statistically significant difference. In Table 6, we present models that included interaction terms between the various measures of delay of gratification (i.e., the continuous and categorical measures) and the indicator for whether the participant’s mother completed college. None of the interactions tested were statistically significant, and our series of joint F tests indicated that the set of interactions for the categorical measures of delay of gratification did not statistically significantly contribute to any of the models (ps = .342–.968). However, as with the models that were run solely on the sample of children with college-educated mothers, standard errors were quite large for the interaction terms, indicating a substantial level of statistical imprecision. Unfortunately, the wide confidence intervals on many of the interaction terms render it impossible to provide a definitive answer to whether the relation between early delay ability and later achievement differs by SES.

Table 6. Associations Between Delay of Gratification at Age 54 Months and Later Measures of Academic Achievement With Interactions Between Delay of Gratification and Socioeconomic Status

Columns 1–3: achievement composite, Grade 1. Columns 4–6: achievement composite, age 15. Columns 7–9: behavior composite, Grade 1. Columns 10–12: behavior composite, age 15.

| Variable | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Delay minutes (continuous) | 0.279 (0.038) | 0.115 (0.035) | 0.050 (0.030) | 0.236 (0.040) | 0.083 (0.037) | 0.040 (0.034) | −0.059 (0.042) | −0.019 (0.043) | 0.012 (0.033) | −0.062 (0.044) | −0.023 (0.046) | 0.009 (0.040) |
| High-SES indicator | 0.509 (0.064) | 0.050 (0.068) | 0.032 (0.059) | 0.747 (0.067) | 0.270 (0.071) | 0.266 (0.066) | −0.187 (0.070) | 0.026 (0.084) | 0.031 (0.064) | −0.286 (0.074) | −0.119 (0.088) | −0.127 (0.077) |
| Interaction | −0.101 (0.067) | −0.043 (0.058) | −0.035 (0.050) | −0.069 (0.069) | −0.007 (0.061) | −0.018 (0.057) | 0.010 (0.073) | −0.038 (0.071) | −0.058 (0.054) | 0.094 (0.076) | 0.040 (0.075) | 0.017 (0.066) |
| Delay minutes (categorical) | | | | | | | | | | | | |
| < 0.333 min | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref |
| 0.333–2 min | 0.298 (0.127) | 0.182 (0.110) | 0.109 (0.096) | 0.353 (0.131) | 0.202 (0.115) | 0.151 (0.107) | 0.055 (0.140) | 0.060 (0.137) | 0.050 (0.104) | −0.140 (0.148) | −0.082 (0.145) | −0.097 (0.127) |
| 2–7 min | 0.424 (0.127) | 0.215 (0.110) | 0.053 (0.097) | 0.457 (0.132) | 0.288 (0.115) | 0.199 (0.108) | −0.088 (0.140) | −0.046 (0.137) | 0.006 (0.105) | −0.182 (0.146) | −0.103 (0.143) | −0.024 (0.126) |
| 7 min | 0.721 (0.099) | 0.308 (0.090) | 0.147 (0.079) | 0.646 (0.105) | 0.222 (0.097) | 0.121 (0.091) | −0.121 (0.109) | −0.025 (0.112) | 0.034 (0.086) | −0.193 (0.116) | −0.087 (0.120) | −0.028 (0.106) |
| High-SES indicator | 0.585 (0.174) | 0.154 (0.156) | 0.041 (0.136) | 0.951 (0.178) | 0.428 (0.163) | 0.417 (0.151) | −0.097 (0.187) | 0.163 (0.190) | 0.191 (0.144) | −0.375 (0.195) | −0.185 (0.199) | −0.138 (0.174) |
| Interactions | | | | | | | | | | | | |
| High SES × < 0.333 min | 0.029 (0.252) | −0.164 (0.218) | 0.032 (0.190) | −0.274 (0.259) | −0.337 (0.226) | −0.266 (0.210) | −0.124 (0.275) | −0.127 (0.269) | −0.160 (0.205) | 0.075 (0.284) | 0.119 (0.276) | 0.035 (0.243) |
| High SES × 2–7 min | −0.027 (0.240) | −0.138 (0.206) | 0.010 (0.179) | −0.241 (0.246) | −0.293 (0.213) | −0.258 (0.198) | −0.188 (0.260) | −0.185 (0.252) | −0.199 (0.192) | −0.136 (0.272) | −0.090 (0.261) | −0.156 (0.229) |
| High SES × 7 min | −0.159 (0.192) | −0.119 (0.165) | −0.033 (0.144) | −0.242 (0.197) | −0.119 (0.173) | −0.134 (0.161) | −0.073 (0.207) | −0.167 (0.201) | −0.203 (0.153) | 0.186 (0.217) | 0.115 (0.212) | 0.049 (0.186) |
| p value from interaction-term joint test | .668 | .870 | .968 | .640 | .342 | .507 | .899 | .859 | .610 | .450 | .753 | .720 |
| Control variables included | | | | | | | | | | | | |
| Child background and HOME | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
| Concurrent 54 month | No | No | Yes | No | No | Yes | No | No | Yes | No | No | Yes |

Note: n = 918. For the continuous and categorical measures of delay minutes, the table gives standardized coefficients (with standard errors in parentheses). For the categorical measure, < 0.333 min was the reference category. Because continuous variables were standardized, coefficients can be interpreted as effect sizes. Estimates shown in the first column of each set (i.e., Columns 1, 4, 7, and 10) contained only the measure of delay of gratification and a given outcome measure. Estimates shown in the second column of each set (i.e., Columns 2, 5, 8, and 11) added child background characteristics, Home Observation for Measurement of the Environment (HOME) scores, and site dummy variables. Estimates shown in the third column of each set (i.e., Columns 3, 6, 9, and 12) added other behavioral and cognitive measures also measured at age 54 months. Post hoc chi-square tests were used to generate p values in order to assess whether respective sets of variables were different from one another (ps below .001 have been rounded to .001). The joint F test evaluated whether the set of interaction terms jointly contributed to the model. In other words, it tested whether the set of interactions were statistically significantly different from zero.
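
One way to read the continuous-measure rows of Table 6 is as two slopes: with the high-SES indicator coded 0 for children of nondegreed mothers, the delay coefficient is the slope for the lower-SES group, and adding the interaction term gives the slope implied for the higher-SES group. The sketch below works through that arithmetic for the unadjusted achievement models (Columns 1 and 4). It is an illustration rather than the authors' code, and the function name is hypothetical.

```python
# Unadjusted coefficients copied from Table 6 (Columns 1 and 4).
table6 = {
    "Grade 1 achievement (Column 1)": {"delay": 0.279, "interaction": -0.101},
    "Age-15 achievement (Column 4)": {"delay": 0.236, "interaction": -0.069},
}


def implied_slopes(delay, interaction):
    """Return (lower-SES slope, higher-SES slope) implied by the interaction model."""
    return delay, delay + interaction


for label, coefs in table6.items():
    low_ses, high_ses = implied_slopes(coefs["delay"], coefs["interaction"])
    print(f"{label}: lower-SES slope = {low_ses:.3f}, higher-SES slope = {high_ses:.3f}")
```

For these two columns the implied higher-SES slopes (0.178 and 0.167) equal the continuous coefficients reported for the college-degree sample in Columns 1 and 4 of Table 5. A standard error for the summed slope would also require the covariance between the two estimates, which the table does not report.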

Measurement considerations

In Table 7, we present correlations between the marshmallow test and all analysis variables for the full sample of children considered in our analyses (n = 918; see the Supplemental Material for correlation matrices for the lower-SES and higher-SES samples, respectively). In Table 7, we also included the 54-month measure of the Continuous Performance Task (CPT; Barkley, 1994), which is a commonly used indicator of attention and impulsivity, and we included the Duckworth et al. (2013) parent- and teacher-report index of 54-month self-control (see the Supplemental Material for measurement details). We included these additional measures to further investigate how the marshmallow test might relate to theoretically relevant constructs (see Diamond & Lee, 2011). Surprisingly, the marshmallow test had the strongest correlation with the Applied Problems subtest of the WJ-R, r(916) = .37, p < .001; and correlations with measures of attention, impulsivity, and self-control were lower in magnitude (rs = .22–.30, p < .001). Although these correlational results were far from conclusive, they suggest that the marshmallow test should not be thought of as a mere behavioral proxy for self-control, as the measure clearly relates strongly to basic measures of cognitive capacity.
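
To see why correlations of this size are reported as p < .001 with 916 degrees of freedom, the standard t statistic for a Pearson correlation can be computed directly. The sketch below is a hedged illustration: the function names are hypothetical, and the p value uses a normal approximation, which is an assumption that is reasonable only because the degrees of freedom are large.

```python
import math


def correlation_t(r, df):
    """t statistic for H0: rho = 0, given a Pearson r and its degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))


def approx_two_sided_p(t):
    """Normal approximation to the two-sided p value; an assumption suited to large df."""
    return math.erfc(abs(t) / math.sqrt(2))


df = 916  # n - 2 for n = 918
for label, r in [("Applied Problems (WJ-R)", 0.37), ("Attention (54 months)", 0.22)]:
    t = correlation_t(r, df)
    below = approx_two_sided_p(t) < 0.001
    print(f"{label}: r = {r:.2f}, r^2 = {r * r:.3f}, t({df}) = {t:.2f}, p < .001: {below}")
```

The squared correlation is a reminder of scale: even the strongest correlate, Applied Problems, shares only about 14% of its variance with delay time.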

Table 7. Correlations Between All Analysis Variables

Each row lists the correlations with Variables 1, 2, 3, and so on, in order (lower triangle of the matrix).

Variable | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
Gratification delay (54 months)
1. Continuous |
2. < 0.333 min | −.69
3. 0.333–2 min | −.47 −.18
4. 2–7 min | −.07 −.19 −.16
5. 7 min | .90 −.51 −.43 −.45
Related measures
6. Self-control (54 months) | .24 −.15 −.15 −.03 .24
7. Attention (54 months) | .22 −.18 −.07 −.08 .24 .15
8. Impulsivity (54 months) | −.30 .26 .06 .05 −.28 −.28 −.26
Outcome measures
9. Achievement (Grade 1) | .31 −.26 −.08 −.03 .28 .33 .30 −.27
10. Achievement (age 15) | .30 −.25 −.09 −.02 .27 .32 .20 −.23 .64
11. Behavior (Grade 1) | −.08 .06 .05 −.02 −.07 −.30 −.08 .05 −.09 −.11
12. Behavior (age 15) | −.06 .08 .01 −.04 −.04 −.23 −.06 .06 −.11 −.13 .55
Background controls
13. Male | −.05 .06 −.02 .02 −.05 −.20 −.01 .23 −.01 .05 −.00 −.04
14. Black | −.25 .21 .07 .05 −.24 −.16 −.12 .20 −.29 −.33 .06 .00 −.00
15. Hispanic | −.03 −.00 .06 −.02 −.02 −.04 −.02 .03 −.05 −.03 .01 .04 .03 −.08
16. Other | −.04 .00 .03 .04 −.05 −.00 .02 −.02 .02 .01 −.01 −.01 −.03 −.07 −.05
17. Age | .03 −.04 .03 −.02 .02 .03 .06 −.02 .04 −.05 .03 .04 −.00 .03 .01 −.04
18. Log of income | .30 −.26 −.08 −.03 .27 .26 .19 −.19 .37 .40 −.16 −.17 −.02 −.36 −.08 −.01 −.01
19. Mother’s age | .20 −.18 −.05 −.00 .18 .18 .12 −.14 .22 .32 −.18 −.21 −.04 −.28 −.10 −.04 −.04 .54
20. Mother’s education (years) | .25 −.19 −.09 −.04 .24 .27 .16 −.20 .35 .42 −.13 −.17 −.04 −.22 −.11 −.03 −.00 .61 .52
21. Mother’s PPVT score | .28 −.22 −.09 −.08 .28 .29 .12 −.18 .35 .48 −.10 −.07 −.01 −.37 −.11 −.09 −.03 .49 .46 .57
22. Site 1 | −.04 .02 .00 .06 −.06 −.06 .06 −.02 .03 −.14 .09 .07 −.00 .11 −.07 −.07 −.03 −.09 −.09 −.06 −.10
23. Site 2 | .00 −.06 .05 .01 .00 .04 .03 −.03 .06 .10 −.06 −.07 .00 −.11 .23 −.01 −.18 .16 .06 .02 .04 −.10
24. Site 3 | .07 −.05 −.03 −.02 .07 −.04 .02 −.09 −.04 −.08 .04 .02 −.02 −.02 .08 −.03 .10 −.07 −.05 .02 −.02 −.10 −.11
25. Site 4 | −.00 .02 −.01 −.01 −.00 .02 .04 .09 −.02 .05 −.07 −.04 .02 −.05 −.04 .04 .00 .05 .07 −.04 −.02 −.10 −.11 −.11
26. Site 5 | −.06 .02 .06 −.00 −.06 .02 .03 .01 −.02 −.05 −.02 −.02 .02 .13 −.06 −.02 .22 −.05 .03 .02 −.01 −.11 −.11 −.11 −.11
27. Site 6 | .03 −.01 −.04 −.01 .04 .06 .04 −.03 .04 .09 −.07 −.08 −.02 .11 −.06 −.01 −.10 .12 .09 .13 .10 −.10 −.10 −.11 −.10 −.11
28. Site 7 | −.05 .04 .00 .01 −.04 −.02 −.10 .12 −.05 .02 −.02 −.03 .02 .01 .00 .02 .14 −.09 −.07 −.08 −.01 −.11 −.11 −.11 −.11 −.12 −.11
29. Site 8 | .06 .00 −.05 −.09 .09 .10 −.00 −.08 −.01 .05 −.02 .05 −.00 −.05 .00 .11 −.19 .08 .10 .10 .14 −.11 −.11 −.11 −.11 −.12 −.11 −.12
30. Site 9 | −.04 −.00 .04 .04 −.06 −.07 −.01 .00 .05 .02 .03 .03 .01 −.05 −.06 −.05 .04 −.06 −.08 −.12 −.11 −.11 −.11 −.11 −.11 −.12 −.11 −.12 −.12
31. Birth weight (g) | −.01 .02 .01 −.06 .02 −.02 .05 −.01 .11 .10 .02 .09 .12 −.14 .04 −.07 .01 .04 .05 .07 .13 −.02 .03 −.06 .04 −.02 −.03 −.01 .03 .02
32. BBCS | .28 −.22 −.10 −.04 .26 .32 .26 −.29 .54 .50 −.09 −.10 −.15 −.32 −.07 −.02 .01 .45 .32 .42 .40 −.05 .09 .00 .04 .00 .05 −.14 .05 −.03 .08
33. Bayley MDI | .34 −.27 −.08 −.06 .31 .29 .24 −.24 .42 .39 −.08 −.13 −.17 −.32 −.08 −.01 −.02 .40 .23 .36 .34 −.05 .10 −.02 .11 .02 .02 −.18 −.06 −.01 .06 .52
34. Temperament | −.08 .11 .00 −.02 −.06 −.14 −.04 .08 −.11 −.12 .12 .15 −.04 .17 −.01 .05 −.01 −.19 −.19 −.13 −.19 .06 −.08 −.01 −.01 −.04 .02 .02 .02 .03 −.04 −.15 −.12
HOME controls
35. Learning Materials | .29 −.23 −.11 −.02 .27 .31 .15 −.23 .38 .40 −.10 −.11 −.05 −.39 −.12 −.08 .03 .49 .35 .48 .47 −.06 −.09 −.01 −.04 −.01 .08 −.04 .00 .05 .07 .47 .43 −.12
36. Language Stimulation | .21 −.18 −.05 −.04 .20 .17 .08 −.14 .25 .21 −.01 −.06 −.03 −.11 −.12 −.11 .01 .27 .12 .24 .24 .06 −.16 −.15 −.20 .02 .18 .01 .05 .09 .10 .28 .23 −.04 .51
37. Physical Environment | .20 −.13 −.13 .02 .17 .15 .13 −.12 .23 .21 −.09 −.08 .01 −.24 .00 .01 −.03 .28 .18 .23 .19 −.07 −.06 −.05 .03 .09 .02 −.25 −.05 .19 .01 .25 .24 −.08 .41 .28
38. Responsivity | .19 −.13 −.08 −.05 .20 .18 .14 −.12 .19 .17 −.09 −.07 −.02 −.22 −.06 −.07 −.09 .32 .26 .28 .27 −.11 −.01 −.06 −.02 −.11 .30 −.30 .12 .06 .08 .31 .25 −.11 .38 .38 .26
39. Academic Stimulation | .21 −.17 −.06 −.01 .18 .15 .05 −.15 .23 .20 .00 −.01 −.03 −.17 −.09 −.04 .01 .24 .12 .25 .24 −.01 −.18 −.06 −.07 .04 .12 −.11 .05 .09 .08 .33 .26 −.02 .55 .55 .31 .33
40. Modeling | .17 −.11 −.06 −.05 .16 .17 .10 −.07 .23 .25 −.07 −.04 −.05 −.15 −.06 −.06 −.05 .31 .23 .33 .29 −.00 .01 −.09 −.08 −.00 .15 −.08 .03 −.06 .07 .24 .24 −.10 .37 .31 .26 .28 .28
41. Variety | .25 −.15 −.14 −.04 .24 .22 .12 −.21 .28 .29 −.10 −.07 −.03 −.27 −.09 −.05 .01 .41 .27 .39 .37 .02 −.14 −.06 −.04 −.03 .16 −.09 .07 .08 .04 .36 .37 −.09 .56 .41 .33 .33 .43 .35
42. Acceptance | .12 −.07 −.07 −.04 .13 .21 .13 −.17 .16 .19 −.16 −.14 −.05 −.10 −.01 .00 −.04 .23 .21 .24 .20 −.10 −.00 −.06 .01 −.04 .14 .04 .04 −.14 .05 .22 .19 −.05 .28 .20 .17 .19 .14 .32 .23
43. Responsivity-Empirical Scale | .20 −.14 −.08 −.05 .20 .16 .12 −.10 .20 .16 −.06 −.05 −.02 −.18 −.06 −.04 −.06 .31 .20 .26 .25 .04 −.03 −.07 −.03 −.15 .17 −.12 .08 .02 .04 .24 .19 −.12 .35 .48 .26 .77 .29 .27 .28 .23
54-month controls
44. Letter-Word ID (WJ-R) | .28 −.22 −.09 −.03 .25 .29 .25 −.24 .60 .49 −.07 −.07 −.10 −.20 −.08 .05 −.01 .38 .19 .38 .34 −.02 .01 −.01 .03 −.01 .10 −.08 .03 −.05 .07 .61 .40 −.08 .40 .29 .23 .26 .34 .23 .32 .19 .21
45. Applied Problems (WJ-R) | .37 −.28 −.16 −.01 .33 .35 .32 −.32 .62 .56 −.04 −.12 −.10 −.32 −.07 .01 −.02 .42 .28 .40 .43 −.08 .03 .01 .07 −.02 .09 −.08 .04 −.03 .09 .57 .56 −.13 .43 .24 .27 .25 .25 .18 .31 .22 .21 .58
46. Picture Vocabulary (WJ-R) | .28 −.21 −.08 −.09 .28 .25 .22 −.18 .42 .50 −.09 −.04 .10 −.33 −.10 −.01 .02 .42 .32 .40 .48 −.12 .01 .01 .07 −.01 .05 −.04 .10 −.03 .12 .46 .44 −.12 .43 .25 .23 .27 .28 .21 .38 .16 .22 .46 .52
47. Memory for Sentences (WJ-R) | .29 −.25 −.09 −.02 .26 .28 .22 −.21 .42 .43 −.11 −.09 −.04 −.18 −.11 .04 .02 .29 .21 .28 .30 −.08 −.06 −.06 .07 .08 .06 −.05 .04 .03 .08 .39 .43 −.08 .31 .20 .19 .17 .22 .14 .30 .15 .13 .42 .47 .46
48. Incomplete Words (WJ-R) | .23 −.17 −.08 −.06 .22 .15 .19 −.17 .39 .34 −.05 −.12 −.00 −.18 −.10 .01 .03 .24 .18 .24 .27 −.07 −.06 −.04 .02 .05 .04 .00 −.05 .13 .09 .30 .36 −.10 .30 .24 .23 .16 .22 .14 .27 .11 .16 .36 .45 .37 .49
49. Internalizing (CBCL) | −.04 .04 .02 −.00 −.04 −.17 −.05 .08 −.07 −.08 .53 .38 .03 .04 −.01 .03 .09 −.06 −.09 −.10 −.10 .01 −.04 −.02 .03 .05 −.05 .01 −.02 −.03 .04 −.03 −.03 .14 −.07 −.04 −.02 −.04 .02 −.04 −.07 −.07 −.06 −.01 −.04 −.08 −.06 −.04
50. Externalizing (CBCL) | −.10 .07 .07 −.02 −.09 −.39 −.07 .09 −.10 −.12 .63 .47 −.08 .05 .01 −.00 .01 −.12 −.13 −.13 −.11 .04 .01 .03 −.03 .01 −.03 −.01 −.06 .01 .04 −.11 −.05 .12 −.12 −.06 −.09 −.10 −.07 −.11 −.11 −.14 −.11 −.04 −.05 −.09 −.10 −.05 .58

Note: n = 918. All nonmissing cases for each pairwise correlation were included. The Supplemental Material presents correlations for all variables shown separately by mother’s education. BBCS = Bracken Basic Concept Scale; CBCL = Child Behavior Checklist; HOME = Home Observation for Measurement of the Environment; MDI = Mental Development Index; PPVT = Peabody Picture Vocabulary Test; WJ-R = Woodcock-Johnson Psycho-Educational Battery Revised.

In the Supplemental Material, we report further assessments of the extent to which self-control and attention could account for the associations between delay of gratification and later achievement. In Table S3, we included the 54-month measures of attention and impulse control taken from the CPT in the Table 4 models and found that inclusion of the CPT measures accounted for only 21% to 27% of the effect for the less-than-7-min group. In Table S4, we present results from a parallel analysis using the Duckworth et al. (2013) index of self-control, and again we found that coefficients were hardly reduced when the self-control index was included. The small change in the coefficient for the delay-of-gratification measure between models that did and did not include indicators of attention, impulsivity, and self-control raises further questions regarding what constructs are measured by the marshmallow test.

Alternative outcome measures

Returning to our focal sample of children with mothers who had not completed college, we were surprised to see the lack of significant associations between our delay-of-gratification measure and the behavioral measures at Grade 1 and age 15. We also tested models that used alternative indicators of behavior assessed at age 15, including measures of risky behavior from youth self-reports and assessments of impulse control. Surprisingly, we still found virtually no associations between delay of gratification and behavior across any of these alternative measures (Tables S5–S7 in the Supplemental Material). Furthermore, because we relied on aggregated measures of achievement and behavior, we also tested separate models for math, reading, externalizing behaviors, and internalizing behaviors (Table S8 in the Supplemental Material). Results indicated that the achievement associations were similar for both the math and reading measures, and we still found no statistically significant effects on either measure of problem behaviors.

We attempted to extend the famous findings of Mischel and Shoda (Mischel et al., 1988; Mischel et al., 1989; Shoda et al., 1990) by examining associations between early delay of gratification and adolescent outcomes in a more diverse sample of children and with more sophisticated statistical models. As with the earlier studies, we found statistically significant, although smaller, bivariate associations between early delay ability and later achievement. But we also found that these associations were highly sensitive to the inclusion of controls. Moreover, we failed to find even bivariate associations between delay of gratification at age 54 months and a host of behavioral outcomes at age 15, which was remarkable given the stability in self-control measures found in other studies (e.g., Moffitt et al., 2011).

It surprised us that for the children of nondegreed mothers, most of the achievement boost for early delay ability was gained by waiting a mere 20 s. Shoda et al. (1990) argued that the relationship between delay of gratification and academic achievement might be driven by the ability to generate useful metacognitive strategies that will influence self-regulation throughout one’s life. Such strategies are unlikely to have played much of a role in a child’s ability to wait for only 20 s. Instead, our findings suggest that impulse control may be a key mechanism, although post hoc inclusion of an explicit measure of impulse control explained some but certainly not most of the delay-of-gratification effect.

These results create further questions regarding what the marshmallow test might measure and how it relates to the umbrella construct of self-control. We observed that delay of gratification was strongly correlated with concurrent measures of cognitive ability, and controlling for a composite measure of self-control explained only about 25% of our reported effects on achievement. These results suggest that the marshmallow test may capture something rather distinct from self-control. Indeed, Duckworth and colleagues (2013) also investigated the relations among delay of gratification, self-control, and intelligence using the data employed here, and they found that both self-control and intelligence mediated the relation between early delay ability and later outcomes. Our results further suggest that simply viewing delay of gratification as a component of self-control may oversimplify how it operates in young children.

When considering how our results might inform intervention development, recall that models with controls for concurrent measures of cognitive skills and behavior reduced the association between delay of gratification and age-15 achievement to nearly zero. This implies that an intervention that altered a child’s ability to delay but failed to change more general cognitive and behavioral capacities would likely have limited effects on later outcomes. If intervention developers hope to generate program impacts that replicate the long-term marshmallow test findings, targeting the broader cognitive and behavioral abilities related to delay of gratification might prove more fruitful.
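
The size of that reduction can be worked out from the coefficients in Table 6. The sketch below is illustrative only: it uses the continuous delay coefficients for age-15 achievement (Columns 4–6), where the main effect is the slope for children of nondegreed mothers because the high-SES indicator equals zero for that group. The function name is hypothetical and the calculation is not taken from the authors' Stata files.

```python
def share_explained(unadjusted, adjusted):
    """Fraction of the unadjusted coefficient that disappears once controls are added."""
    return (unadjusted - adjusted) / unadjusted


# Continuous delay measure, age-15 achievement, from Table 6 (Columns 4-6).
unadjusted = 0.236       # Column 4: no controls
with_background = 0.083  # Column 5: child background and HOME controls
with_concurrent = 0.040  # Column 6: plus concurrent 54-month measures

for label, coef in [("background and HOME", with_background), ("all controls", with_concurrent)]:
    share = share_explained(unadjusted, coef)
    print(f"{label}: {share:.0%} of the unadjusted association removed")
```

By this arithmetic, roughly two thirds of the unadjusted association is gone once background and HOME controls are added, and more than four fifths once the concurrent 54-month measures are included.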

Indeed, Mischel and Shoda’s original results (Shoda et al., 1990) supported similar conclusions. Recall that they reported long-run correlations between delay of gratification and later outcomes only for children who were not provided with strategies for delaying longer. That the prediction was strong only in trials that relied on natural variation in children’s ability to delay suggests that unobserved factors underlying children’s delay ability may have driven the long-run correlations. Our results support this interpretation.

Our study is not without weaknesses. The 7-min ceiling was limiting, although our nonlinear models indicated that it was unlikely to affect conclusions drawn for the lower-SES sample. For the higher-SES sample, the 7-min ceiling prevented a direct replication of Mischel and Shoda’s original work (e.g., Shoda et al., 1990), as a substantial majority of higher-SES children hit the ceiling. The lack of precision in our higher-SES results was unfortunate, though it should be noted that point estimates in fully controlled models were often very small. At the very least, these results further suggest that bivariate associations between delay of gratification and later outcomes probably contain substantial bias, even for more privileged children.

It should also be noted that variation in our delay-of-gratification measure at age 54 months was not exogenous, so our models could not truly capture the effects that would be produced by exogenously spurred gains in early delay-of-gratification ability. However, our models included an extensive set of control variables that go well beyond the bivariate specifications employed in previous studies (e.g., Shoda et al., 1990). Finally, data not drawn to be nationally representative provide a shaky foundation for generalization.

In sum, our findings suggest that although early delay of gratification did indeed correlate with later achievement for children whose mothers had not completed college, the magnitude of this association was highly sensitive to the inclusion of control variables and did not appear to be linear across the delay-of-gratification distribution. Future work on delay of gratification should continue to examine the processes captured by the marshmallow test and whether early delay-of-gratification interventions would be worthwhile investments for promoting children’s long-run success.

Acknowledgments.

We are grateful to Ana Auger, Drew Bailey, Daniel Belsky, Jay Belsky, Clancy Blair, Peg Burchinal, Angela Duckworth, Dorothy Duncan, Jade Jenkins, Terrie Moffitt, Cybele Raver, and Deborah Vandell for helpful comments on drafts of this manuscript.

Action Editor: Brent W. Roberts served as action editor for this article.

Author Contributions: T. W. Watts and G. J. Duncan developed the study concept and design and wrote the manuscript. T. W. Watts and H. Quan analyzed the data. All authors approved the final manuscript.

Declaration of Conflicting Interests: The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

Funding: This research was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under award number P01-HD065704. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Supplemental Material: Additional supporting information can be found at http://journals.sagepub.com/doi/suppl/10.1177/0956797618761661

The primary data set used in this study was from the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development. Our data-use agreement prevented us from posting these data online, but the data set is available on request from the ICPSR website (https://www.icpsr.umich.edu/icpsrweb/ICPSR/series/00233). The secondary data set was from the Early Childhood Longitudinal Program and can be found at the National Center for Education Statistics website (https://nces.ed.gov/ecls/dataproducts.asp). The three Stata files necessary to replicate the results given here in the tables, along with the complete Open Practices Disclosure for this article, can be found in the Supplemental Material (http://journals.sagepub.com/doi/suppl/10.1177/0956797618761661).

The design and analysis plans for the study were not preregistered. This article has received the badge for Open Data. More information about the Open Practices badges can be found at http://www.psychologicalscience.org/publications/badges.

  • Achenbach T. M. (1991). Manual for the Child Behavior Checklist/4-18 Profile. Burlington: Department of Psychiatry, University of Vermont.
  • Barkley R. A. (1994). The assessment of attention in children. In Lyon G. R. (Ed.), Frames of reference for the assessment of learning disabilities: New views on measurement issues (pp. 69–102). Baltimore, MD: Brookes.
  • Bayley N. (1991). Bayley scales of infant development (2nd ed.). New York, NY: Psychological Corp.
  • Bembenutty H., Karabenick S. A. (2004). Inherent association between academic delay of gratification, future time perspective, and self-regulated learning. Educational Psychology Review, 16, 35–57.
  • Bracken B. A. (1984). Bracken Basic Concept Scale. Chicago, IL: Psychological Corp.
  • Caldwell B. M., Bradley R. H. (1984). Home Observation for Measurement of the Environment. Little Rock: University of Arkansas at Little Rock.
  • Campbell D. (1986). Science’s social system of validity-enhancing collective belief change and the problems of the social sciences. In Fiske D., Shweder R. (Eds.), Metatheory in social science (pp. 108–135). Chicago, IL: University of Chicago Press.
  • Casey B. J., Somerville L. H., Gotlib I. H., Ayduk O., Franklin N. T., Askren M. K., . . . Shoda Y. (2011). Behavioral and neural correlates of delay of gratification 40 years later. Proceedings of the National Academy of Sciences, USA, 108, 14998–15003. doi: 10.1073/pnas.1108561108
  • Diamond A., Lee K. (2011). Interventions shown to aid executive function development in children 4 to 12 years old. Science, 333, 959–964.
  • Duckworth A. L., Tsukayama E., Kirby T. A. (2013). Is it really self-control? Examining the predictive power of the delay of gratification task. Personality and Social Psychology Bulletin, 39, 843–855.
  • Duncan G. J., Engel M., Claessens A., Dowsett C. J. (2014). Replication and robustness in developmental research. Developmental Psychology, 50, 2417.
  • Flook L., Goldberg S. B., Pinger L., Davidson R. J. (2015). Promoting prosocial behavior and self-regulatory skills in preschool children through a mindfulness-based kindness curriculum. Developmental Psychology, 51, 44–51.
  • Imuta K., Hayne H., Scarf D. (2014). I want it all and I want it now: Delay of gratification in preschool children. Developmental Psychobiology, 56, 1541–1552.
  • Kidd C., Palmeri H., Aslin R. N. (2013). Rational snacking: Young children’s decision-making on the marshmallow task is moderated by beliefs about environmental reliability. Cognition, 126, 109–114.
  • Kumst S., Scarf D. (2015). Your wish is my command! The influence of symbolic modelling on preschool children’s delay of gratification. PeerJ, 3, Article e774. doi: 10.7717/peerj.774
  • Medoff-Cooper B., Carey W. B., McDevitt S. C. (1993). The Early Infancy Temperament Questionnaire. Journal of Developmental & Behavioral Pediatrics, 14, 230–235.
  • Michaelson L. E., Munakata Y. (2016). Trust matters: Seeing how an adult treats another person influences preschoolers’ willingness to delay gratification. Developmental Science, 19, 1011–1019.
  • Mischel W. (1974). Processes in delay of gratification. In Berkowitz L. (Ed.), Advances in experimental social psychology (Vol. 7, pp. 249–292). New York, NY: Academic Press.
  • Mischel W. (2014). The marshmallow test: Why self-control is the engine of success. New York, NY: Little, Brown.
  • Mischel W., Ayduk O., Berman M. G., Casey B. J., Gotlib I. H., Jonides J., Shoda Y. (2010). ‘Willpower’ over the life span: Decomposing self-regulation. Social Cognitive and Affective Neuroscience, 6, 252–256.
  • Mischel W., Shoda Y., Peake P. K. (1988). The nature of adolescent competencies predicted by preschool delay of gratification. Journal of Personality and Social Psychology, 54, 687–696.
  • Mischel W., Shoda Y., Rodriguez M. L. (1989). Delay of gratification in children. Science, 244, 933–938.
  • Moffitt T. E., Arseneault L., Belsky D., Dickson N., Hancox R. J., Harrington H., . . . Caspi A. (2011). A gradient of childhood self-control predicts health, wealth, and public safety. Proceedings of the National Academy of Sciences, USA, 108, 2693–2698.
  • Murray J., Theakston A., Wells A. (2016). Can the attention training technique turn one marshmallow into two? Improving children’s ability to delay gratification. Behaviour Research and Therapy, 77, 34–39.
  • NICHD Early Child Care Research Network. (2002). Early child care and children’s development prior to school entry: Results from the NICHD Study of Early Child Care. American Educational Research Journal, 39, 133–164.
  • Protzko J. (2015). The environment in raising early intelligence: A meta-analysis of the fadeout effect. Intelligence, 53, 202–210. doi: 10.1016/j.intell.2015.10.006
  • Robins L. N. (1978). Sturdy childhood predictors of adult antisocial behaviour: Replications from longitudinal studies. Psychological Medicine, 8, 611–622.
  • Rodriguez M. L., Mischel W., Shoda Y. (1989). Cognitive person variables in the delay of gratification of older children at risk. Journal of Personality and Social Psychology, 57, 358–367.
  • Romer D., Duckworth A. L., Sznitman S., Park S. (2010). Can adolescents learn self-control? Delay of gratification in the development of control over risk taking. Prevention Science, 11, 319–330.
  • Rueda M. R., Checa P., Cómbita L. M. (2012). Enhanced efficiency of the executive attention network after training in preschool children: Immediate changes and effects after two months. Developmental Cognitive Neuroscience, 2, S192–S204. doi: 10.1016/j.dcn.2011.09.004
  • Rybanska V., McKay R., Jong J., Whitehouse H. (2017). Rituals improve children’s ability to delay gratification. Child Development, 89, 349–359. doi: 10.1111/cdev.12762
  • Shimoni E., Asbe M., Eyal T., Berger A. (2016). Too proud to regulate: The differential effect of pride versus joy on children’s ability to delay gratification. Journal of Experimental Child Psychology, 141, 275–282.
  • Shoda Y., Mischel W., Peake P. K. (1990). Predicting adolescent cognitive and self-regulatory competencies from preschool delay of gratification: Identifying diagnostic conditions. Developmental Psychology, 26, 978–986.
  • StataCorp. (2017). Stata Statistical Software: Version 15.0 [Computer software]. College Station, TX: StataCorp LLC.
  • Watts T. W., Duncan G. J., Siegler R. S., Davis-Kean P. E. (2014). What’s past is prologue: Relations between early mathematics knowledge and high school achievement. Educational Researcher, 43, 352–360.
  • Woodcock R. W., McGrew K. S., Mather N. (2001). Woodcock-Johnson Tests of Achievement. Itasca, IL: Riverside.

The Marshmallow Test for Grownups

by Ed Batista

Originally conducted by psychologist Walter Mischel in the late 1960s, the Stanford marshmallow test has become a touchstone of developmental psychology. Children at Stanford’s Bing Nursery School, aged four to six, were placed in a room furnished only with a table and chair. A single treat, selected by the child, was placed on the table. (In addition to marshmallows, the researchers also offered Oreo cookies and pretzel sticks.) Each child was told if they waited for 15 minutes before eating the treat, they would be given a second treat. Then they were left alone in the room.

The Bing “Marshmallow Studies”: 50 Years of Continuing Research

By Janine Zacharia, Journalist and Bing Parent 

Walter Mischel’s pioneering research at Bing in the late 1960s and early 1970s famously explored what enabled preschool-aged children to forgo immediate gratification in exchange for a larger but delayed reward.

Resisting temptation, Mischel noted in a speech to several hundred Bing parents, is a problem that goes back to the story of Adam and Eve and the apple, and to Ulysses, who “tied himself to the mast to resist his temptations.” But until Mischel’s research at Bing, it was bypassed in modern science. Mischel, now a psychology professor at Columbia University, spoke at Stanford’s CEMEX Auditorium on Nov. 19, 2014.

The deliberately simple method Mischel devised to study willpower became known in popular culture as the “Marshmallow Test.” Mischel began by observing how those Bing children who could wait distracted themselves to avoid the temptations and used their imaginations to keep on waiting for their chosen goal. Some children turned their backs to the treats, or covered up their eyes so they couldn’t see them, or sang quietly to themselves (“Oh this is my home in Redwood City”). Others played with their toes as if they were piano keys, explored their nasal and ear cavities, or invented songs and games to amuse themselves to make the delay easier. And some sat quietly while giving themselves whispered self-instructions, repeating the contingency: “If I wait, then I get both; but if I don’t, then I just get one.”

This research identified some of the key cognitive skills, strategies, plans and mindsets that enable self-control. If the children focused on the “hot” qualities of the temptations (e.g., “The marshmallows are sweet, chewy, yummy”), they soon rang the bell to bring the researcher back. If they focused on their abstract “cool” features (“The marshmallows are puffy and round like cotton balls”), they managed to wait longer than the researchers, watching them through a one-way observation window, could bear. And when they imagined that the treats facing them were “just a picture” and were cued to “put a frame around it in your head” they were able to wait for almost 18 minutes. When Mischel asked a child how she managed to wait so long, she replied: “well you can’t eat a picture.”

These studies demystified willpower and showed how self-control and emotion regulation could be enhanced, taught and learned, beginning very early in life, even by children who initially had much difficulty delaying gratification.

The Bing research also yielded a surprise: What the preschoolers did as they tried to wait unexpectedly predicted much about their future lives. “The more seconds they waited at age 4 or 5, the higher their SAT scores and the better their rated social and cognitive function in adolescence,” Mischel writes in his recent book, The Marshmallow Test: Mastering Self-Control.

Children who waited longer tended to become more self-reliant, more self-confident, less distractible and more able to cope with stress as adolescents, he said. But, he added, he has reassured anxious parents over the years that a child’s ability to delay gratification in preschool does not determine their future. “Clearly, your future is not in a marshmallow,” he said, debunking the pithy but incorrect way popular media have summed up his findings.

It’s a terrible mistake to think that if a child can’t wait 15 minutes, the child has serious problems, Mischel said in his talk. “If the child is waiting 15 minutes, it is telling you something important: Now you know she is able to wait effectively for something when she really wants it.” But if she doesn’t wait it may mean she just didn’t think it was worth waiting for.

Most important, the research over the years by Mischel and many others has helped to clarify the mental and brain mechanisms that underlie self-control, and stimulated decades of research on “executive function” and self-regulation. Many educators and parents use the findings to help children learn self-control. As the public interest in willpower has increased, so has the research on how self-control works. In one follow-up study, published in 2011, Bing participants returned to Stanford 40 years later so that the researchers could examine aspects of their brain activity and how they relate to their self-control earlier in life.

In his recent book, Mischel gave an example from his own life. Fifty years ago he was a three-pack-a-day smoker who once caught himself in the shower smoking a pipe. He knew he needed to stop. But it wasn’t until Mischel saw a patient at Stanford Hospital being wheeled on a gurney, his head and chest shaved smooth, little paint marks to show where the radiation should go to treat him for metastasized lung cancer, that he managed to quit. Imagining himself on that gurney helped him change his habit. Seeing and vividly remembering that cancer patient each time he was tempted to smoke made those potential future consequences immediate and powerful. He called this “pre-living a delayed outcome” so that it is experienced in the here and now and not discounted because it’s far off.

In his acknowledgments, he expresses gratitude “to the children and families whose contributions and unstinting cooperation, often over the course of many years” made the Bing research possible. Two of Mischel’s graduate students, Yuichi Shoda and Philip K. Peake, have continued to work with him for more than three decades on the research begun at Bing. Mischel, Shoda and Peake will be honored on Sept. 17 at the Library of Congress with the “2015 Golden Goose Award,” given for federally funded long-term basic research that turns out, often unexpectedly, to have important applications for human welfare.

The Stanford Marshmallow Test

Let’s say you are looking at a marshmallow. You have two options. You can wait to eat the first marshmallow and get a second one. Or you can have one marshmallow now, but then you’re done.

Would you wait for the second marshmallow? If so, how long would you wait?

This was the dilemma facing preschool-age children in the 1960s. Little did they know that the way they handled the dilemma would be part of one of the most famous psychological studies of all time. In this video, I’m going to tell you all about the Stanford Marshmallow Test: what it was, what it’s said about success, and the impact it’s had on psychology to this day. 

The Stanford Marshmallow Test 

The premise of the test was simple. Stanford professor Walter Mischel and his team put a single marshmallow in front of a child, usually 4 or 5 years old. They told the child that they would leave the room and come back in a few minutes. If the child ate the marshmallow, they would not get a second. If the child waited until the researcher was back in the room, the child would get a second marshmallow. 

Researchers recorded which children ate the marshmallow and which ones waited. And then the researchers waited. When the children were teenagers, the researchers revisited them and asked their parents a series of questions about their cognitive abilities, how they handled stress, and their ability to exhibit self-control under pressure. They also looked at the children’s SAT scores. A few years later, the researchers tested the participants again on their self-control.

What did they find? In short, Mischel and his team found that developing self-control as a child had a profound impact on the child’s later success in life. Success came in many forms. In general, the children who waited for the second marshmallow:

  • Earned higher SAT scores
  • Reported lower levels of substance abuse
  • Were less likely to be obese 
  • Had better social skills and self-control, according to their parents

Because the study followed the children for years, the results were not published until the 1980s and the 1990s. Since then, the world of psychology has regarded it as one of the field’s important studies, paving the way for different ways of looking at how personality influences and predicts success.

Why Would Children Eat The Marshmallow Right Away? 

The Marshmallow Test was able to give researchers a link between self-control and success. In short, having self-control as a child could influence success as an adult. But what influenced self-control? Not all children grabbed the marshmallow right away. 

Mischel and his team developed a “hot-and-cool” system of thinking that explained why children would have eaten the marshmallow immediately. This same system could be applied to any task that involves instant gratification, like making a purchase or smoking a cigarette. 

The “cool” system is where most of us are when we’re not tempted. It’s the cognitive ability to think about long-term benefits. We know that smoking is bad for us, and resisting a cigarette will result in long-term health. We know that we will get more marshmallows if we wait. We know that if we go to the gym instead of hitting the snooze button, we will feel more awake later and healthier in the long run.

But “hot” stimuli threaten that cool system. When things warm up and get hot, our behavior becomes impulsive. We smoke the cigarette, take the marshmallow, and hit the snooze. 

Why do some people “heat up” faster than others? Why are some stimuli “hotter” than others? These are questions that psychologists like Mischel are still trying to solve.

Additional Studies Offer Different Explanations 

The Stanford Marshmallow Test took data from a relatively small and not exactly diverse group of participants. Not all researchers were convinced that the test had found the one true key to success. So a more recent study set out to redo the Marshmallow Test, focusing on different social and economic factors that could also play into a child’s success.

The main factor they chose was the mother’s educational background. They split participants into groups based on whether or not the mother had obtained a college degree. Researchers also controlled for factors like family background, early cognitive ability, and the child’s environment at home. 

In short, they found that self-control didn’t exactly have the impact on success that the Marshmallow Experiment said it did. Children who came from wealthier homes were more likely to practice self-control. When the researchers accounted for social and economic factors, they found that self-control on its own no longer predicted success.
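To see what “accounting for” a background factor can do to a link like this, here is a minimal sketch in Python. It is only an illustration, not the researchers’ analysis: the data are simulated, and the names `background`, `wait_time`, and `outcome` are made up for this example. In the simulation, family background drives both how long a child waits and how the child does later, while waiting has no direct effect on the outcome. A partial correlation is one simple way to control for a third variable.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


def simulate(n=20000, seed=0):
    """Simulated (not real) children: background drives both variables."""
    rng = random.Random(seed)
    background = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical model: waiting and outcome each depend on background
    # plus independent noise; waiting does NOT cause the outcome.
    wait_time = [b + rng.gauss(0, 1) for b in background]
    outcome = [b + rng.gauss(0, 1) for b in background]
    return wait_time, outcome, background


wait_time, outcome, background = simulate()
raw = pearson(wait_time, outcome)
controlled = partial_corr(wait_time, outcome, background)
print(f"raw correlation: {raw:.2f}")
print(f"after controlling for background: {controlled:.2f}")
```

In this made-up world, the raw correlation between waiting and the outcome is clearly positive, but it shrinks to roughly zero once background is held constant. That is the general pattern the paragraph above describes, although real data are messier and the study controlled for several factors at once.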

This doesn’t just put the results of self-control into question. It also questions why children grabbed the marshmallow in the first place. Researchers have offered different opinions, including thoughts on how scarcity impacts a child’s ability to take and use resources. Think about it. Wealthy kids don’t have too many problems waiting for food, toys, or other things. Their parents can afford it. They’ll get it. But kids from lower-income families have more to worry about. Food might be scarce. A parent may promise to bring their kids to a nice restaurant or buy them a nice toy, but may not be able to follow through. If something is right in front of you, you might as well take it - you can’t guarantee it will be there later, or that you’ll get a reward for waiting. 

The Impact of the Marshmallow Test

While recent studies have claimed to “debunk” the Marshmallow Test, it’s impossible to deny the impact of the study. Mischel’s work was able to show the world how certain personality traits impacted a child’s chance at success. Further work has since been done on different personality traits and how they relate to success in business, love, etc. We might not know the terms “growth mindset” or “emotional intelligence” if the Marshmallow Test didn’t exist.

The Marshmallow Test is not the only classic experiment that has recently come under criticism. Psychology is currently undergoing what is called a “replication crisis.” Replications of world-renowned experiments like the Marshmallow Test and the Stanford Prison Experiment show that these tests aren’t as solid or accurate as was once taught in schools. Like any type of study that involves the scientific method, psychology is always evolving, and psychologists are continuing to tweak, change, or adjust theories that don’t hold up to modern tests.

Self-control does have an impact on behavior and possibly success, but it’s up to the current and future generations to learn more about just how self-control is influenced, and influences other personality traits and factors.

Questionnaire

Acing the marshmallow test

In a new book, psychologist Walter Mischel discusses how we can all become better at resisting temptation, and why doing so can improve our lives.

By Lea Winerman

Monitor Staff

December 2014, Vol 45, No. 11

Print version: page 28

The plot is funny, but it's based on serious science. In a series of studies that began in the late 1960s and continue today, psychologist Walter Mischel, PhD, found that children who, as 4-year-olds, could resist a tempting marshmallow placed in front of them, and instead hold out for a larger reward in the future (two marshmallows), became adults who were more likely to finish college and earn higher incomes, and were less likely to become overweight.

So what's the lesson to take from this? It's not that the marshmallow test is destiny and that preschoolers who fail it are doomed, Mischel says. Instead, the good news is that the strategies the successful preschoolers used can be taught to people of all ages. By harnessing the power of executive function and self-control strategies, we can all improve our ability to achieve our goals. Today, Mischel's lessons are being applied on Sesame Street and in inner-city charter schools, among other places.

Mischel talked to the Monitor about his decades of research and his new book, "The Marshmallow Test: Mastering Self-Control," that sums them up.

Why is this research important?

To me it's a matter of helping kids to have the freedom to make choices. Whether or not they choose to eat the marshmallow, if they know how to wait for it, is up to them. But they should have the ability to have a real choice.

It doesn't mean that you spend your whole life self-controlling, obviously. A life that's all self-control can be as dismal as a life without any self-control. But it means that you need to have the skills plus the motivation if you want to really optimize your opportunities.

What first drew you to studying this?

It began when my children were young. I was a young faculty member at Stanford, and my three daughters were very closely spaced in age, between 2 and 5 years old. And I saw this remarkable progression that every parent sees, where their children go from clearly not having any self-control competencies, to, by the time they're 4 or 5, being able to control themselves reasonably well in many situations, and even do things like wait for dessert.

This drew me to the questions of how self-control is mastered, how it develops naturally, and what we can do to increase it in our children or ourselves. What are the mental, and — years later — what are the brain mechanisms that make emotional self-regulation and behavioral self-regulation possible?

It is really the story of resistance to temptation — the story of Adam and Eve in the Garden of Eden — that I was interested in. So that's how the marshmallow test was born.

Did you have any inkling back then that you'd be following these then-4-year-olds for the next half century?

Not in the slightest. I was just interested in the mental and behavioral processes that allow kids to self-control. I published a series of experiments in the '60s and '70s and '80s on that, and those studies in my opinion are in many ways more interesting than the follow-up findings that yes, kids who are good at self-control and delay of gratification become grownups who are good at self-regulation and self-control, and that there are substantial differences in outcomes that show that self-control is an extremely important cognitive and emotional skill set.

The good news is that this cognitive and emotional skill set is eminently teachable, particularly early in life. It's great in preschool; it's great within the first few years of life. It's great in adolescence even. And it continues to be a skill set that can be developed even when we're quite mature adults.

Have you found the same effect in other populations?

Yes. To me one of the most exciting findings is in kids from the South Bronx. We've done five-year studies with them, and found that self-control ability has very important protective effects against our own vulnerabilities. So, for example, people who are highly sensitive to exclusion and rejection often find themselves in a pattern that they are so anxious and worried about rejection that they actually behave in ways toward their friends and partners that make them get rejected … so it's a self-fulfilling prophecy.

Well, we followed kids from the South Bronx for five years, and found that the ones who have high rejection sensitivity, but who are able to wait to get a big bag of M&Ms a week later rather than settle for a few M&Ms now, are much better at not being, for example, as inclined toward aggressiveness. They're able to control their reactions, and so they don't see the same result from their rejection sensitivity. So, self-control has this protective effect that's hugely important, and I think that's very important for clinical psychologists to have in mind. It implies that teaching kids to improve self-control skills can have a protective effect because it allows them to deal better with whatever their own "hot spots" are.

What are some ways those skills can be taught, especially to children?

They can be taught in many interesting and child-friendly ways. A particularly good example of that are studies that Michael Posner and his colleagues at the University of Oregon reported in the Proceedings of the National Academy of Sciences in 2005. They worked with 4- to 6-year-old kids, and the idea was to help them acquire better executive function skills using a computer game. For example, in one of the exercises, there's a cat in a rainstorm. The children's job is to use a joystick that controls an umbrella to keep the cat dry as it runs around. So this is teaching executive function: I have a goal, I have to keep the umbrella over the cat's head, I can't get distracted and start looking around, I have to keep that umbrella over the cat's head.

Posner and his crew found that five sessions of 40 minutes each with these kids led to substantial increases in their executive function and executive control, and even some increase in nonverbal IQ measures.

And, of course, there's the work with Sesame Street.

They asked me to give a talk, and I wound up working with them in a minor advisory role. We decided to create situations in which Cookie Monster must learn to control himself because he's got a new goal, which is to join the cookie connoisseur gourmet club — for which you have to wait for your cookies. So Cookie Monster learns strategies, for example "framing" the marshmallows and pretending that they're just a picture, because if it's just a picture you can't eat it. Cookie Monster also comes up with the idea that if he imagines the cookies are smelly fish, then he won't want them. And so on.

The idea is for kids to be exposed to strategies that teach them what executive control is, while they're having fun.

What's crucial to remember is that there are two components to applying this work. One is that kids need to know how to do it — they need to have the cognitive control skills. But they also need the motivation to do it — they have to want to change. And that is exactly what every therapist will tell you. You don't get very far unless people want to change their behavior.

Is it as effective for kids to learn this from Sesame Street as it would be for them to get it from their parents or in the classroom?

I'm not aware of systematic comparison studies that would compare that. But it doesn't hurt to go on all fronts.

At the same time, it's also wonderful that they're being implemented, for example, in the KIPP schools in New York, which I've also been involved with. There, the strategies we're talking about, including the development of the skills that are involved in building character qualities like grit and persistence, tolerance of frustration, gratitude, optimism, excitement and energy as you enter a project … all of these things are being incorporated in many school programs by educators. And I think that's the most exciting way to go.

For preschool-age children, there's a very interesting study reported in Science in 2007 by Adele Diamond and her group, on Tools of the Mind, which is a program for teaching executive function and executive skills to preschoolers. It turned out to be highly effective for kids from highly impoverished, high-stress areas.

So there's a huge amount of evidence that's accumulated from brain studies, behavioral studies and educational studies that makes it very clear that there are methods for enhancing the kinds of skills and qualities and emotion regulation that kids need if they're going to do well in school.

How do parents develop these skills in their children?

Step one is, if you want your children to have self-control, you need to model it. If you make promises, you need to keep them. You can't expect kids to delay gratification if you're breaking your own promises to them. Kids also need to learn that their behavior has consequences. If they behave in constructive and creative ways, the consequences are good. And if they behave in destructive ways, the consequences are not so good. They need to become aware that there's a relationship between what they're doing and what happens to them so that they can develop a sense of agency, a sense of mastery and a sense that they can control their own behavior.

One last question: Are you still following the original participants from the Stanford marshmallow study?

Yes. They're now between their very late forties and early fifties. And we are in the middle of a study, in collaboration with a team of economists from Harvard University, where we've administered to a sample of about 110 of the original Stanford kids a very extensive set of economic outcome measures, in order to see what's the relationship between maintaining these two different patterns of high self-control over the life course versus low self-control over the life course, and economic outcomes.

Now, those are the two trajectories we've studied — consistently high control and consistently low control. But if I had another lifetime, and funding, I would also study the kids who start high, which means they demonstrated that they have good executive control in the original marshmallow study, but over the years they've gone down and down in self-control. And those who start low and went high over the life course. Those are smaller trajectories, but they exist, and they are the ones that we don't know much about.

Kids Do Better on the Marshmallow Test When They Cooperate

Imagine you’re a young child and a researcher offers you a marshmallow on a plate. But there’s a catch: If you can avoid eating the marshmallow for 10 minutes while no one is in the room, you will get a second marshmallow and be able to eat both. What would you do—eat the marshmallow or wait?

This is the premise of a famous study called “the marshmallow test,” conducted by Stanford University professor Walter Mischel in 1972. The experiment measured how well children could delay immediate gratification to receive greater rewards in the future—an ability that predicts success later in life. For example, Mischel found that preschoolers who could hold out longer before eating the marshmallow performed better academically, handled frustration better, and managed their stress more effectively as adolescents. They also had healthier relationships and better health 30 years later.

For a long time, people assumed that the ability to delay gratification had to do with the child’s personality and was, therefore, unchangeable. But more recent research suggests that social factors— like the reliability of the adults around them—influence how long they can resist temptation. (If children learn that people are not trustworthy or make promises they can’t keep, they may feel there is no incentive to hold out.)

Now, findings from a new study add to that science, suggesting that children can delay gratification longer when they are working together toward a common goal.

In the study, researchers replicated a version of the marshmallow experiment with 207 five- to six-year-old children from two very different cultures—Western, industrialized Germany and a small-scale farming community in Kenya (the Kikuyu). Kids were first introduced to another child and given a task to do together. Then, they were put in a room by themselves, presented with a cookie on a plate, and told they could eat it now or wait until the researcher returned and receive two cookies. (The researchers used cookies instead of marshmallows because cookies were more desirable treats to these kids.)

Some kids received the standard instructions. But others were told that they would get a second cookie only if they and the kid they’d met (who was in another room) were able to resist eating the first one. That meant if both cooperated, they’d both win.

To measure how well the children resisted temptation, the researchers surreptitiously videotaped them and noted when the kids licked, nibbled, or ate the cookie. If children did any of those things, they didn’t receive an extra cookie, and, in the cooperative version, their partner also didn’t receive an extra cookie—even if the partner had resisted themselves.

Results showed that both German and Kikuyu kids who were cooperating were able to delay gratification longer than those who weren’t cooperating—even though they had a lower chance of receiving an extra cookie. Apparently, working toward a common goal was more effective than going it alone.

“For children, being in a cooperative context and knowing others rely on them boosts their motivation to invest effort in these kinds of tasks—even this early on in development,” says Sebastian Grueneisen, coauthor of the study.

Grueneisen says that the researchers don’t know why exactly cooperating helped. It could be that relying on a partner was just more fun and engaging to kids in some way, helping them to try harder. Or it could be that having an opportunity to help someone else motivated kids to hold out. After all, a similar study found that children are able to resist temptation better when they believe their efforts will benefit another child. Or perhaps feeling responsible for their partner and worrying about failing them mattered most.

Whatever the case, the results were the same for both cultures, even though the two cultures have different values around independence versus interdependence and very different parenting styles—the Kikuyu tend to be more collectivist and authoritarian, says Grueneisen. This points toward the possibility that cooperation is motivating to everyone.

“I would be careful about making a claim that this is a human universal. But our findings point in that direction, since they can’t be explained by culture-specific socialization,” he says.

This would be good news, as delaying gratification is important for society at large, says Grueneisen. Achieving many social goals requires us to be willing to forgo short-term gain for long-term benefits. For example, preventing future climate devastation requires a populace that is willing to do with less and reduce their carbon footprint now.

Further testing is needed to see if setting up cooperative situations in other settings (like schools) might help kids resist temptations that keep them from succeeding—something that Grueneisen suspects could be the case, but hasn’t yet been studied. Or if emphasizing cooperation could motivate people to tackle social problems and work together toward a better future, that would be good to know, too.

“Cooperation is not just about material benefits; it has social value,” says Grueneisen. “In situations where individuals mutually rely on one another, they may be more willing to work harder in all kinds of social domains.”

About the Author

Jill Suttie

Jill Suttie, Psy.D., is Greater Good’s former book review editor and now serves as a staff writer and contributing editor for the magazine. She received her doctorate of psychology from the University of San Francisco in 1998 and was a psychologist in private practice before coming to Greater Good.

The surprising thing the ‘marshmallow test’ reveals about kids in an instant-gratification world

Here’s a psychological challenge for anyone over 30 who thinks “kids these days” can’t delay their personal gratification: Before you judge, wait a minute.

It turns out that a generation of Americans now working their way through middle school, high school and college are quite able to resist the prospect of an immediate reward in order to get a bigger one later. Not only that, they can wait a minute longer than their parents’ generation, and two minutes longer than their grandparents’ generation could.

It may not sound like much, but being able to hold out for an extra minute or two at a young age may serve them well in the long run. Research suggests that superior results on a delayed-gratification task during the toddler years are associated with better performance in school and in jobs, healthier relationships, and even fewer chronic diseases.

Those findings emerge from a new effort to understand how children’s ability to hold out for the promise of more has changed over time. The study, published this week in the journal Developmental Psychology, resurrected an experiment that’s become a developmental psychology classic: the so-called marshmallow test.

Pioneered in the 1960s by a young Stanford psychology professor named Walter Mischel, the marshmallow test left a child between the ages of 3 and 5 alone in a room with two identical plates, each containing different quantities of marshmallows, pretzels, cookies or another delicious treat. Before leaving the room “to do some work,” the adult researcher instructed the child that the single treat on one plate could be eaten at any time. But if the child could wait for him to return before eating it, the researcher added, she could have the second, bigger treat instead.

After the experimenter closed the door on the subject, researchers on the other side of a two-way mirror monitored the child’s bout with temptation and recorded how long he or she could hold out before licking or eating the treat.

Replicated many times and followed up by a wide range of researchers, the marshmallow test has earned recognition as a powerful predictor of future performance — at least among the white children of well-educated parents. Compared to kids who lunged for the early reward, those who held out for a bigger prize did better in school, got higher SAT scores, had higher self-esteem and better emotional coping skills, and were less likely to abuse drugs.

Other studies found that children unable to defer gratification were more likely to become overweight or obese 30 years later and were in worse general health in adulthood.

The results focused psychologists, early-childhood educators and parents on the key role that self-regulation and executive function can play in a child’s prospects, and on the need to nurture those skills well before kindergarten.

The new study, conducted by Mischel (now at Columbia University) and colleagues around the country, suggests that focus has paid off.

Among the 165 children who participated in the first round of experiments at Stanford from 1965 to 1969, the task tended to be either very hard or pretty easy: close to 30% gobbled up the single treat within 30 seconds of the researchers’ departure from the room, while just over 30% were able to wait the 10 minutes that was the outer limit of the researcher’s absence. Most of the children who did not hold out for 10 minutes ate the treat within six minutes.

These original subjects are now between 52 and 58 years old.

When the marshmallow experiment was replicated in a group of 135 New York City preschoolers from 1985 to 1989, changes seemed to be afoot. About 16% of the kids held out for just 30 seconds or less before snarfing the treat, and about 38% held out for 10 minutes. In between, the trend was for longer holdouts.

These subjects are now between 32 and 38 years old.

By the time University of Minnesota psychologist Stephanie M. Carlson and colleagues at the University of Washington in Seattle ran the exact same experiment with 540 kids from 2002 to 2012, the changes appeared to be real. Close to 60% of the children tested held out the full 10 minutes for a bigger reward. And only about 12% claimed their reward in the first half-minute.

These kids — like the two earlier cohorts, overwhelmingly white from families with relatively high incomes and educational attainment — are now between 11 and 21 years old.

On average, they waited two minutes longer (during a 10-minute period) than those from the 1960s before seizing their reward. And they waited one minute longer than those tested in the 1980s.

Surprised? You’re not alone.

In a survey conducted before performing the new analysis, the study authors found that adults in the United States “generally intuit” that children today are less tolerant of delayed gratification and less self-controlled than children were 50 years ago.

Roughly three-quarters of a representative sample of U.S. adults did not believe that children these days would show much self-restraint for a better reward. And parents — Latino parents especially — were overwhelmingly convinced their own kids would not delay gratification as long as they would have when they were 4 years old.

Carlson wasn’t so sure. On the one hand, she wondered how kids’ self-control would hold up under the influence of daily television and amid a dramatic rise in attention-deficit/hyperactivity disorder (ADHD) diagnoses.

On the other, she knew that research has chronicled a steady rise in kids’ IQ scores (the so-called Flynn effect), which correlates with executive function. And she knew that a growing portion of kids’ screen time, including video games and some social media, can help them learn to manipulate language and other abstractions to accumulate social approval and other rewards.

Higher preschool enrollment and changes in parenting styles, including the rise of the empowered child, also might contribute to generational improvements in kids’ ability to delay gratification, Carlson said. After all, only 15.7% of all 3- and 4-year-olds in the United States attended preschool in 1968. By the year 2000, more than half of kids that age attended schools that stressed social skills and self-control as cornerstones of educational readiness.

Plus, Carlson looked at her own daughters, now 19 and 22, and thought to herself, the kids just might be OK.

The findings “do make me hopeful,” she said. Not only have qualities like perseverance and self-control not disappeared; a simple and unchanged measure of those qualities — the marshmallow test — has withstood many trials, including the test of time.

“Delay of gratification is still a good bellwether of these self-regulation and executive function skills, and we’re learning more every day about how important they are for school readiness and achievement,” Carlson said.

The next challenge, she added, will be to take the marshmallow test into more diverse communities and understand better if it has the same predictive power in kids who are not white, affluent and from well-educated families.

Melissa Healy is a former health and science reporter with the Los Angeles Times who wrote from the Washington, D.C., area. She covered prescription drugs, obesity, nutrition and exercise, and neuroscience, mental health and human behavior. She was at The Times for more than 30 years, and has covered national security, environment, domestic social policy, Congress and the White House.

The Marshmallow Test Gets More Complicated

A new study finds that in a test of self-control, the perception of trustworthiness matters

Sarah Zielinski

A four-year-old girl reenacts the marshmallow test (Credit: J. Adam Fenster / University of Rochester)

When I wrote about the marshmallow test several years ago, it seemed so simple:

A child was given a marshmallow and told he could either ring a bell to summon the researcher and get to eat the marshmallow right away or wait a few minutes until the researcher returned, at which time the child would be given two marshmallows. It’s a simple test of self control, but only about a third of kids that age will wait for the second marshmallow. What’s more interesting, though, is that success on that test correlates pretty well with success later in life. The children who can’t wait grow up to have lower S.A.T. scores, higher body mass indexes, problems with drugs and trouble paying attention.

The initial finding hasn’t been overturned, but a new study in the journal Cognition is adding a layer of complexity to the test with the finding that whether the child perceives the researcher as trustworthy matters.

“Our results definitely temper the popular perception that marshmallow-like tasks are very powerful diagnostics for self-control capacity,” Celeste Kidd, a doctoral candidate in brain and cognitive sciences at the University of Rochester and the study’s lead author, said in a statement.

Kidd and her colleagues started their experiment by adding a step before giving their group of 28 three- to five-year-old children the marshmallow test: In a setup similar to the marshmallow test, the children were given an art task, with a researcher placing before each child either a well-worn set of crayons or a small sticker. The children were promised a better art supply (new crayons or better stickers) if they waited for the researcher to come back. With half of the children, though, the researcher didn’t follow through on that promise, telling the kid that better supplies were unavailable.

And then the researcher administered the marshmallow test.

Children who had been primed to believe that the researcher was reliable waited an average of 12 minutes before eating the marshmallow, but those in the “unreliable” group waited only three minutes. What’s more, nine out of 14 children in the “reliable” group were able to wait the full 15 minutes for the researcher to return, while only one kid in the unreliable group was able to wait that long.
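The group comparison above can be restated as simple proportions. The short Python sketch below only re-expresses the counts reported in this article; the variable and function names are illustrative, and the size of the "unreliable" group (14) is an assumption inferred from the statement that half of the 28 children were in that condition.

```python
# Counts as reported in the article. The unreliable group's size (14) is
# an assumption, inferred from "half of" the 28 children.
full_wait = {"reliable": (9, 14), "unreliable": (1, 14)}


def share_waiting(waited: int, group_size: int) -> float:
    """Return the fraction of a group that waited the full 15 minutes."""
    return waited / group_size


shares = {group: share_waiting(*counts) for group, counts in full_wait.items()}

for group, share in shares.items():
    print(f"{group}: {share:.0%} waited the full 15 minutes")
```

Under that assumption, roughly 64% of the "reliable" group waited the full 15 minutes, compared with about 7% of the "unreliable" group.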

“Delaying gratification is only the rational choice if the child believes a second marshmallow is likely to be delivered after a reasonably short delay,” Kidd said. Self control isn’t so important, it seems, if you don’t think there’s anything worth controlling yourself for.

Kidd got interested in the test after volunteering at a homeless shelter. “There were lots of kids staying there with their families. Everyone shared one big area, so keeping personal possessions safe was difficult,” Kidd said. “When one child got a toy or treat, there was a real risk of a bigger, faster kid taking it away. I read about these studies and I thought, ‘All of these kids would eat the marshmallow right away.’ ”

The study doesn’t invalidate the marshmallow test (willpower is still important), but it does mean that people shouldn’t look at kids who fail the test as being instantly doomed to failure. Instead, parents of kids who appear to lack self control might want to look more closely at why they would eat the marshmallow: is it because they can’t wait or because they can’t trust that the next marshmallow will appear?

Sarah Zielinski is an award-winning science writer and editor. She is a contributing writer in science for Smithsonian.com and blogs at Wild Things, which appears on Science News.

The marshmallow test, revisited

When kids “pass” the marshmallow test, are they simply better at self-control or is something else going on? A new UC San Diego study revisits the classic psychology experiment and reports that part of what may be at work is that children care more deeply than previously known what authority figures think of them.

In the marshmallow test, young children are given one marshmallow and told they can eat it right away or, if they wait a while, while nobody is watching, they can have two marshmallows instead. The half-century-old test is quite well-known. It’s entered everyday speech, and you may have chuckled at an online video or two in which children struggle adorably on hidden camera with the temptation of an immediate treat.

But the real reason the test is famous (and infamous) is that researchers have shown that the ability to wait (to delay gratification in order to get a bigger reward later) is associated with a range of positive life outcomes far down the line, including better stress tolerance and higher SAT scores more than a decade later. Whether it’s just this ability to wait or a host of other socioeconomic and personality factors that are predictive is still up for debate, but the new study, published in the journal Psychological Science, shows that young children will wait nearly twice as long for a reward if they are told their teacher will find out how long they waited.

This is the first demonstration that what researchers call “reputation management” might be a factor.

“The classic marshmallow test has shaped the way researchers think about the development of self-control, which is an important skill,” said Gail Heyman, a University of California, San Diego professor of psychology and lead author on the study. “Our new research suggests that in addition to measuring self-control, the task may also be measuring another important skill: awareness of what other people value.”

In fact, she said, “one reason for the predictive power of delay-of-gratification tasks may be that the children who wait longer care more about what people around them value, or are better at figuring it out.”

For their study, Heyman and her colleagues from UC San Diego and Zhejiang Sci-Tech University conducted two experiments with a total of 273 preschool children in China aged 3 to 4 years old.

The researchers told the children that they could earn a small reward immediately or wait for a bigger one. (Instead of a marshmallow, the researchers used a sticker reward in one of the experiments and a cookie in the other.) Children were assigned to either a “teacher condition” in which they were told that their teacher would find out how long they waited, a “peer condition” in which they were told that a classmate would find out how long they waited, or a “standard condition” that had no special instructions.

Children waited longer in both the teacher and peer conditions than in the standard condition. The difference was about twice as great in the teacher condition as compared to the peer condition. The researchers interpret these results to mean that when children decide how long to wait, they make a cost-benefit analysis that takes into account the possibility of getting a social reward in the form of a boost to their reputation. These findings suggest that the desire to impress others is strong and can motivate human behavior starting at a very young age.

The researchers were surprised by their findings because the traditional view is that 3- and 4-year-olds are too young to care what other people think of them.

“The children waited longer in the teacher and peer conditions even though no one directly told them that it’s good to wait longer,” said Heyman. “We believe that children are good at making these kinds of inferences because they are constantly on the lookout for cues about what people around them value. This may take the form of carefully listening to the evaluative comments that parents and teachers make, or noticing what kinds of people and topics are getting attention in the media.”

The study’s other co-authors are Fengling Ma, Dan Zeng and Fen Xu of Zhejiang Sci-Tech University and Brian J. Compton of UC San Diego.

The contributions of Fengling Ma were supported by grants from the National Natural Science Foundation of China (31400892), from the Natural Science Foundation of Zhejiang Province (LY17C090010) and from the China Scholarship Council.

APS

A New Approach to the Marshmallow Test Yields Complicated Findings

The “marshmallow test” – the famed psychological experiment designed to measure children’s self-control – may not predict life outcomes as much as previously thought, a team of scientists has concluded from results of what they call a “conceptual replication” of the classic research.

The findings are published in Psychological Science. The experiment, led by Tyler W. Watts of New York University, took a modified approach to the test created by APS Past President Walter Mischel in the 1960s.

Unlike direct replications that aim to reproduce the precise methods used in an original study, Watts and his colleagues used a different approach to test the same underlying relationship between delay of gratification and long-term outcomes. The researchers obtained data from a larger and more diverse sample of children – in their study, children were considered to have delayed gratification if they waited 7 minutes to eat the marshmallow, while the participants in Mischel’s sample had been asked to wait up to 20 minutes.

Mischel’s original studies were designed simply to study children’s self-control. In the early tests, preschool children sat alone in a room with a single marshmallow placed on a table in front of them. They were told that if they could resist the temptation to eat the marshmallow (or a cookie, pretzel, or other candy in subsequent versions) for a certain amount of time, they would receive two instead of one. Mischel and collaborators APS Fellow Yuichi Shoda and Philip Peake followed up with a subset of those children when they reached adolescence and found that those who had waited to get two marshmallows as children tended to have better academic achievement and life success compared to those who didn’t wait. Their work earned them a 2015 Golden Goose Award in recognition of their extensive contributions to the understanding of self-control.

Mischel and his colleagues have always said that tests with a larger sample of children might yield smaller effect sizes, and that the home environment could influence academic outcomes more than what the tests could show.

For the new study, Watts and colleagues examined longitudinal data from more than 900 children participating in the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development, a geographically diverse dataset widely used in the field of developmental research.

Much of their analysis focused on a subsample of children whose mothers had not completed college by the time the child was born. This subsample was more representative of the racial and economic makeup of the broader population of children in the United States compared with the original marshmallow experiments, though Hispanic children were still underrepresented, Watts and colleagues noted.

The results showed that, although children who were able to wait and resist temptation tended to have stronger math and reading skills in adolescence, the association was small and disappeared after the researchers controlled for characteristics of the child’s family and early environment. And there was no indication that the ability to delay gratification predicted later behaviors or measures of personality.
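To illustrate how an association can shrink once a shared background factor is taken into account, here is a minimal Python sketch on purely synthetic data. It is not the authors' analysis: the study is described as controlling for family and early-environment characteristics, whereas this sketch swaps in a simple partial correlation, and the variables `family`, `delay`, and `score` are hypothetical stand-ins generated at random.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic data (an assumption for illustration only): a single family
# background factor drives both delay time and later test score, and
# delay has no direct effect on the score.
rng = random.Random(0)
n = 5000
family = [rng.gauss(0, 1) for _ in range(n)]
delay = [f + rng.gauss(0, 1) for f in family]
score = [f + rng.gauss(0, 1) for f in family]

raw = pearson(delay, score)
adjusted = partial_corr(delay, score, family)
print(f"raw correlation: {raw:.2f}")
print(f"after controlling for family background: {adjusted:.2f}")
```

In this toy setup the raw correlation is clearly positive while the adjusted one is close to zero, because the shared background factor accounts for the whole association. Real data would be far messier, and the sketch says nothing about the size of the effects in the actual study.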

The authors concluded that interventions focused only on teaching young children to delay gratification are likely to be ineffective.

“Our findings suggest that an intervention that alters a child’s ability to delay, but fails to change more general cognitive and behavioral capacities, will probably have very small effects on later outcomes,” Watts explained. “If intervention developers hope to generate the kinds of improvements associated with the original marshmallow study, it is likely to be more fruitful to target the broader cognitive and behavioral abilities related to gratification delay.”

Watts, T.W., Duncan, G.J., and Quan, H. (2018). Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes. Psychological Science, doi.org/10.1177/0956797618761661

See Walter Mischel discuss the history of the marshmallow test and other aspects of his storied career in an interview for Inside the Psychologist’s Studio.

Interesting research. I would love to examine the data. Off the top of my head: culture, trauma, and experience have a lot to do with delayed gratification versus immediate gratification. Not to mention how the researcher might have influenced the research and how participants perceived researchers.

“Much of their analysis focused on a subsample of children whose ***mothers had not completed college by the time the child was born.***”

That bit of information more likely than not translates into disadvantaged families. And that translates into a lot of other things that affect the outcome of the research.

Social science studies are always going to have confounding variables that are impossible to control. In many ways, Mischel’s original study population, albeit a small sample size, was relatively homogeneous (white preschoolers enrolled at Stanford’s preschool in the 60’s). While the results may not be generalized, one could argue that the dependent variable is less likely to suffer from outside influences in this population and that the results apply within the group since the household factors for these children may have been very similar.

I like Rob’s comment. I would like to see a study that looks at different homogeneous groups and explores whether more self-control within each homogeneous group leads to better success later in life. For example, do kids from a disadvantaged background with more self-control do better later in life than other disadvantaged kids with less self-control?

My own experience says that self-control is likely to impact success in life.

APS regularly opens certain online articles for discussion on our website. Effective February 2021, you must be a logged-in APS member to post comments. By posting a comment, you agree to our Community Guidelines and the display of your profile information, including your name and affiliation. Any opinions, findings, conclusions, or recommendations present in article comments are those of the writers and do not necessarily reflect the views of APS or the article’s author. For more information, please see our Community Guidelines .

Please login with your APS account to comment.

child experiment with marshmallow

Cattell Fund Projects Include Research on Children’s Executive Function, Empathy Choice, and More

The James McKeen Cattell Fund has recognized APS Fellow Stephanie M. Carlson, C. Daryl Cameron, Robert Hampton, and Kevin Holmes as recipients of its Sabbatical Fund Fellowship for 2023–2024.

child experiment with marshmallow

Teaching: Adolescent Self-Control / Loyalty Benefits and Backfires

Lesson plans about self-control in adolescents and how loyalty can lead us to act ethically or unethically.

child experiment with marshmallow

Embracing Discomfort Can Open Our Minds to New Ideas 

When trying something new, discomfort might feel like a sign we’re in over our heads. Embracing these feelings as a part of learning could help motivate personal growth.

Privacy Overview

CookieDurationDescription
__cf_bm30 minutesThis cookie, set by Cloudflare, is used to support Cloudflare Bot Management.
CookieDurationDescription
AWSELBCORS5 minutesThis cookie is used by Elastic Load Balancing from Amazon Web Services to effectively balance load on the servers.
CookieDurationDescription
at-randneverAddThis sets this cookie to track page visits, sources of traffic and share counts.
CONSENT2 yearsYouTube sets this cookie via embedded youtube-videos and registers anonymous statistical data.
uvc1 year 27 daysSet by addthis.com to determine the usage of addthis.com service.
_ga2 yearsThe _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_gat_gtag_UA_3507334_11 minuteSet by Google to distinguish users.
_gid1 dayInstalled by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
CookieDurationDescription
loc1 year 27 daysAddThis sets this geolocation cookie to help understand the location of users who share the information.
VISITOR_INFO1_LIVE5 months 27 daysA cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSCsessionYSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages.
yt-remote-connected-devicesneverYouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt-remote-device-idneverYouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt.innertube::nextIdneverThis cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.
yt.innertube::requestsneverThis cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.

IMAGES

  1. Easy Science Experiments and Fun with Marshmallows

    child experiment with marshmallow

  2. Marshmallow Science Experiment for Preschool and Kindergarten

    child experiment with marshmallow

  3. Marshmallow Science Experiment for Preschool and Kindergarten

    child experiment with marshmallow

  4. Marshmallow Science Experiment for Preschool and Kindergarten

    child experiment with marshmallow

  5. Rainbow Marshmallow Science Experiment for Kids

    child experiment with marshmallow

  6. Rainbow Marshmallow Science Experiment for Kids

    child experiment with marshmallow

VIDEO

  1. I have been practicing marshmallows for three years at an early age

  2. Kids Marshmallow Experiment

  3. Exploring the Stanford Marshmallow Experiment: Delayed Gratification and Future Success

  4. The Marshmallow Experiment

  5. They put the marshmallow in the microwave

  6. The Temptation Test! Can Resisting Sweets Lead to Success?

COMMENTS

  1. Stanford marshmallow experiment

    A classic study on delayed gratification in children, conducted by Walter Mischel and others at Stanford University in 1970. The experiment involved offering a child a choice between one small reward or two if they waited for 15 minutes.

  2. Stanford Marshmallow Test Experiment

    Learn how the marshmallow test measures a child's ability to delay gratification and how it relates to later life outcomes. Find out the original and replication studies, the experimental conditions, and the results of the Stanford experiments.

  3. The Marshmallow Experiment and the Power of Delayed Gratification

    The Marshmallow Experiment tested children's ability to wait for a second marshmallow instead of eating one immediately. The study found that those who delayed gratification had higher success in life, but the environment also influenced their choices.

  4. The Marshmallow Test: Delayed Gratification in Children

    The marshmallow test is a classic experiment that measures children's ability to resist an immediate reward for a larger one. Learn how the test was created, what factors influence children's decisions, and how the test relates to future outcomes.

  5. The Stanford Marshmallow Experiment: How Self-Control Affects Success

    The Stanford marshmallow experiment was a study that tested children's ability to delay gratification by offering them a snack and a reward. It found that self-control predicts various positive outcomes later in life, such as academic success and health.

  6. The Marshmallow Test: What Does It Really Measure?

    June 1, 2018. The marshmallow test is one of the most famous pieces of social-science research: Put a marshmallow in front of a child, tell her that she can have a second one if she can go 15 ...

  7. Science Behind Kids and Marshmallows

    by Michaeleen Doucleff, NPR. In the 1960s, a Stanford psychologist ran an experiment to study children's self-control. It's called the marshmallow test. And it's super simple. Kids ages 3 to 5 choose a treat — an Oreo cookie, a pretzel stick or a marshmallow. Then researchers give the child brief instructions: You can eat the treat now, but if you can wait for me to return, you'll get two ...

  8. Does the "Marshmallow Test" Really Predict Success?

    A recent replication of the classic Marshmallow Experiment found that delay of gratification at age 5 did not predict later success or performance, after controlling for other factors. The ...

  9. PDF The Marshmallow Experiment

    Original research done at Stanford in the 1960s and 1970s. Children aged 3 to 4 were placed alone in a room at a table and given one marshmallow. They were told they could eat it now, or wait 15 minutes and be given a second marshmallow. The purpose was to measure the capacity for delayed gratification.

  10. What the Marshmallow Test Really Teaches About Self-Control

    It began in the early 1960s at Stanford University's Bing Nursery School, where Mischel and his graduate students gave children the choice between one reward (like a marshmallow, pretzel, or ...

  11. Revisiting the Marshmallow Test: A Conceptual Replication Investigating

    In a series of studies based on children who attended a preschool on the Stanford University campus, Mischel, Shoda, and colleagues showed that under certain conditions, a child's success in delaying the gratification of eating marshmallows or a similar treat was related to later cognitive and social development, health, and even brain structure (Casey et al., 2011; Mischel et al., 2010 ...

  12. The Marshmallow Test for Grownups

    Learn how the Stanford marshmallow test, a classic experiment in developmental psychology, can help you improve your personal productivity and willpower. Ed Batista, an executive coach and ...

  13. The Bing "Marshmallow Studies": 50 Years of Continuing Research

    Learn how Walter Mischel's groundbreaking research at Bing in the 1960s and 1970s revealed the secrets of self-control and delayed gratification in preschool children. Find out how the Bing studies influenced education, psychology and brain science, and how they are still relevant today.

  14. The Stanford Marshmallow Test

    Learn about the famous experiment that tested self-control and success in children, and how it influenced psychology. Find out the results, the criticisms, and the impact of the Marshmallow Test.

  15. Acing the marshmallow test

    Walter Mischel, PhD, is a psychologist who has studied how children and adults can resist temptation and achieve their goals. He explains his research on the marshmallow test, its implications for education and mental health, and his new book on self-control strategies.

  16. Kids Do Better on the Marshmallow Test When They Cooperate

    A study shows that children can delay gratification longer when they are working together toward a common goal, even if they have a lower chance of getting a second cookie. The findings suggest that cooperation is motivating and beneficial for children's self-control and social development.

  17. The surprising thing the 'marshmallow test' reveals about kids in an

    Pioneered in the 1960s by a young Stanford psychology professor named Walter Mischel, the marshmallow test left a child between the ages of 3 and 5 alone in a room with two identical plates, each ...

  18. The Marshmallow Test Gets More Complicated

    Kidd and her colleagues started their experiment by adding a step before giving their group of 28 three- to five-year-old children the marshmallow test: Similar to the marshmallow test, the ...

  19. The marshmallow test, revisited

    Children who wait longer for a bigger reward in the marshmallow test may care more about what others think of them, a new study suggests. The research shows that 3- and 4-year-olds can make cost-benefit analyses based on social cues and values.

  20. A New Approach to the Marshmallow Test Yields Complicated Findings

    A new study by Watts and colleagues found that children's ability to delay gratification did not predict their academic or life success, unlike the original marshmallow test by Mischel and Shoda. The authors suggest that interventions should target broader cognitive and behavioral capacities rather than delay of gratification alone.

  21. The Marshmallow Experiment

    Watch how children react to the famous Stanford experiment that tests their ability to delay gratification. See the hilarious results of hiding two cameras in the room and rewarding them with ...

  22. PARENTING SCIENCE: The marshmallow test

    What is the marshmallow test and what does it reveal about children's self-control and future success? Watch this video to find out.

  23. PDF Stanford Marshmallow Experiment

    Stanford University's Experiment: 1st Round. Carried out by Dr. Walter Mischel's team of professors (end of the 1960s to the 1970s). It became well known as the Stanford Marshmallow Experiment. Conducted at the Stanford University-affiliated Bing Nursery School with 653 children between the ages of 4 and 6. Research results were published in 1981.