(n = 552)
Note: In the columns for children with degreed and nondegreed mothers, the table reports the proportion of students falling within each delay-of-gratification category; all other values in these columns are means (with standard deviations in parentheses). The sample was split on the basis of mother’s education, and p values were derived from a series of regressions in which each characteristic was regressed on a dummy for whether mother graduated from college and a series of site fixed effects. Beta values represent effect sizes measuring the standardized differences between the two groups.
We adopted several approaches to dealing with this truncation problem, principally exploring possible nonlinearities in the associations between time waited and outcome measures by dividing the distribution of waiting times into discrete intervals. We also focused much of our analyses on the children of mothers who did not complete college, as far fewer of the children in this sample hit the ceiling on the minutes-waited measure, and as explained above, this group of children complements the sample of children included in the Mischel and Shoda studies. But because the subsample of children with college-educated mothers allows for a more direct replication of Mischel and Shoda’s famous work (e.g., Shoda et al., 1990 ), we also present results for them, bearing in mind the limitations imposed by the substantial delay truncation.
Finally, it should also be noted that children in the NICHD study were given only the version of the task that Shoda and colleagues (1990) called the diagnostic condition (i.e., the children were not offered strategies and were able to see the treat as they waited).
Academic achievement was measured using the Woodcock-Johnson Psycho-Educational Battery Revised (WJ-R) test ( Woodcock, McGrew, & Mather, 2001 ), a commonly used measure of cognitive ability and achievement (e.g., Watts, Duncan, Siegler, & Davis-Kean, 2014 ). For math achievement at Grade 1 and age 15, we used the Applied Problems subtest, which measured children’s mathematical problem solving. At Grade 1, reading achievement was measured using the Letter-Word Identification task, a measure of word recognition and vocabulary, and at age 15, reading ability was measured using the Passage Comprehension test. The Passage Comprehension test asked students to read various pieces of text silently and then answer questions about their content.
For all the WJ-R tests, we used the standard scores, which were normed to have a mean of 100 and a standard deviation of 15 in each respective wave. We took the average of the Grade 1 math and reading measures and the age-15 math and reading measures, respectively, to create composite measures of academic achievement.
Following Shoda et al. (1990) , we relied primarily on mothers’ reports of child behavior. Mother-reported internalizing and externalizing behavioral problems were assessed using the Child Behavior Checklist (CBCL; Achenbach, 1991 ) at age 54 months, Grade 1, and age 15. The CBCL is a widely used measure of behavioral problems, and it includes approximately 100 items rated on 3-point scales that capture aspects of internalizing (i.e., depressive) and externalizing (i.e., antisocial) behavior. As with academic achievement, at Grade 1 and age 15, we averaged together the externalizing and internalizing measures to create a behavioral composite score that, before standardization, ranged from 32 to 83, with higher scores indicating higher levels of behavioral problems. We also tested models that used a host of alternative behavioral measures taken from youth reports and direct assessments at age 15; these measures and models are described in the Supplemental Material .
All covariates included in our models are listed in Table 3. We grouped them into two distinct sets of control variables: child background and Home Observation for Measurement of the Environment (HOME) controls, and concurrent 54-month controls.
Descriptive Characteristics of All Control Variables
Variable | Nondegreed mothers: waited 7 min (n = 251) | Nondegreed mothers: did not wait 7 min (n = 301) | Nondegreed mothers: β | Nondegreed mothers: p value for difference | Degreed mothers: waited 7 min (n = 250) | Degreed mothers: did not wait 7 min (n = 116) | Degreed mothers: β | Degreed mothers: p value for difference |
---|---|---|---|---|---|---|---|---|
Child background and HOME controls | ||||||||
Child background | ||||||||
Proportion male | .47 | .51 | −0.04 | .338 | .45 | .50 | −0.05 | .409 |
Proportion White | .82 | .64 | 0.18 | .001 | .94 | .85 | 0.10 | .007 |
Proportion Black | .07 | .24 | −0.15 | .001 | .00 | .05 | −0.05 | .024 |
Proportion Hispanic | .06 | .07 | −0.01 | .545 | .03 | .03 | −0.00 | .962 |
Proportion other race/ethnicity | .04 | .05 | −0.01 | .530 | .03 | .07 | −0.05 | .058 |
Child’s age at delay measure (months) | 56.11 (1.11) | 56.01 (1.14) | 0.13 | .105 | 55.99 (1.13) | 55.99 (1.15) | 0.07 | .519 |
Birth weight (g) | 3490.23 (478.56) | 3449.02 (540.26) | 0.09 | .320 | 3516.63 (520.52) | 3572.53 (527.17) | −0.13 | .268 |
BBCS standard score (36 months) | 9.06 (2.56) | 7.67 (2.86) | 0.47 | .001 | 10.67 (2.20) | 10.14 (2.35) | 0.19 | .043 |
Bayley MDI (24 months) | 93.89 (12.40) | 85.91 (14.40) | 0.53 | .001 | 100.88 (11.78) | 95.21 (14.10) | 0.41 | .001 |
Child temperament (6 months) | 3.18 (0.42) | 3.25 (0.38) | −0.17 | .053 | 3.13 (0.37) | 3.09 (0.43) | 0.07 | .531 |
Log of family income (1–54 months) | 0.89 (0.61) | 0.57 (0.73) | 0.38 | .001 | 1.54 (0.51) | 1.42 (0.56) | 0.14 | .057 |
Mother’s age at birth (years) | 27.75 (5.66) | 26.07 (5.46) | 0.29 | .001 | 31.58 (4.05) | 31.87 (3.91) | −0.06 | .438 |
Mother’s education (years) | 13.00 (1.41) | 12.68 (1.50) | 0.12 | .017 | 17.02 (1.31) | 16.82 (1.26) | 0.07 | .234 |
Mother’s PPVT score | 96.43 (13.38) | 90.47 (17.03) | 0.30 | .001 | 114.10 (15.62) | 105.63 (16.51) | 0.44 | .001 |
HOME score (36 months) | ||||||||
Learning Materials | 7.20 (2.36) | 5.86 (2.51) | 0.53 | .001 | 8.64 (1.59) | 8.41 (2.20) | 0.12 | .168 |
Language Stimulation | 6.13 (1.04) | 5.67 (1.24) | 0.46 | .001 | 6.38 (0.84) | 6.17 (1.13) | 0.21 | .046 |
Physical Environment | 6.16 (1.04) | 5.64 (1.54) | 0.40 | .001 | 6.35 (0.83) | 6.33 (0.91) | 0.07 | .372 |
Responsivity | 5.67 (1.28) | 5.17 (1.52) | 0.31 | .001 | 6.09 (0.99) | 5.81 (1.30) | 0.21 | .033 |
Academic Stimulation | 3.43 (1.21) | 2.97 (1.29) | 0.38 | .001 | 3.74 (0.97) | 3.57 (1.29) | 0.17 | .112 |
Modeling | 3.13 (1.10) | 2.82 (1.14) | 0.29 | .001 | 3.64 (0.93) | 3.51 (1.04) | 0.11 | .285 |
Variety | 6.80 (1.34) | 6.14 (1.50) | 0.45 | .001 | 7.54 (1.17) | 7.29 (1.36) | 0.17 | .088 |
Acceptance | 3.39 (0.85) | 3.22 (1.04) | 0.18 | .038 | 3.70 (0.59) | 3.57 (0.82) | 0.13 | .162 |
Responsivity-Empirical Scale | 5.54 (0.91) | 5.14 (1.29) | 0.37 | .001 | 5.77 (0.52) | 5.55 (0.91) | 0.21 | .026 |
Concurrent 54-month controls | ||||||||
54-month WJ-R score | ||||||||
Letter-Word Identification | 99.03 (11.98) | 93.22 (12.63) | 0.42 | .001 | 105.93 (12.19) | 102.31 (11.94) | 0.26 | .011 |
Applied Problems | 104.80 (12.88) | 95.67 (15.72) | 0.57 | .001 | 112.36 (12.13) | 106.06 (12.31) | 0.40 | .001 |
Picture Vocabulary | 100.54 (13.07) | 93.74 (13.80) | 0.43 | .001 | 109.11 (13.45) | 103.47 (13.58) | 0.36 | .001 |
Memory for Sentences | 93.21 (15.59) | 85.43 (17.67) | 0.43 | .001 | 100.99 (18.73) | 92.34 (17.45) | 0.49 | .001 |
Incomplete Words | 98.08 (12.91) | 92.72 (13.52) | 0.41 | .001 | 102.18 (11.69) | 98.05 (11.98) | 0.35 | .001 |
54-month Child Behavior Checklist | ||||||||
Internalizing | 47.36 (9.11) | 47.94 (8.51) | −0.06 | .477 | 46.55 (8.84) | 46.81 (8.17) | −0.01 | .988 |
Externalizing | 51.14 (9.34) | 53.09 (9.84) | −0.21 | .020 | 50.44 (9.11) | 50.99 (8.53) | −0.06 | .604 |
Note: In the columns for children who did and did not wait 7 min, the table reports proportions for race/ethnicity; all other values in these columns are means (with standard deviations in parentheses). The p value column compares children who successfully completed the task and waited 7 min with children who did not, and the betas represent effect sizes measuring the standardized differences between the two groups. A series of regressions in which each variable was regressed on a dummy indicating whether the child completed the marshmallow test was used to generate p values, and a series of site dummy variables was also included to adjust for site differences ( p s below .001 have been rounded to .001). BBCS = Bracken Basic Concept Scale; HOME = Home Observation for Measurement of the Environment; MDI = Mental Development Index; PPVT = Peabody Picture Vocabulary Test; WJ-R = Woodcock-Johnson Psycho-Educational Battery Revised.
Child demographic characteristics (i.e., gender and race), birth weight, mother’s age at the child’s birth, and mother’s level of education were collected from study mothers at the 1-month interview. Family income was collected from study mothers at the 1-, 6-, 15-, 24-, 36-, and 54-month interviews. We took the average of all nonmissing income data over this span and then log-transformed average family income to restrict the influence of outliers. Mother’s Peabody Picture Vocabulary Test (PPVT) score was assessed in a lab visit when the focal child was 36 months old. The PPVT is a commonly used measure of intelligence (e.g., see meta-analysis by Protzko, 2015).
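The income measure described above can be sketched in a few lines. This is only an illustration: the function name, the use of `None` to mark a missing wave, the natural-log base, and the example values (in arbitrary units) are assumptions, not details taken from the study.

```python
import math

def log_mean_income(incomes):
    """Log of the mean of all nonmissing income reports (None marks a missing wave)."""
    observed = [x for x in incomes if x is not None]
    if not observed:
        return None  # no income data at any wave
    return math.log(sum(observed) / len(observed))

# Hypothetical reports at the 1-, 6-, 15-, 24-, 36-, and 54-month interviews
reports = [2.0, None, 2.5, 3.0, None, 2.5]
value = log_mean_income(reports)  # log of the mean of the four observed reports
```

Averaging before taking the log, as the text describes, means that a single missing wave does not drop the family from the measure.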
We also included early indicators of child cognitive functioning, as measured at age 24 months by the Bayley Mental Development Index (MDI; Bayley, 1991 ) and at age 36 months by the Bracken Basic Concept Scale (BBCS; Bracken, 1984 ). The MDI measured children’s sensory-perceptual abilities, as well as their memory, problem solving, and verbal communication skills. The BBCS was an early measure of school readiness skills, and it required students to identify basic letters and numbers.
Child temperament was measured at age 6 months using the Early Infant Temperament Questionnaire ( Medoff-Cooper, Carey, & McDevitt, 1993 ), a 38-item survey to which mothers responded. This questionnaire asked mothers to rate their child on a 6-point Likert scale, with items focused on the child’s mood, adaptability, and intensity. We took the average score across these items as our measurement of temperament, with higher scores indicating more agreeable dispositions.
Finally, the set of controls measured prior to age 54 months also included indicators of the quality of the home environment, as measured by an observational assessment called the HOME inventory ( Caldwell & Bradley, 1984 ). The HOME was assessed when the focal child was approximately 36 months old, and it was designed to capture aspects of the home environment known to support positive cognitive, emotional, and behavioral functioning. We used nine subscales of the HOME in our models: The first eight subscales are commonly used with the HOME measure (Learning Materials, Language Stimulation, Physical Environment, Responsivity, Academic Stimulation, Modeling, Variety, and Acceptance), and the ninth subscale, called the Responsivity-Empirical Scale, was derived by the NICHD SECCYD study from factor analyses of the HOME items. This final scale was distinct from the traditional Responsivity scale, as it included items from the Language Stimulation scale that also measured mother responsivity and sensitivity to the child.
For models that included controls for concurrent cognitive and behavioral skills, we also included subscales taken from the age 54-month WJ-R test. As our measure of early reading, we included the Letter-Word Identification task, which tested children’s ability to sound out simple words, and the Applied Problems test at age 54 months was our measure of early math skills. For preschool children, the Applied Problems test requires them to count and solve simple addition problems. We also used the Memory for Sentences and Incomplete Words subtests as measures of cognitive ability. The Incomplete Words test measured auditory closure and processing, and children listened to an audio recording where words missing a phoneme were listed. They were then asked to name the complete word. Finally, the Picture Vocabulary test was a measure of verbal comprehension and crystallized intelligence. In this task, children were asked to name pictured objects. All of these tasks have been widely used as measures of children’s early cognitive skills and their measurement properties have been widely reported (e.g., Watts et al., 2014 ).
Finally, we also included the mother’s report of children’s externalizing and internalizing problems from the Child Behavior Checklist at age 54 months. Much like the measure used for age-15 behavioral problems, the 54-month survey included a battery of items designed to assess children’s antisocial and disruptive behavior (i.e., externalizing) and depressive symptoms (i.e., internalizing).
Our primary goal was to estimate the association between early gratification delay and long-run measures of academic achievement and behavioral functioning. Like the work of Shoda and colleagues (1990) , our study did not include a measure of gratification delay in which between-child differences were generated from some exogenous intervention, so we do not claim that the associations we estimated reflect causal impacts. Instead, our goal was to assess how much bias might be contained in longitudinal bivariate correlations between gratification delay and later outcomes as a result of failure to control for characteristics of children and their environments. Regression-adjusted correlations should provide better guidance regarding whether interventions boosting gratification delay might also improve later achievement and behavior.
To accomplish our analytic goals, we modeled later academic achievement and behavior (measured at both Grade 1 and age 15) as a function of a measure of gratification delay at age 54 months. We then tested models that added controls for background characteristics and measures of the home environment before moving to models that also included measures of cognitive and behavioral skills assessed at age 54 months (see Table 3 ).
These two approaches reflect different assumptions regarding how variation in gratification-delay ability might arise. Models with controls measured between birth and age 36 months still allow for variation in age 54-months gratification delay caused by the differential development of general cognitive or behavioral skills (e.g., executive function, self-control) between 36 and 54 months. Put another way, these models contain controls only for factors that even ambitious preschool-child-focused interventions are unlikely to alter (e.g., birth weight, temperament at 6 months of age, early home environment).
In contrast, the models with concurrent-54-months covariates controlled for variation in a range of cognitive capacities and behavioral problems developed by age 54 months. They helped to isolate the possible effects of an intervention that targets only the narrow set of skills involved with gratification delay (e.g., a program that merely provided children with strategies to help them delay longer; see Mischel, 2014 , p. 40) but not concurrent general cognitive ability or socioemotional behaviors.
Although it is impossible to know exactly how individual differences in gratification delay emerge (e.g., changes in parenting, development of cognitive skills), by controlling for factors unlikely to be altered by interventions (e.g., ethnicity, parental background), we can purge our estimates of bias due to observable characteristics that are correlated with gratification delay and later outcomes. If remaining unobserved factors also contribute to gratification delay and later outcomes (e.g., changes in parenting), and if these unobserved factors are unlikely to be altered by a particular intervention, then bias in our estimates may still remain. Yet our estimates should serve as an improvement over the unadjusted correlations reported previously (e.g., Shoda et al., 1990 ).
In all models shown, continuous variables were standardized so that coefficients could be read as effect sizes, and all models with control variables included a set of dummy variables for each site to adjust for any between-site differences. In order to account for missing data on control variables, we used structural equation modeling with full information maximum likelihood in Stata Version 15.0 ( StataCorp, 2017 ) to estimate all analytic models. Finally, we report all estimated p values to the third decimal place (with p values below .001 displayed as < .001), and we describe any estimate corresponding to a p value less than .05 as statistically significant. Though we recognize the arbitrariness of focusing only on results with a p value less than .05, we selected this alpha level because it was the minimum threshold for statistical significance used in the studies we attempted to replicate and extend (i.e., Mischel et al., 1988 ; Mischel et al., 1989 ; Shoda et al., 1990 ). Consequently, any differences between the conclusions of our study and those reported in the previous literature should be attributed to design and sample differences rather than alpha-level choices.
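The two preprocessing steps named above, standardizing continuous variables and dummy-coding site, might look like the following sketch. It does not reproduce the full information maximum likelihood estimation, which was run in Stata; the function names, the choice of the sample standard deviation, and the choice of reference site are all assumptions made for illustration.

```python
from statistics import mean, stdev

def zscore(values):
    """Standardize a continuous variable to mean 0 and SD 1 within the sample."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def site_dummies(sites):
    """One indicator column per site, omitting the first (reference) site."""
    levels = sorted(set(sites))[1:]
    return [[1 if s == level else 0 for level in levels] for s in sites]

# Hypothetical data: three raw scores and four site labels
z = zscore([1, 2, 3])
dummies = site_dummies(["A", "B", "C", "A"])
```

With both the outcome and the delay measure standardized, a regression coefficient can be read as the expected change in outcome SDs per SD of the predictor.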
Table 2 provides descriptive results for key analysis variables, including the 54-months delay-of-gratification measure, split by mother’s education level. In the sample of children with nondegreed mothers, children waited an average of 3.99 min ( SD = 3.08) before ending the task. We also present the proportion of children falling within certain ranges on the measure, with the 7-min category representing children who successfully completed the trial. In the lower-SES sample, 45% of children waited the maximum of 7 min, and 23% waited less than 20 s (i.e., 0.33 min). In the higher-SES sample, only 10% of children waited less than 20 s, and the average time waited was 5.38 min (statistically significantly longer than the lower-SES group, p < .001).
Because the 7-min ceiling presented a potential analytic challenge for both samples, we estimated models that substituted the four dummy categories shown in Table 2 for the continuous minutes-waited variable as a way to assess nonlinearities in the relationship between delay time and academic and socioemotional outcomes. Importantly, these models also provide information on how much our analysis might be compromised by the 7-min truncation.
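A minimal sketch of the categorical recoding, using the four intervals from Table 2. The function names and the handling of values that fall exactly on a boundary (for example, exactly 2 min) are assumptions; the source does not say which side such cases were assigned to.

```python
CATEGORIES = ["< 0.333 min", "0.333-2 min", "2-7 min", "7 min"]

def delay_category(minutes, ceiling=7.0):
    """Assign a wait time (in minutes) to one of the four delay categories."""
    if minutes >= ceiling:
        return "7 min"          # child completed the task
    if minutes >= 2.0:
        return "2-7 min"
    if minutes >= 20 / 60:      # 20 s expressed in minutes
        return "0.333-2 min"
    return "< 0.333 min"        # reference category

def delay_dummies(minutes):
    """Three 0/1 indicators for the non-reference categories."""
    cat = delay_category(minutes)
    return [1 if cat == c else 0 for c in CATEGORIES[1:]]
```

Because each category receives its own coefficient, the model does not force the association to be linear, and the 7-min group can be compared with the other groups without assuming how long those children would have waited past the ceiling.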
Table 3 presents descriptive information for the various control measures used in the analysis, and means are presented separately for children who did and did not complete the delay task. In both the higher- and lower-SES samples, performance on the delay-of-gratification task was highly correlated with differences on most observable characteristics considered. For example, for children from nondegreed mothers, those who completed the delay-of-gratification task were from higher income families ( p < .001) than noncompleters, had mothers with higher PPVT scores ( p < .001), and had higher scores on dimensions of the HOME observational assessment ( p s = .04 to < .001). Null or smaller differences were generally observed for the children of degreed mothers, perhaps owing to the lack of heterogeneity in this subsample.
Results for children of nondegreed mothers.
Table 4 presents coefficients and standard errors from models that estimated the association between delay of gratification at 54 months and our Grade 1 and age-15 achievement and behavioral composites for the sample of children of nondegreed mothers. The first row of Table 4 displays the results for a standardized continuous measure of gratification delay (i.e., the number of minutes waited during the marshmallow test). As Column 1 reflects, the bivariate association between minutes waited and Grade 1 academic achievement was 0.28 (SE = 0.04, p < .001), considerably less than the .57 correlation Shoda and colleagues found for SAT math scores and the .42 correlation they found for verbal scores. These linear results suggest that children’s Grade 1 achievement would improve by approximately one tenth of a standard deviation for every additional minute waited at age 4. When the controls measured prior to age 54 months (Column 2 of Table 4) were added to the model, the standardized association fell to 0.10 (SE = 0.03, p = .002), and when concurrent 54-months controls were added (Column 3 of Table 4), the association fell to a statistically nonsignificant 0.05 (SE = 0.03, p = .114).
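The per-minute figure can be recovered from the standardized coefficient with simple arithmetic. The sketch below assumes the coefficient is expressed per standard deviation of minutes waited and uses the SD of 3.08 min reported for children of nondegreed mothers; the function name is hypothetical.

```python
def per_minute_effect(beta_std, sd_minutes):
    """Convert a coefficient per SD of minutes waited into a coefficient per minute."""
    return beta_std / sd_minutes

# 0.279 is the Column 1 coefficient in Table 4; 3.08 is the reported SD of minutes waited
effect = per_minute_effect(0.279, 3.08)
```

Under these assumptions the result is roughly 0.09 SD of achievement per additional minute, consistent with the "approximately one tenth" stated in the text.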
Associations Between Delay of Gratification at Age 54 Months and Later Measures of Academic Achievement and Behavior for Children of Mothers Without College Degrees
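Variable | Achievement composite, Grade 1 (1) | Achievement composite, Grade 1 (2) | Achievement composite, Grade 1 (3) | Achievement composite, age 15 (4) | Achievement composite, age 15 (5) | Achievement composite, age 15 (6) | Behavior composite, Grade 1 (7) | Behavior composite, Grade 1 (8) | Behavior composite, Grade 1 (9) | Behavior composite, age 15 (10) | Behavior composite, age 15 (11) | Behavior composite, age 15 (12) |
---|---|---|---|---|---|---|---|---|---|---|---|---|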
Delay minutes (continuous) | 0.279 (0.038) | 0.102 (0.033) | 0.047 (0.030) | 0.236 (0.037) | 0.081 (0.034) | 0.050 (0.032) | −0.060 (0.043) | −0.015 (0.044) | 0.023 (0.044) | −0.062 (0.046) | −0.026 (0.047) | 0.003 (0.042) |
Delay minutes (categorical) | ||||||||||||
< 0.333 min | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref |
0.333–2 min | 0.298 (0.126) | 0.189 (0.105) | 0.127 (0.093) | 0.353 (0.122) | 0.230 (0.103) | 0.178 (0.098) | 0.055 (0.144) | 0.090 (0.138) | 0.079 (0.105) | −0.140 (0.152) | −0.071 (0.148) | −0.106 (0.132) |
2–7 min | 0.424 (0.126) | 0.206 (0.104) | 0.041 (0.093) | 0.457 (0.123) | 0.300 (0.103) | 0.235 (0.099) | −0.088 (0.144) | −0.020 (0.137) | 0.039 (0.106) | −0.182 (0.151) | −0.109 (0.145) | −0.053 (0.131) |
7 min | 0.720 (0.098) | 0.284 (0.086) | 0.141 (0.078) | 0.646 (0.098) | 0.234 (0.088) | 0.150 (0.084) | −0.121 (0.112) | −0.007 (0.114) | 0.072 (0.087) | −0.193 (0.120) | −0.095 (0.123) | −0.048 (0.111) |
p value for test of equality of all categories | .001 | .012 | .247 | .001 | .015 | .093 | .477 | .866 | .837 | .428 | .861 | .885 |
p value for test of equality of second, third, and fourth categories | .001 | .563 | .475 | .015 | .752 | .630 | .382 | .700 | .923 | .927 | .969 | .882 |
Control variables included | ||||||||||||
Child background and HOME | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
Concurrent 54 month | No | No | Yes | No | No | Yes | No | No | Yes | No | No | Yes |
Note: n = 552. For the continuous and categorical measures of delay minutes, the table gives standardized coefficients (with standard errors in parentheses). For the categorical measure, < 0.333 min was the reference category. Because outcome variables were standardized, coefficients can be interpreted as effect sizes. Estimates shown in the first column of each set (i.e., Columns 1, 4, 7, and 10) contained only the measure of delay of gratification and a given outcome measure. Estimates shown in the second column of each set (i.e., Columns 2, 5, 8, and 11) added child background characteristics, Home Observation for Measurement of the Environment (HOME) scores, and site dummy variables. Estimates shown in the third column of each set (i.e., Columns 3, 6, 9, and 12) added other behavioral and cognitive measures also measured at age 54 months. Post hoc chi-square tests were used to generate p values in order to assess whether respective sets of variables were different from one another ( p s below .001 have been rounded to .001).
Columns 4 through 6 show analogous models for the measure of achievement at age 15. The magnitudes of the age-15 correlations were remarkably similar to the Grade 1 correlations. The age-15 achievement correlation in the absence of other controls was of moderate size and statistically significant, β = 0.24, SE = 0.04, p < .001; but fell substantially when controls for earlier characteristics were added, β = 0.08, SE = 0.03, p = .016; and became nonsignificant when 54-months controls were added, β = 0.05, SE = 0.03, p = .140. Given that Shoda and colleagues found almost as strong correlations with later behavior as with later achievement, we were surprised to find virtually no relationship—even in the absence of controls—between delay of gratification and the composite score of mother-reported internalizing and externalizing at either Grade 1 or age 15 (right half of Table 4 ).
Children who waited less than 20 s (i.e., the lowest category) served as the comparison group for our models that represented delay times in a set of dummy variables (see Table 2 for the proportion of students in each category). As shown in Table 4 , models of outcomes at both Grade 1 and age 15 that lack control variables show a strong gradient between gratification delay and later achievement. Relative to children who waited less than 20 s, children who waited between 20 s and 2 min scored about one third of a standard deviation higher on the achievement measure at Grade 1 and age 15, and this difference grew to nearly three fourths of a standard deviation for the group that waited the entire 7 min. The entry for Model 1 in the row labeled “ p value for test of equality of second, third, and fourth categories” shows that the coefficients produced by the three groups of children who waited longer than 20 s differed significantly from one another ( p < .001), as did coefficient differences across all four categorical variables (the p value that is shown in the row labeled “ p value for test of equality of all categories”).
At both Grade 1 and age 15, when controls for early child and family characteristics were added to the model (Column 2 for Grade 1; Column 5 for age 15), the coefficients estimated for all three delay-time groups fell by roughly 50%. Surprisingly, the addition of the background controls also flattened out the gradient of the prediction across the gratification-delay distribution. Relative to the less-than-20-s reference group, achievement differences for children who waited more than 20 s but not the full 7 min were strikingly similar to the difference for children who waited the full 7 min. At age 15, the threshold nature of the relationship was most apparent; the coefficients produced by the three groups that waited longer than 20 s all fell between 0.23 and 0.30, and were not close to being statistically significantly different from one another ( p = .752).
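To make the "roughly 50%" figure concrete, the sketch below computes the proportional drop in each Grade 1 categorical coefficient between Column 1 (no controls) and Column 2 (early controls) of Table 4. The coefficients are copied from the table; the function and variable names are hypothetical.

```python
def percent_reduction(before, after):
    """Percentage by which a coefficient shrinks after controls are added."""
    return 100 * (before - after) / before

# Grade 1 achievement coefficients from Table 4
no_controls = {"0.333-2 min": 0.298, "2-7 min": 0.424, "7 min": 0.720}     # Column 1
early_controls = {"0.333-2 min": 0.189, "2-7 min": 0.206, "7 min": 0.284}  # Column 2

drops = {k: percent_reduction(no_controls[k], early_controls[k]) for k in no_controls}
average_drop = sum(drops.values()) / len(drops)
```

The individual reductions range from about 37% to about 61%, and their average is close to 50%. The largest drop is for the 7-min group, which is what flattens the gradient across the three longer-delay categories.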
When concurrent 54-months controls were added, coefficients fell even further. At age 15, only the coefficient produced by the group describing children who waited 2 to 7 min retained statistical significance (β = 0.24, SE = 0.10, p = .018), though once again the set of coefficients on the included categories of delay time did not differ from one another ( p = .630). As with the models shown for delay minutes in the achievement-composite columns in Table 4 , we found no statistically significant relationships between gratification delay and the first-grade and age-15 behavioral composites.
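Readers who want to check a reported p value against a tabled coefficient and standard error can use a two-sided z test, as in this sketch. It assumes a standard normal reference distribution, and because the tabled values are rounded, the result may differ slightly from the p values printed in the text.

```python
import math

def two_sided_p(beta, se):
    """Two-sided p value for beta/se under a standard normal reference distribution."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2))

# 2-7 min group, age-15 achievement, Column 6 of Table 4
p = two_sided_p(0.235, 0.099)
```

For this coefficient the computed value is close to the reported p = .018.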
In our focal case of age-15 achievement, the return for delaying gratification appeared to be driven by differences between children who managed to wait at least 20 s and those who did not. Figure 1 illustrates this threshold effect with three lines showing the coefficients produced by our delay-of-gratification categories in the age-15 achievement models (i.e., the “Delay minutes (categorical)” section of Table 4 ). The solid line shows coefficients drawn from the no-control model (i.e., Column 4 of Table 4 ), the dashed line shows coefficients from the model with early controls (i.e., Column 5 of Table 4 ), and the dotted line shows coefficients produced by models with the 54-months controls (i.e., Column 6 of Table 4 ).
Fig. 1. Predicted achievement score by minutes of delay for children of mothers with no college degree. Error bars represent 95% confidence intervals. Values are shown separately for each of the four delay-of-gratification groups (< 0.333 min, 0.333–2 min, 2–7 min, 7 min); the y-axis shows the deviation in achievement composite scores from the reference group (delay < 0.333 min), plotted against the within-group average amount of time waited on the x-axis. The average wait times for the models with no controls and with child background and Home Observation for Measurement of the Environment (HOME) controls only are displaced by ±.025 to distinguish the sets of error bars. The high-delay group’s coefficients are plotted at 7 min, although the 7-min truncation prevents us from knowing what the mean value of minutes waited would have been for this group in the absence of this limit.
The uncontrolled line has a steep initial jump, followed by a more gradual increase for wait times longer than 20 s. Both lines for the models with controls decrease after 4 min. Anchoring the 7-min group at 7 min probably underestimates that group’s true average wait time, but it is clear from the downward trajectory that no assumptions about the distribution of wait times above 7 min would produce a strong positive slope for the last segment of the line. Thus, in the case of children with mothers who lack college degrees, the truncation of delay time at 7 min does not affect the conclusion that children with the highest delay times show achievement levels at age 15 similar to those of other children who are able to delay for at least 20 s.
In Table 5 , we present key results for children of mothers with college degrees. As in Table 4 , we again present results for the continuous measure of delay of gratification and the categorical measures split along parts of the delay-of-gratification distribution. For the continuous measure, we again found evidence of positive unadjusted associations between delay of gratification and later achievement at both Grade 1 (β = 0.18, SE = 0.06, p = .001) and age 15 (β = 0.17, SE = 0.06, p = .007), and the categorical results suggested that much of this association was somewhat linear through the distribution. For the age-15 models, these relations became statistically indistinguishable from zero once controls were added, and the point estimate for the more-than-7-min category was surprisingly small and negative (β = −0.04, SE = 0.15, p = .816). As with the models shown in Table 4 , we again found no evidence of associations between delay of gratification and the behavioral measures at first grade or age 15 in the high-SES sample.
Associations Between Delay of Gratification at Age 54 Months and Later Measures of Academic Achievement and Behavior for Children of Mothers With College Degrees
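Variable | Achievement composite, Grade 1 (1) | Achievement composite, Grade 1 (2) | Achievement composite, Grade 1 (3) | Achievement composite, age 15 (4) | Achievement composite, age 15 (5) | Achievement composite, age 15 (6) | Behavior composite, Grade 1 (7) | Behavior composite, Grade 1 (8) | Behavior composite, Grade 1 (9) | Behavior composite, age 15 (10) | Behavior composite, age 15 (11) | Behavior composite, age 15 (12) |
---|---|---|---|---|---|---|---|---|---|---|---|---|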
Delay minutes (continuous) | 0.178 (0.056) | 0.120 (0.053) | 0.048 (0.045) | 0.167 (0.062) | 0.062 (0.059) | 0.007 (0.054) | −0.049 (0.057) | −0.059 (0.061) | −0.050 (0.046) | 0.031 (0.059) | 0.038 (0.063) | 0.043 (0.055) |
Delay minutes (categorical) | ||||||||||||
< 0.333 min | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref |
0.333–2 min | 0.327 (0.220) | 0.039 (0.198) | 0.148 (0.168) | 0.079 (0.245) | −0.131 (0.216) | −0.085 (0.197) | −0.069 (0.227) | −0.088 (0.228) | −0.184 (0.173) | −0.065 (0.231) | 0.027 (0.232) | −0.083 (0.200) |
2–7 min | 0.397 (0.206) | 0.147 (0.184) | 0.134 (0.155) | 0.216 (0.227) | 0.028 (0.199) | −0.032 (0.182) | −0.277 (0.210) | −0.240 (0.209) | −0.265 (0.157) | −0.318 (0.218) | −0.217 (0.216) | −0.227 (0.185) |
7 min | 0.562 (0.166) | 0.301 (0.154) | 0.193 (0.131) | 0.404 (0.183) | 0.077 (0.166) | −0.036 (0.152) | −0.194 (0.168) | −0.208 (0.174) | −0.214 (0.131) | −0.007 (0.174) | 0.068 (0.180) | 0.052 (0.155) |
p value for test of equality of all categories | .005 | .100 | .521 | .059 | .674 | .979 | .515 | .584 | .350 | .267 | .367 | .227 |
p value for test of equality of second, third, and fourth categories | .238 | .153 | .843 | .149 | .477 | .948 | .629 | .753 | .867 | .147 | .206 | .115 |
Control variables included | ||||||||||||
Child background and HOME | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
Concurrent 54 month | No | No | Yes | No | No | Yes | No | No | Yes | No | No | Yes |
Note: n = 366. For the continuous and categorical measures of delay minutes, the table gives standardized coefficients (with standard errors in parentheses). For the categorical measure, < 0.333 min was the reference category. Because outcome variables were standardized, coefficients can be interpreted as effect sizes. Estimates shown in the first column of each set (i.e., Columns 1, 4, 7, and 10) contained only the measure of delay of gratification and a given outcome measure. Estimates shown in the second column of each set (i.e., Columns 2, 5, 8, and 11) added child background characteristics, Home Observation for Measurement of the Environment (HOME) scores, and site dummy variables. Estimates shown in the third column of each set (i.e., Columns 3, 6, 9, and 12) added other behavioral and cognitive measures also measured at age 54 months. Post hoc chi-square tests were used to generate p values in order to assess whether respective sets of variables were different from one another ( p s below .001 have been rounded to .001). Estimates in this table can be directly compared with estimates from Table 4 . The sample was limited to children whose mothers had completed at least 16 years of education (i.e., completed college).
Despite statistically nonsignificant results, point estimates were sometimes positive and substantial (e.g., the 2–7 min group coefficient shown in Column 1; β = 0.40, SE = 0.21, p = .054), but the standard errors were nearly double those estimated for children of nondegreed mothers (Table 4). This imprecision is due in part to the somewhat smaller sample size for the higher-SES sample but also to the lack of variation in the delay-of-gratification measure for this sample. Thus, although we found even less evidence of associations between delay of gratification and measures of later achievement when considering only the children of mothers with college degrees, it is difficult to draw strong conclusions from these models given the imprecise nature of their coefficient estimates.
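As a rough check on how a coefficient and its standard error translate into a p value, the sketch below uses a normal approximation to the two-sided test. This is an illustration only: the reported p values come from the authors' regression output, which would use the exact reference distribution and unrounded estimates, and the function name is hypothetical.

```python
import math


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided p value for H0: beta = 0 under a normal approximation.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


# 2-7 min coefficient from Column 1 of Table 5 (three-decimal table values).
p_value = two_sided_p(0.397, 0.206)
print(f"approximate p = {p_value:.3f}")
```

With the three-decimal table values, the approximation lands close to the reported p = .054, which shows that the estimate sits just short of the conventional .05 threshold mainly because its standard error is large.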
Heterogeneity.
Because we found little evidence supporting associations between early delay ability and later outcomes for the higher-SES sample, we next tested whether the different pattern of results observed between the higher- and lower-SES samples constituted a statistically significant difference. In Table 6, we present models that included interaction terms between the various measures of delay of gratification (i.e., the continuous and categorical measures) and the indicator for whether the participant’s mother completed college. None of the interactions tested were statistically significant, and our series of joint F tests indicated that the set of interactions for the categorical measures of delay of gratification did not statistically significantly contribute to any of the models (ps = .342–.968). However, as with the models that were run solely on the sample of children with college-educated mothers, standard errors were quite large for the interaction terms, indicating a substantial level of statistical imprecision. Unfortunately, the wide confidence intervals on many of the interaction terms render it impossible to provide a definitive answer to whether the relation between early delay ability and later achievement differs by SES.
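The width of these confidence intervals can be illustrated with a simple Wald interval, estimate ± 1.96 × SE. The sketch below applies it to the High SES × 7 min interaction for Grade 1 achievement without controls (Column 1 of Table 6). The helper name and the use of the normal critical value 1.96 are assumptions for illustration; the intervals are not reported in the article itself.

```python
def wald_ci(beta: float, se: float, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval: beta +/- z_crit * se."""
    half_width = z_crit * se
    return beta - half_width, beta + half_width


# High SES x 7 min interaction, Column 1 of Table 6.
low, high = wald_ci(-0.159, 0.192)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

The resulting interval runs from roughly −0.54 to 0.22 standard deviations, so it is compatible both with no SES difference and with a sizable one, which is the imprecision the paragraph above describes.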
Table 6. Associations Between Delay of Gratification at Age 54 Months and Later Measures of Academic Achievement With Interactions Between Delay of Gratification and Socioeconomic Status
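Variable | Achievement composite, Grade 1 (1) | Achievement composite, Grade 1 (2) | Achievement composite, Grade 1 (3) | Achievement composite, age 15 (4) | Achievement composite, age 15 (5) | Achievement composite, age 15 (6) | Behavior composite, Grade 1 (7) | Behavior composite, Grade 1 (8) | Behavior composite, Grade 1 (9) | Behavior composite, age 15 (10) | Behavior composite, age 15 (11) | Behavior composite, age 15 (12) |
---|---|---|---|---|---|---|---|---|---|---|---|---|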
Delay minutes (continuous) | 0.279 (0.038) | 0.115 (0.035) | 0.050 (0.030) | 0.236 (0.040) | 0.083 (0.037) | 0.040 (0.034) | −0.059 (0.042) | −0.019 (0.043) | 0.012 (0.033) | −0.062 (0.044) | −0.023 (0.046) | 0.009 (0.040) |
High-SES indicator | 0.509 (0.064) | 0.050 (0.068) | 0.032 (0.059) | 0.747 (0.067) | 0.270 (0.071) | 0.266 (0.066) | −0.187 (0.070) | 0.026 (0.084) | 0.031 (0.064) | −0.286 (0.074) | −0.119 (0.088) | −0.127 (0.077) |
Interaction | −0.101 (0.067) | −0.043 (0.058) | −0.035 (0.050) | −0.069 (0.069) | −0.007 (0.061) | −0.018 (0.057) | 0.010 (0.073) | −0.038 (0.071) | −0.058 (0.054) | 0.094 (0.076) | 0.040 (0.075) | 0.017 (0.066) |
Delay minutes (categorical) | ||||||||||||
< 0.333 min | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref | ref |
0.333–2 min | 0.298 (0.127) | 0.182 (0.110) | 0.109 (0.096) | 0.353 (0.131) | 0.202 (0.115) | 0.151 (0.107) | 0.055 (0.140) | 0.060 (0.137) | 0.050 (0.104) | −0.140 (0.148) | −0.082 (0.145) | −0.097 (0.127) |
2–7 min | 0.424 (0.127) | 0.215 (0.110) | 0.053 (0.097) | 0.457 (0.132) | 0.288 (0.115) | 0.199 (0.108) | −0.088 (0.140) | −0.046 (0.137) | 0.006 (0.105) | −0.182 (0.146) | −0.103 (0.143) | −0.024 (0.126) |
7 min | 0.721 (0.099) | 0.308 (0.090) | 0.147 (0.079) | 0.646 (0.105) | 0.222 (0.097) | 0.121 (0.091) | −0.121 (0.109) | −0.025 (0.112) | 0.034 (0.086) | −0.193 (0.116) | −0.087 (0.120) | −0.028 (0.106) |
High-SES indicator | 0.585 (0.174) | 0.154 (0.156) | 0.041 (0.136) | 0.951 (0.178) | 0.428 (0.163) | 0.417 (0.151) | −0.097 (0.187) | 0.163 (0.190) | 0.191 (0.144) | −0.375 (0.195) | −0.185 (0.199) | −0.138 (0.174) |
Interactions | ||||||||||||
High SES × 0.333–2 min | 0.029 (0.252) | −0.164 (0.218) | 0.032 (0.190) | −0.274 (0.259) | −0.337 (0.226) | −0.266 (0.210) | −0.124 (0.275) | −0.127 (0.269) | −0.160 (0.205) | 0.075 (0.284) | 0.119 (0.276) | 0.035 (0.243) |
High SES × 2–7 min | −0.027 (0.240) | −0.138 (0.206) | 0.010 (0.179) | −0.241 (0.246) | −0.293 (0.213) | −0.258 (0.198) | −0.188 (0.260) | −0.185 (0.252) | −0.199 (0.192) | −0.136 (0.272) | −0.090 (0.261) | −0.156 (0.229) |
High SES × 7 min | −0.159 (0.192) | −0.119 (0.165) | −0.033 (0.144) | −0.242 (0.197) | −0.119 (0.173) | −0.134 (0.161) | −0.073 (0.207) | −0.167 (0.201) | −0.203 (0.153) | 0.186 (0.217) | 0.115 (0.212) | 0.049 (0.186) |
p value from interaction-term joint test | .668 | .870 | .968 | .640 | .342 | .507 | .899 | .859 | .610 | .450 | .753 | .720 |
Control variables included | ||||||||||||
Child background and HOME | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
Concurrent 54 month | No | No | Yes | No | No | Yes | No | No | Yes | No | No | Yes |
Note: n = 918. For the continuous and categorical measures of delay minutes, the table gives standardized coefficients (with standard errors in parentheses). For the categorical measure, < 0.333 min was the reference category. Because continuous variables were standardized, coefficients can be interpreted as effect sizes. Estimates shown in the first column of each set (i.e., Columns 1, 4, 7, and 10) contained only the measure of delay of gratification and a given outcome measure. Estimates shown in the second column of each set (i.e., Columns 2, 5, 8, and 11) added child background characteristics, Home Observation for Measurement of the Environment (HOME) scores, and site dummy variables. Estimates shown in the third column of each set (i.e., Columns 3, 6, 9, and 12) added other behavioral and cognitive measures also measured at age 54 months. Post hoc chi-square tests were used to generate p values in order to assess whether respective sets of variables were different from one another ( p s below .001 have been rounded to .001). The joint F test evaluated whether the set of interaction terms jointly contributed to the model. In other words, it tested whether the set of interactions were statistically significantly different from zero.
In Table 7, we present correlations between the marshmallow test and all analysis variables for the full sample of children considered in our analyses (n = 918; see the Supplemental Material for separate correlation matrices for the lower-SES and higher-SES samples). In Table 7, we also included the 54-month measure of the Continuous Performance Task (CPT; Barkley, 1994), which is a commonly used indicator of attention and impulsivity, and we included the Duckworth et al. (2013) parent- and teacher-report index of 54-month self-control (see the Supplemental Material for measurement details). We included these additional measures to further investigate how the marshmallow test might relate to theoretically relevant constructs (see Diamond & Lee, 2011). Surprisingly, the marshmallow test had the strongest correlation with the Applied Problems subtest of the WJ-R, r(916) = .37, p < .001; and correlations with measures of attention, impulsivity, and self-control were lower in magnitude (|r|s = .22–.30, p < .001). Although these correlational results were far from conclusive, they suggest that the marshmallow test should not be thought of as a mere behavioral proxy for self-control, as the measure clearly relates strongly to basic measures of cognitive capacity.
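For readers who want to see why a correlation of .37 with 916 degrees of freedom is significant well beyond the .001 level, the sketch below converts r to a t statistic with the standard formula t = r · sqrt(df / (1 − r²)). The function names are hypothetical, and the p value relies on a normal approximation, which is reasonable only because df is large here.

```python
import math


def r_to_t(r: float, df: int) -> float:
    """t statistic for testing H0: rho = 0, where df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


def approx_two_sided_p(t: float) -> float:
    """Normal approximation to the two-sided p value (adequate for large df)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Delay of gratification (continuous) with Applied Problems: r(916) = .37
t_stat = r_to_t(0.37, 916)
print(f"t = {t_stat:.2f}, p < .001: {approx_two_sided_p(t_stat) < 0.001}")
```

The same calculation applied to the smallest of the related-measure correlations (r = .22) still gives a t statistic far above conventional thresholds, consistent with the reported p < .001.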
Table 7. Correlations Between All Analysis Variables
Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Gratification delay (54 months) | |||||||||||||||||||||||||||||||||||||||||||||||||
1. Continuous | — | ||||||||||||||||||||||||||||||||||||||||||||||||
2. < 0.333 min | −.69 | — | |||||||||||||||||||||||||||||||||||||||||||||||
3. 0.333–2 min | −.47 | −.18 | — | ||||||||||||||||||||||||||||||||||||||||||||||
4. 2–7 min | −.07 | −.19 | −.16 | — | |||||||||||||||||||||||||||||||||||||||||||||
5. 7 min | .90 | −.51 | −.43 | −.45 | — | ||||||||||||||||||||||||||||||||||||||||||||
Related measures | |||||||||||||||||||||||||||||||||||||||||||||||||
6. Self–control (54 months) | .24 | −.15 | −.15 | −.03 | .24 | — | |||||||||||||||||||||||||||||||||||||||||||
7. Attention (54 months) | .22 | −.18 | −.07 | −.08 | .24 | .15 | — | ||||||||||||||||||||||||||||||||||||||||||
8. Impulsivity (54 months) | −.30 | .26 | .06 | .05 | −.28 | −.28 | −.26 | — | |||||||||||||||||||||||||||||||||||||||||
Outcome measures | |||||||||||||||||||||||||||||||||||||||||||||||||
9. Achievement (Grade 1) | .31 | −.26 | −.08 | −.03 | .28 | .33 | .30 | −.27 | — | ||||||||||||||||||||||||||||||||||||||||
10. Achievement (age 15) | .30 | −.25 | −.09 | −.02 | .27 | .32 | .20 | −.23 | .64 | — | |||||||||||||||||||||||||||||||||||||||
11. Behavior (Grade 1) | −.08 | .06 | .05 | −.02 | −.07 | −.30 | −.08 | .05 | −.09 | −.11 | — | ||||||||||||||||||||||||||||||||||||||
12. Behavior (age 15) | −.06 | .08 | .01 | −.04 | −.04 | −.23 | −.06 | .06 | −.11 | −.13 | .55 | — | |||||||||||||||||||||||||||||||||||||
Background controls | |||||||||||||||||||||||||||||||||||||||||||||||||
13. Male | −.05 | .06 | −.02 | .02 | −.05 | −.20 | −.01 | .23 | −.01 | .05 | −.00 | −.04 | — | ||||||||||||||||||||||||||||||||||||
14. Black | −.25 | .21 | .07 | .05 | −.24 | −.16 | −.12 | .20 | −.29 | −.33 | .06 | .00 | −.00 | — | |||||||||||||||||||||||||||||||||||
15. Hispanic | −.03 | −.00 | .06 | −.02 | −.02 | −.04 | −.02 | .03 | −.05 | −.03 | .01 | .04 | .03 | −.08 | — | ||||||||||||||||||||||||||||||||||
16. Other | −.04 | .00 | .03 | .04 | −.05 | −.00 | .02 | −.02 | .02 | .01 | −.01 | −.01 | −.03 | −.07 | −.05 | — | |||||||||||||||||||||||||||||||||
17. Age | .03 | −.04 | .03 | −.02 | .02 | .03 | .06 | −.02 | .04 | −.05 | .03 | .04 | −.00 | .03 | .01 | −.04 | — | ||||||||||||||||||||||||||||||||
18. Log of income | .30 | −.26 | −.08 | −.03 | .27 | .26 | .19 | −.19 | .37 | .40 | −.16 | −.17 | −.02 | −.36 | −.08 | −.01 | −.01 | — | |||||||||||||||||||||||||||||||
19. Mother’s age | .20 | −.18 | −.05 | −.00 | .18 | .18 | .12 | −.14 | .22 | .32 | −.18 | −.21 | −.04 | −.28 | −.10 | −.04 | −.04 | .54 | — | ||||||||||||||||||||||||||||||
20. Mother’s education (years) | .25 | −.19 | −.09 | −.04 | .24 | .27 | .16 | −.20 | .35 | .42 | −.13 | −.17 | −.04 | −.22 | −.11 | −.03 | −.00 | .61 | .52 | — | |||||||||||||||||||||||||||||
21. Mother’s PPVT score | .28 | −.22 | −.09 | −.08 | .28 | .29 | .12 | −.18 | .35 | .48 | −.10 | −.07 | −.01 | −.37 | −.11 | −.09 | −.03 | .49 | .46 | .57 | — | ||||||||||||||||||||||||||||
22. Site 1 | −.04 | .02 | .00 | .06 | −.06 | −.06 | .06 | −.02 | .03 | −.14 | .09 | .07 | −.00 | .11 | −.07 | −.07 | −.03 | −.09 | −.09 | −.06 | −.10 | — | |||||||||||||||||||||||||||
23. Site 2 | .00 | −.06 | .05 | .01 | .00 | .04 | .03 | −.03 | .06 | .10 | −.06 | −.07 | .00 | −.11 | .23 | −.01 | −.18 | .16 | .06 | .02 | .04 | −.10 | — | ||||||||||||||||||||||||||
24. Site 3 | .07 | −.05 | −.03 | −.02 | .07 | −.04 | .02 | −.09 | −.04 | −.08 | .04 | .02 | −.02 | −.02 | .08 | −.03 | .10 | −.07 | −.05 | .02 | −.02 | −.10 | −.11 | — | |||||||||||||||||||||||||
25. Site 4 | −.00 | .02 | −.01 | −.01 | −.00 | .02 | .04 | .09 | −.02 | .05 | −.07 | −.04 | .02 | −.05 | −.04 | .04 | .00 | .05 | .07 | −.04 | −.02 | −.10 | −.11 | −.11 | — | ||||||||||||||||||||||||
26. Site 5 | −.06 | .02 | .06 | −.00 | −.06 | .02 | .03 | .01 | −.02 | −.05 | −.02 | −.02 | .02 | .13 | −.06 | −.02 | .22 | −.05 | .03 | .02 | −.01 | −.11 | −.11 | −.11 | −.11 | — | |||||||||||||||||||||||
27. Site 6 | .03 | −.01 | −.04 | −.01 | .04 | .06 | .04 | −.03 | .04 | .09 | −.07 | −.08 | −.02 | .11 | −.06 | −.01 | −.10 | .12 | .09 | .13 | .10 | −.10 | −.10 | −.11 | −.10 | −.11 | — | ||||||||||||||||||||||
28. Site 7 | −.05 | .04 | .00 | .01 | −.04 | −.02 | −.10 | .12 | −.05 | .02 | −.02 | −.03 | .02 | .01 | .00 | .02 | .14 | −.09 | −.07 | −.08 | −.01 | −.11 | −.11 | −.11 | −.11 | −.12 | −.11 | — | |||||||||||||||||||||
29. Site 8 | .06 | .00 | −.05 | −.09 | .09 | .10 | −.00 | −.08 | −.01 | .05 | −.02 | .05 | −.00 | −.05 | .00 | .11 | −.19 | .08 | .10 | .10 | .14 | −.11 | −.11 | −.11 | −.11 | −.12 | −.11 | −.12 | — | ||||||||||||||||||||
30. Site 9 | −.04 | −.00 | .04 | .04 | −.06 | −.07 | −.01 | .00 | .05 | .02 | .03 | .03 | .01 | −.05 | −.06 | −.05 | .04 | −.06 | −.08 | −.12 | −.11 | −.11 | −.11 | −.11 | −.11 | −.12 | −.11 | −.12 | −.12 | — | |||||||||||||||||||
31. Birth weight (g) | −.01 | .02 | .01 | −.06 | .02 | −.02 | .05 | −.01 | .11 | .10 | .02 | .09 | .12 | −.14 | .04 | −.07 | .01 | .04 | .05 | .07 | .13 | −.02 | .03 | −.06 | .04 | −.02 | −.03 | −.01 | .03 | .02 | — | ||||||||||||||||||
32. BBCS | .28 | −.22 | −.10 | −.04 | .26 | .32 | .26 | −.29 | .54 | .50 | −.09 | −.10 | −.15 | −.32 | −.07 | −.02 | .01 | .45 | .32 | .42 | .40 | −.05 | .09 | .00 | .04 | .00 | .05 | −.14 | .05 | −.03 | .08 | — | |||||||||||||||||
33. Bayley MDI | .34 | −.27 | −.08 | −.06 | .31 | .29 | .24 | −.24 | .42 | .39 | −.08 | −.13 | −.17 | −.32 | −.08 | −.01 | −.02 | .40 | .23 | .36 | .34 | −.05 | .10 | −.02 | .11 | .02 | .02 | −.18 | −.06 | −.01 | .06 | .52 | — | ||||||||||||||||
34. Temperament | −.08 | .11 | .00 | −.02 | −.06 | −.14 | −.04 | .08 | −.11 | −.12 | .12 | .15 | −.04 | .17 | −.01 | .05 | −.01 | −.19 | −.19 | −.13 | −.19 | .06 | −.08 | −.01 | −.01 | −.04 | .02 | .02 | .02 | .03 | −.04 | −.15 | −.12 | — | |||||||||||||||
HOME controls | |||||||||||||||||||||||||||||||||||||||||||||||||
35. Learning Materials | .29 | −.23 | −.11 | −.02 | .27 | .31 | .15 | −.23 | .38 | .40 | −.10 | −.11 | −.05 | −.39 | −.12 | −.08 | .03 | .49 | .35 | .48 | .47 | −.06 | −.09 | −.01 | −.04 | −.01 | .08 | −.04 | .00 | .05 | .07 | .47 | .43 | −.12 | — | ||||||||||||||
36. Language Stimulation | .21 | −.18 | −.05 | −.04 | .20 | .17 | .08 | −.14 | .25 | .21 | −.01 | −.06 | −.03 | −.11 | −.12 | −.11 | .01 | .27 | .12 | .24 | .24 | .06 | −.16 | −.15 | −.20 | .02 | .18 | .01 | .05 | .09 | .10 | .28 | .23 | −.04 | .51 | — | |||||||||||||
37. Physical Environment | .20 | −.13 | −.13 | .02 | .17 | .15 | .13 | −.12 | .23 | .21 | −.09 | −.08 | .01 | −.24 | .00 | .01 | −.03 | .28 | .18 | .23 | .19 | −.07 | −.06 | −.05 | .03 | .09 | .02 | −.25 | −.05 | .19 | .01 | .25 | .24 | −.08 | .41 | .28 | — | ||||||||||||
38. Responsivity | .19 | −.13 | −.08 | −.05 | .20 | .18 | .14 | −.12 | .19 | .17 | −.09 | −.07 | −.02 | −.22 | −.06 | −.07 | −.09 | .32 | .26 | .28 | .27 | −.11 | −.01 | −.06 | −.02 | −.11 | .30 | −.30 | .12 | .06 | .08 | .31 | .25 | −.11 | .38 | .38 | .26 | — | |||||||||||
39. Academic Stimulation | .21 | −.17 | −.06 | −.01 | .18 | .15 | .05 | −.15 | .23 | .20 | .00 | −.01 | −.03 | −.17 | −.09 | −.04 | .01 | .24 | .12 | .25 | .24 | −.01 | −.18 | −.06 | −.07 | .04 | .12 | −.11 | .05 | .09 | .08 | .33 | .26 | −.02 | .55 | .55 | .31 | .33 | — | ||||||||||
40. Modeling | .17 | −.11 | −.06 | −.05 | .16 | .17 | .10 | −.07 | .23 | .25 | −.07 | −.04 | −.05 | −.15 | −.06 | −.06 | −.05 | .31 | .23 | .33 | .29 | −.00 | .01 | −.09 | −.08 | −.00 | .15 | −.08 | .03 | −.06 | .07 | .24 | .24 | −.10 | .37 | .31 | .26 | .28 | .28 | — | |||||||||
41. Variety | .25 | −.15 | −.14 | −.04 | .24 | .22 | .12 | −.21 | .28 | .29 | −.10 | −.07 | −.03 | −.27 | −.09 | −.05 | .01 | .41 | .27 | .39 | .37 | .02 | −.14 | −.06 | −.04 | −.03 | .16 | −.09 | .07 | .08 | .04 | .36 | .37 | −.09 | .56 | .41 | .33 | .33 | .43 | .35 | — | ||||||||
42. Acceptance | .12 | −.07 | −.07 | −.04 | .13 | .21 | .13 | −.17 | .16 | .19 | −.16 | −.14 | −.05 | −.10 | −.01 | .00 | −.04 | .23 | .21 | .24 | .20 | −.10 | −.00 | −.06 | .01 | −.04 | .14 | .04 | .04 | −.14 | .05 | .22 | .19 | −.05 | .28 | .20 | .17 | .19 | .14 | .32 | .23 | — | |||||||
43. Responsivity-Empirical Scale | .20 | −.14 | −.08 | −.05 | .20 | .16 | .12 | −.10 | .20 | .16 | −.06 | −.05 | −.02 | −.18 | −.06 | −.04 | −.06 | .31 | .20 | .26 | .25 | .04 | −.03 | −.07 | −.03 | −.15 | .17 | −.12 | .08 | .02 | .04 | .24 | .19 | −.12 | .35 | .48 | .26 | .77 | .29 | .27 | .28 | .23 | — | ||||||
54-month controls | |||||||||||||||||||||||||||||||||||||||||||||||||
44. Letter-Word ID (WJ-R) | .28 | −.22 | −.09 | −.03 | .25 | .29 | .25 | −.24 | .60 | .49 | −.07 | −.07 | −.10 | −.20 | −.08 | .05 | −.01 | .38 | .19 | .38 | .34 | −.02 | .01 | −.01 | .03 | −.01 | .10 | −.08 | .03 | −.05 | .07 | .61 | .40 | −.08 | .40 | .29 | .23 | .26 | .34 | .23 | .32 | .19 | .21 | — | |||||
45. Applied Problems (WJ-R) | .37 | −.28 | −.16 | −.01 | .33 | .35 | .32 | −.32 | .62 | .56 | −.04 | −.12 | −.10 | −.32 | −.07 | .01 | −.02 | .42 | .28 | .40 | .43 | −.08 | .03 | .01 | .07 | −.02 | .09 | −.08 | .04 | −.03 | .09 | .57 | .56 | −.13 | .43 | .24 | .27 | .25 | .25 | .18 | .31 | .22 | .21 | .58 | — | ||||
46. Picture Vocabulary (WJ-R) | .28 | −.21 | −.08 | −.09 | .28 | .25 | .22 | −.18 | .42 | .50 | −.09 | −.04 | .10 | −.33 | −.10 | −.01 | .02 | .42 | .32 | .40 | .48 | −.12 | .01 | .01 | .07 | −.01 | .05 | −.04 | .10 | −.03 | .12 | .46 | .44 | −.12 | .43 | .25 | .23 | .27 | .28 | .21 | .38 | .16 | .22 | .46 | .52 | — | |||
47. Memory for Sentences (WJ-R) | .29 | −.25 | −.09 | −.02 | .26 | .28 | .22 | −.21 | .42 | .43 | −.11 | −.09 | −.04 | −.18 | −.11 | .04 | .02 | .29 | .21 | .28 | .30 | −.08 | −.06 | −.06 | .07 | .08 | .06 | −.05 | .04 | .03 | .08 | .39 | .43 | −.08 | .31 | .20 | .19 | .17 | .22 | .14 | .30 | .15 | .13 | .42 | .47 | .46 | — | ||
48. Incomplete Words (WJ-R) | .23 | −.17 | −.08 | −.06 | .22 | .15 | .19 | −.17 | .39 | .34 | −.05 | −.12 | −.00 | −.18 | −.10 | .01 | .03 | .24 | .18 | .24 | .27 | −.07 | −.06 | −.04 | .02 | .05 | .04 | .00 | −.05 | .13 | .09 | .30 | .36 | −.10 | .30 | .24 | .23 | .16 | .22 | .14 | .27 | .11 | .16 | .36 | .45 | .37 | .49 | — | |
49. Internalizing (CBCL) | −.04 | .04 | .02 | −.00 | −.04 | −.17 | −.05 | .08 | −.07 | −.08 | .53 | .38 | .03 | .04 | −.01 | .03 | .09 | −.06 | −.09 | −.10 | −.10 | .01 | −.04 | −.02 | .03 | .05 | −.05 | .01 | −.02 | −.03 | .04 | −.03 | −.03 | .14 | −.07 | −.04 | −.02 | −.04 | .02 | −.04 | −.07 | −.07 | −.06 | −.01 | −.04 | −.08 | −.06 | −.04 | — |
50. Externalizing (CBCL) | −.10 | .07 | .07 | −.02 | −.09 | −.39 | −.07 | .09 | −.10 | −.12 | .63 | .47 | −.08 | .05 | .01 | −.00 | .01 | −.12 | −.13 | −.13 | −.11 | .04 | .01 | .03 | −.03 | .01 | −.03 | −.01 | −.06 | .01 | .04 | −.11 | −.05 | .12 | −.12 | −.06 | −.09 | −.10 | −.07 | −.11 | −.11 | −.14 | −.11 | −.04 | −.05 | −.09 | −.10 | −.05 | .58 |
Note: n = 918. All nonmissing cases for each pairwise correlation were included. The Supplemental Material presents correlations for all variables shown separately by mother’s education. BBCS = Bracken Basic Concept Scale; CBCL = Child Behavior Checklist; HOME = Home Observation for Measurement of the Environment; MDI = Mental Development Index; PPVT = Peabody Picture Vocabulary Test; WJ-R = Woodcock-Johnson Psycho-Educational Battery Revised.
In the Supplemental Material, we report further assessments of the extent to which self-control and attention could account for the associations between delay of gratification and later achievement. In Table S3, we included the 54-month measures of attention and impulse control taken from the CPT in the Table 4 models and found that inclusion of the CPT measures accounted for only 21% to 27% of the effect for the less-than-7-min group. In Table S4, we present results from a parallel analysis using the Duckworth et al. (2013) index of self-control, and again we found that coefficients were hardly reduced when the self-control index was included. The small change in the coefficient for the delay-of-gratification measure between models that did and did not include indicators of attention, impulsivity, and self-control raises further questions regarding what constructs are measured by the marshmallow test.
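One common way to express how much of an effect a set of added covariates "accounts for" is the proportional reduction in the coefficient between the model without and the model with those covariates. The sketch below shows that arithmetic. The function name is hypothetical, and the two coefficients are made-up placeholder values chosen only to illustrate the calculation; they are not taken from Table S3 or any other table.

```python
def share_explained(b_without: float, b_with: float) -> float:
    """Proportion of a coefficient accounted for by added covariates.

    Computed as (b_without - b_with) / b_without.
    """
    if b_without == 0:
        raise ValueError("baseline coefficient must be nonzero")
    return (b_without - b_with) / b_without


# Placeholder values for illustration only (NOT estimates from the article).
b_base = 0.20      # hypothetical coefficient before adding the covariates
b_adjusted = 0.15  # hypothetical coefficient after adding the covariates
print(f"share explained: {share_explained(b_base, b_adjusted):.0%}")
```

Under this definition, a share in the low-to-mid 20% range means that roughly three quarters of the original coefficient remains after the added measures are controlled.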
Returning to our focal sample of children with mothers who had not completed college, we were surprised to see the lack of significant associations between our delay-of-gratification measure and the behavioral measures at Grade 1 and age 15. We also tested models that used alternative indicators of behavior assessed at age 15, including measures of risky behavior from youth self-reports and assessments of impulse control. Surprisingly, we still found virtually no associations between delay of gratification and behavior across any of these alternative measures (Tables S5–S7 in the Supplemental Material). Furthermore, because we relied on aggregated measures of achievement and behavior, we also tested separate models for math, reading, externalizing behaviors, and internalizing behaviors (Table S8 in the Supplemental Material). Results indicated that the achievement associations were similar for both the math and reading measures, and we still found no statistically significant effects on either measure of problem behaviors.
We attempted to extend the famous findings of Mischel and Shoda (Mischel et al., 1988; Mischel et al., 1989; Shoda et al., 1990) by examining associations between early delay of gratification and adolescent outcomes in a more diverse sample of children and with more sophisticated statistical models. As with the earlier studies, we found statistically significant, although smaller, bivariate associations between early delay ability and later achievement. But we also found that these associations were highly sensitive to the inclusion of controls. Moreover, we failed to find even bivariate associations between delay of gratification at age 54 months and a host of behavioral outcomes at age 15, which was remarkable given the stability in self-control measures found in other studies (e.g., Moffitt et al., 2011).
It surprised us that for the children of nondegreed mothers, most of the achievement boost for early delay ability was gained by waiting a mere 20 s. Shoda et al. (1990) argued that the relationship between delay of gratification and academic achievement might be driven by the ability to generate useful metacognitive strategies that will influence self-regulation throughout one’s life. Such strategies are unlikely to have played much of a role in a child’s ability to wait for only 20 s. Instead, our findings suggest that impulse control may be a key mechanism, although post hoc inclusion of an explicit measure of impulse control explained some but certainly not most of the delay-of-gratification effect.
These results create further questions regarding what the marshmallow test might measure and how it relates to the umbrella construct of self-control. We observed that delay of gratification was strongly correlated with concurrent measures of cognitive ability, and controlling for a composite measure of self-control explained only about 25% of our reported effects on achievement. These results suggest that the marshmallow test may capture something rather distinct from self-control. Indeed, Duckworth and colleagues (2013) also investigated the relations among delay of gratification, self-control, and intelligence using the data employed here, and they found that both self-control and intelligence mediated the relation between early delay ability and later outcomes. Our results further suggest that simply viewing delay of gratification as a component of self-control may oversimplify how it operates in young children.
When considering how our results might inform intervention development, recall that models with controls for concurrent measures of cognitive skills and behavior reduced the association between delay of gratification and age-15 achievement to nearly zero. This implies that an intervention that altered a child’s ability to delay but failed to change more general cognitive and behavioral capacities would likely have limited effects on later outcomes. If intervention developers hope to generate program impacts that replicate the long-term marshmallow test findings, targeting the broader cognitive and behavioral abilities related to delay of gratification might prove more fruitful.
Indeed, Mischel and Shoda’s original results (Shoda et al., 1990) supported similar conclusions. Recall that they reported long-run correlations between delay of gratification and later outcomes only for children who were not provided with strategies for delaying longer. That the prediction was strong only in trials that relied on natural variation in children’s ability to delay suggests that unobserved factors underlying children’s delay ability may have driven the long-run correlations. Our results support this interpretation.
Our study is not without weaknesses. The 7-min ceiling was limiting, although our nonlinear models indicated that it was unlikely to affect conclusions drawn for the lower-SES sample. For the higher-SES sample, the 7-min ceiling prevented a direct replication of Mischel and Shoda’s original work (e.g., Shoda et al., 1990), as a substantial majority of higher-SES children hit the ceiling. The lack of precision in our higher-SES results was unfortunate, though it should be noted that point estimates in fully controlled models were often very small. At the very least, these results further suggest that bivariate associations between delay of gratification and later outcomes probably contain substantial bias, even for more privileged children.
It should also be noted that variation in our delay-of-gratification measure at age 54 months was not exogenous, so our models could not truly capture the effects that would be produced by exogenously spurred gains in early delay-of-gratification ability. However, our models included an extensive set of control variables that go well beyond the bivariate specifications employed in previous studies (e.g., Shoda et al., 1990). Finally, because our data were not drawn to be nationally representative, they provide a shaky foundation for generalization.
In sum, our findings suggest that although early delay of gratification did indeed correlate with later achievement for children whose mothers had not completed college, the magnitude of this association was highly sensitive to the inclusion of control variables and did not appear to be linear across the delay-of-gratification distribution. Future work on delay of gratification should continue to examine the processes captured by the marshmallow test and whether early delay-of-gratification interventions would be worthwhile investments for promoting children’s long-run success.
Acknowledgments.
We are grateful to Ana Auger, Drew Bailey, Daniel Belsky, Jay Belsky, Clancy Blair, Peg Burchinal, Angela Duckworth, Dorothy Duncan, Jade Jenkins, Terrie Moffitt, Cybele Raver, and Deborah Vandell for helpful comments on drafts of this manuscript.
Action Editor: Brent W. Roberts served as action editor for this article.
Author Contributions: T. W. Watts and G. J. Duncan developed the study concept and design and wrote the manuscript. T. W. Watts and H. Quan analyzed the data. All authors approved the final manuscript.
Declaration of Conflicting Interests: The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Funding: This research was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under award number P01-HD065704. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Supplemental Material: Additional supporting information can be found at http://journals.sagepub.com/doi/suppl/10.1177/0956797618761661
The primary data set used in this study was from the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development. Our data-use agreement prevented us from posting these data online, but the data set is available on request from the ICPSR website ( https://www.icpsr.umich.edu/icpsrweb/ICPSR/series/00233 ). The secondary data set was from the Early Childhood Longitudinal Program and can be found at the National Center for Education Statistics website ( https://nces.ed.gov/ecls/dataproducts.asp ). The three Stata files necessary to replicate the results given here in the tables, along with the complete Open Practices Disclosure for this article, can be found in the Supplemental Material ( http://journals.sagepub.com/doi/suppl/10.1177/0956797618761661 ).
The design and analysis plans for the study were not preregistered. This article has received the badge for Open Data. More information about the Open Practices badges can be found at http://www.psychologicalscience.org/publications/badges .
by Ed Batista
Originally conducted by psychologist Walter Mischel in the late 1960s, the Stanford marshmallow test has become a touchstone of developmental psychology. Children at Stanford’s Bing Nursery School, aged four to six, were placed in a room furnished only with a table and chair. A single treat, selected by the child, was placed on the table. (In addition to marshmallows, the researchers also offered Oreo cookies and pretzel sticks.) Each child was told if they waited for 15 minutes before eating the treat, they would be given a second treat. Then they were left alone in the room.
By Janine Zacharia, Journalist and Bing Parent
Walter Mischel’s pioneering research at Bing in the late 1960s and early 1970s famously explored what enabled preschool-aged children to forgo immediate gratification in exchange for a larger but delayed reward.
Resisting temptation, Mischel noted in a speech to several hundred Bing parents, is a problem that goes back to the story of Adam and Eve and the apple, and to Ulysses, who “tied himself to the mast to resist his temptations.” But until Mischel’s research at Bing, it was bypassed in modern science. Mischel, now a psychology professor at Columbia University, spoke at Stanford’s CEMEX Auditorium on Nov. 19, 2014.
The deliberately simple method Mischel devised to study willpower became known in popular culture as the “Marshmallow Test.” Mischel began by observing how those Bing children who could wait distracted themselves to avoid the temptations and used their imaginations to keep on waiting for their chosen goal. Some children turned their backs to the treats, or covered up their eyes so they couldn’t see them, or sang quietly to themselves (“Oh this is my home in Redwood City”). Others played with their toes as if they were piano keys, explored their nasal and ear cavities, or invented songs and games to amuse themselves to make the delay easier. And some sat quietly while giving themselves whispered self-instructions, repeating the contingency: “If I wait, then I get both; but if I don’t, then I just get one.”
This research identified some of the key cognitive skills, strategies, plans and mindsets that enable self-control. If the children focused on the “hot” qualities of the temptations (e.g., “The marshmallows are sweet, chewy, yummy”), they soon rang the bell to bring the researcher back. If they focused on their abstract “cool” features (“The marshmallows are puffy and round like cotton balls”), they managed to wait longer than the researchers, watching them through a one-way observation window, could bear. And when they imagined that the treats facing them were “just a picture” and were cued to “put a frame around it in your head” they were able to wait for almost 18 minutes. When Mischel asked a child how she managed to wait so long, she replied: “well you can’t eat a picture.”
These studies demystified willpower and showed how self-control and emotion regulation could be enhanced, taught and learned, beginning very early in life, even by children who initially had much difficulty delaying gratification.
The Bing research also yielded a surprise: What the preschoolers did as they tried to wait unexpectedly predicted much about their future lives. “The more seconds they waited at age 4 or 5, the higher their SAT scores and the better their rated social and cognitive function in adolescence,” Mischel writes in his recent book, The Marshmallow Test: Mastering Self-Control.
Children who waited longer tended to become more self-reliant, more self-confident, less distractible and more able to cope with stress as adolescents, he said. But, he added, he has reassured anxious parents over the years that a child’s ability to delay gratification in preschool does not determine their future. “Clearly, your future is not in a marshmallow,” he said, debunking the pithy but incorrect way popular media have summed up his findings.
It’s a terrible mistake to think that if a child can’t wait 15 minutes, the child has serious problems, Mischel said in his talk. “If the child is waiting 15 minutes, it is telling you something important: Now you know she is able to wait effectively for something when she really wants it.” But if she doesn’t wait it may mean she just didn’t think it was worth waiting for.
Most important, the research over the years by Mischel and many others has helped to clarify the mental and brain mechanisms that underlie self-control, and stimulated decades of research on “executive function” and self-regulation. Many educators and parents use the findings to help children learn self-control. As the public interest in willpower has increased, so has the research on how self-control works. In one follow-up study, published in 2011, Bing participants returned to Stanford 40 years later so that the researchers could examine aspects of their brain activity and how they relate to their self-control earlier in life.
In his recent book, Mischel gave an example from his own life. Fifty years ago he was a three-pack-a-day smoker who once caught himself in the shower smoking a pipe. He knew he needed to stop. But it wasn’t until Mischel saw a patient at Stanford Hospital being wheeled on a gurney, his head and chest shaved smooth, little paint marks to show where the radiation should go to treat him for metastasized lung cancer, that he managed to quit. Imagining himself on that gurney helped him change his habit. Seeing and vividly remembering that cancer patient each time he was tempted to smoke made those potential future consequences immediate and powerful. He called this “pre-living a delayed outcome” so that it is experienced in the here and now and not discounted because it’s far off.
In his acknowledgments, he expresses gratitude “to the children and families whose contributions and unstinting cooperation, often over the course of many years” made the Bing research possible. Two of Mischel’s graduate students, Yuichi Shoda and Philip K. Peake, have continued to work with him for more than three decades on the research begun at Bing. Mischel, Shoda and Peake will be honored on Sept. 17 at the Library of Congress with the “2015 Golden Goose Award,” given for federally funded long-term basic research that turns out, often unexpectedly, to have important applications for human welfare.
Let’s say you are looking at a marshmallow. You have two options. You can either wait to eat the first marshmallow and get a second marshmallow. Or, you can have one marshmallow now - but then you’re done.
How long would you wait for the second marshmallow? Would you wait for the second marshmallow?
This was the dilemma facing preschool-age children in the 1960s. Little did they know that the way they handled the dilemma would be part of one of the most famous psychological studies of all time. In this video, I’m going to tell you all about the Stanford Marshmallow Test: what it was, what it’s said about success, and the impact it’s had on psychology to this day.
The premise of the test was simple. Stanford professor Walter Mischel and his team put a single marshmallow in front of a child, usually 4 or 5 years old. They told the child that they would leave the room and come back in a few minutes. If the child ate the marshmallow, they would not get a second. If the child waited until the researcher was back in the room, the child would get a second marshmallow.
Researchers recorded which children ate the marshmallow and which ones waited. And then the researchers waited. When the children were teenagers, the researchers revisited the children and asked their parents a series of questions about their cognitive abilities, how they handled stress, and their ability to exhibit self-control under pressure. They also looked at the children’s SAT scores. A few years later, the researchers tested the participants again on their self-control.
What did they find? In short, Mischel and his team found that developing self-control as a child had a profound impact on the child’s later success in life. Success came in many forms. In general, the children who waited for the second marshmallow:
Because the follow-ups took years to complete, the results were published in the 1980s and the 1990s. Since then, the world of psychology has regarded the study as one of its important studies, paving the way for different ways of looking at how personality influences and predicts success.
The Marshmallow Test was able to give researchers a link between self-control and success. In short, having self-control as a child could influence success as an adult. But what influenced self-control? Not all children grabbed the marshmallow right away.
Mischel and his team developed a “hot-and-cool” system of thinking that explained why children would have eaten the marshmallow immediately. This same system could be applied to any task that involves instant gratification, like making a purchase or smoking a cigarette.
The “cool” system is where most of us are when we’re not tempted. It’s the cognitive ability to think about long-term benefits. We know that smoking is bad for us, and resisting a cigarette will result in long-term health. We know that we will get more marshmallows if we wait. We know that if we go to the gym instead of hitting the snooze alarm, we will feel more awake later and more healthy in the long run.
But “hot” stimuli threaten that cool system. When things warm up and get hot, our behavior becomes impulsive. We smoke the cigarette, take the marshmallow, and hit the snooze.
Why do some people “heat up” faster than others? Why are some stimuli “hotter” than others? These are questions that psychologists like Mischel are still trying to solve.
The Stanford Marshmallow Test took data from a relatively small and not exactly diverse group of participants. Not all researchers were convinced that the test had found the one true key to success. So a more recent study set out to redo the Marshmallow Test, focusing on different social and economic factors that could also play into a child’s success.
The main factor they chose was the mother’s educational background. They split participants into groups based on whether or not the mother had obtained a college degree. Researchers also controlled for factors like family background, early cognitive ability, and the child’s environment at home.
In short, they found that self-control didn’t exactly have the impact on success that the Marshmallow Experiment said it did. Children who came from more wealthy homes were more likely to practice self-control. When the researchers accounted for social and economic factors, they found that self-control wasn’t necessary in predicting success.
This doesn’t just put the results of self-control into question. It also questions why children grabbed the marshmallow in the first place. Researchers have offered different opinions, including thoughts on how scarcity impacts a child’s ability to take and use resources. Think about it. Wealthy kids don’t have too many problems waiting for food, toys, or other things. Their parents can afford it. They’ll get it. But kids from lower-income families have more to worry about. Food might be scarce. A parent may promise to bring their kids to a nice restaurant or buy them a nice toy, but may not be able to follow through. If something is right in front of you, you might as well take it - you can’t guarantee it will be there later, or that you’ll get a reward for waiting.
While recent studies have claimed to “debunk” the Marshmallow Test, it’s impossible to deny the impact of the study. Mischel’s work was able to show the world how certain personality traits impacted a child’s chance at success. Further work has since been done on different personality traits and how they relate to success in business, love, etc. We might not know the terms “growth mindset” or “emotional intelligence” if the Marshmallow Test didn’t exist.
The Marshmallow Test is not the only classic experiment that has recently come under criticism. Psychology is currently undergoing what is called a “replication crisis.” Replications of world-renowned experiments like the Marshmallow Test and the Stanford Prison Experiment show that these tests aren’t as solid or accurate as was once taught in schools. Like any field of study that relies on the scientific method, psychology is always evolving, and psychologists continue to tweak, change, or adjust theories that don’t hold up to modern tests.
Self-control does have an impact on behavior and possibly success, but it’s up to the current and future generations to learn more about just how self-control is influenced, and influences other personality traits and factors.
In a new book, psychologist Walter Mischel discusses how we can all become better at resisting temptation, and why doing so can improve our lives.
By Lea Winerman
Monitor Staff
December 2014, Vol 45, No. 11
Print version: page 28
The plot is funny, but it's based on serious science. In a series of studies that began in the late 1960s and continue today, psychologist Walter Mischel, PhD, found that children who, as 4-year-olds, could resist a tempting marshmallow placed in front of them, and instead hold out for a larger reward in the future (two marshmallows), became adults who were more likely to finish college and earn higher incomes, and were less likely to become overweight.
So what's the lesson to take from this? It's not that the marshmallow test is destiny and that preschoolers who fail it are doomed, Mischel says. Instead, the good news is that the strategies the successful preschoolers used can be taught to people of all ages. By harnessing the power of executive function and self-control strategies, we can all improve our ability to achieve our goals. Today, Mischel's lessons are being applied on Sesame Street and in inner-city charter schools, among other places.
Mischel talked to the Monitor about his decades of research and his new book, "The Marshmallow Test: Mastering Self-Control," that sums them up.
To me it's a matter of helping kids to have the freedom to make choices. Whether or not they choose to eat the marshmallow, if they know how to wait for it, is up to them. But they should have the ability to have a real choice.
It doesn't mean that you spend your whole life self-controlling, obviously. A life that's all self-control can be as dismal as a life without any self-control. But it means that you need to have the skills plus the motivation if you want to really optimize your opportunities.
It began when my children were young. I was a young faculty member at Stanford, and my three daughters were very closely spaced in age, between 2 and 5 years old. And I saw this remarkable progression that every parent sees, where their children go from clearly not having any self-control competencies, to, by the time they're 4 or 5, being able to control themselves reasonably well in many situations, and even do things like wait for dessert.
This drew me to the questions of how self-control is mastered, how it develops naturally, and what we can do to increase it in our children or ourselves. What are the mental, and — years later — what are the brain mechanisms that make emotional self-regulation and behavioral self-regulation possible?
It is really the story of resistance to temptation — the story of Adam and Eve in the Garden of Eden — that I was interested in. So that's how the marshmallow test was born.
Not in the slightest. I was just interested in the mental and behavioral processes that allow kids to self-control. I published a series of experiments in the '60s and '70s and '80s on that, and those studies in my opinion are in many ways more interesting than the follow-up findings that yes, kids who are good at self-control and delay of gratification become grownups who are good at self-regulation and self-control, and that there are substantial differences in outcomes that show that self-control is an extremely important cognitive and emotional skill set.
The good news is that this cognitive and emotional skill set is eminently teachable, particularly early in life. It's great in preschool; it's great within the first few years of life. It's great in adolescence even. And it continues to be a skill set that can be developed even when we're quite mature adults.
Yes. To me one of the most exciting findings is in kids from the South Bronx. We've done five-year studies with them, and found that self-control ability has very important protective effects against our own vulnerabilities. So, for example, people who are highly sensitive to exclusion and rejection often find themselves in a pattern that they are so anxious and worried about rejection that they actually behave in ways toward their friends and partners that make them get rejected … so it's a self-fulfilling prophecy.
Well, we followed kids from the South Bronx for five years, and found that the ones who have high rejection sensitivity, but who are able to wait to get a big bag of M&Ms a week later rather than settle for a few M&Ms now, are much better at not being, for example, as inclined toward aggressiveness. They're able to control their reactions, and so they don't see the same result from their rejection sensitivity. So, self-control has this protective effect that's hugely important, and I think that's very important for clinical psychologists to have in mind. It implies that teaching kids to improve self-control skills can have a protective effect because it allows them to deal better with whatever their own "hot spots" are.
They can be taught in many interesting and child-friendly ways. A particularly good example is the set of studies that Michael Posner and his colleagues at the University of Oregon reported in the Proceedings of the National Academy of Sciences in 2005. They worked with 4- to 6-year-old kids, and the idea was to help them acquire better executive function skills using a computer game. For example, in one of the exercises, there's a cat in a rainstorm. The children's job is to use a joystick that controls an umbrella to keep the cat dry as it runs around. So this is teaching executive function: I have a goal, I have to keep the umbrella over the cat's head, I can't get distracted and start looking around, I have to keep that umbrella over the cat's head.
Posner and his crew found that five sessions of 40 minutes each with these kids led to substantial increases in their executive function and executive control, and even some increase in nonverbal IQ measures.
And, of course, there's the work with Sesame Street.
They asked me to give a talk, and I wound up working with them in a minor advisory role. We decided to create situations in which Cookie Monster must learn to control himself because he's got a new goal, which is to join the cookie connoisseur gourmet club, for which you have to wait for your cookies. So Cookie Monster learns strategies, for example "framing" the marshmallows and pretending that they're just a picture, because if it's just a picture you can't eat it. Cookie Monster also comes up with the idea that if he imagines the cookies are smelly fish, then he won't want them. And so on.
The idea is for kids to be exposed to strategies that teach them what executive control is, while they're having fun.
What's crucial to remember is that there are two components to applying this work. One is that kids need to know how to do it — they need to have the cognitive control skills. But they also need the motivation to do it — they have to want to change. And that is exactly what every therapist will tell you. You don't get very far unless people want to change their behavior.
I'm not aware of systematic comparison studies that would compare that. But it doesn't hurt to go on all fronts.
At the same time, it's also wonderful that they're being implemented, for example, in the KIPP schools in New York, which I've also been involved with. There, the strategies we're talking about, including the development of the skills that are involved in building character qualities like grit and persistence, tolerance of frustration, gratitude, optimism, excitement and energy as you enter a project … all of these things are being incorporated in many school programs by educators. And I think that's the most exciting way to go.
For preschool-age children, there's a very interesting study reported in Science in 2007 by Adele Diamond and her group, on Tools of the Mind, which is a program for teaching executive function and executive skills to preschoolers. It turned out to be highly effective for kids from highly impoverished, high-stress areas.
So there's a huge amount of evidence that's accumulated from brain studies, behavioral studies and educational studies that makes it very clear that there are methods for enhancing the kinds of skills and qualities and emotion regulation that kids need if they're going to do well in school.
Step one is, if you want your children to have self-control, you need to model it. If you make promises, you need to keep them. You can't expect kids to delay gratification if you're breaking your own promises to them. Kids also need to learn that their behavior has consequences. If they behave in constructive and creative ways, the consequences are good. And if they behave in destructive ways, the consequences are not so good. They need to become aware that there's a relationship between what they're doing and what happens to them so that they can develop a sense of agency, a sense of mastery and a sense that they can control their own behavior.
Yes. They're now between their very late forties and early fifties. And we are in the middle of a study, in collaboration with a team of economists from Harvard University, where we've administered to a sample of about 110 of the original Stanford kids a very extensive set of economic outcome measures, in order to see what's the relationship between maintaining these two different patterns of high self-control over the life course versus low self-control over the life course, and economic outcomes.
Now, those are the two trajectories we've studied — consistently high control and consistently low control. But if I had another lifetime, and funding, I would also study the kids who start high, which means they demonstrated that they have good executive control in the original marshmallow study, but over the years they've gone down and down in self-control. And those who start low and went high over the life course. Those are smaller trajectories, but they exist, and they are the ones that we don't know much about.
Imagine you’re a young child and a researcher offers you a marshmallow on a plate. But there’s a catch: If you can avoid eating the marshmallow for 10 minutes while no one is in the room, you will get a second marshmallow and be able to eat both. What would you do—eat the marshmallow or wait?
This is the premise of a famous study called “the marshmallow test,” conducted by Stanford University professor Walter Mischel in 1972. The experiment measured how well children could delay immediate gratification to receive greater rewards in the future—an ability that predicts success later in life. For example, Mischel found that preschoolers who could hold out longer before eating the marshmallow performed better academically, handled frustration better, and managed their stress more effectively as adolescents. They also had healthier relationships and better health 30 years later.
For a long time, people assumed that the ability to delay gratification had to do with the child’s personality and was, therefore, unchangeable. But more recent research suggests that social factors, like the reliability of the adults around them, influence how long they can resist temptation. (If children learn that people are not trustworthy or make promises they can’t keep, they may feel there is no incentive to hold out.)
Now, findings from a new study add to that science, suggesting that children can delay gratification longer when they are working together toward a common goal.
In the study, researchers replicated a version of the marshmallow experiment with 207 five- to six-year-old children from two very different cultures—Western, industrialized Germany and a small-scale farming community in Kenya (the Kikuyu). Kids were first introduced to another child and given a task to do together. Then, they were put in a room by themselves, presented with a cookie on a plate, and told they could eat it now or wait until the researcher returned and receive two cookies. (The researchers used cookies instead of marshmallows because cookies were more desirable treats to these kids.)
Some kids received the standard instructions. But others were told that they would get a second cookie only if they and the kid they’d met (who was in another room) were able to resist eating the first one. That meant if both cooperated, they’d both win.
To measure how well the children resisted temptation, the researchers surreptitiously videotaped them and noted when the kids licked, nibbled, or ate the cookie. If children did any of those things, they didn’t receive an extra cookie, and, in the cooperative version, their partner also didn’t receive an extra cookie—even if the partner had resisted themselves.
Results showed that both German and Kikuyu kids who were cooperating were able to delay gratification longer than those who weren’t cooperating—even though they had a lower chance of receiving an extra cookie. Apparently, working toward a common goal was more effective than going it alone.
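The reward rule in the two conditions, and the reason the cooperative version lowers a child's chance of earning the second cookie, can be sketched in a few lines. This is a minimal illustration, not the researchers' procedure: the function names are hypothetical, and the probability calculation assumes, purely for the sake of the example, that each child waits independently with the same probability.

```python
def gets_second_cookie(child_waited: bool, partner_waited: bool, cooperative: bool) -> bool:
    """Return True if the child earns the extra cookie under the described rule."""
    if cooperative:
        # Cooperative condition: both children must resist the first cookie.
        return child_waited and partner_waited
    # Standard condition: only the child's own behavior matters.
    return child_waited


def chance_of_second_cookie(p_wait: float, cooperative: bool) -> float:
    """Chance of the extra cookie if each child waits with probability p_wait.

    ASSUMPTION (not from the study): the two children decide independently
    and share the same probability of waiting.
    """
    return p_wait * p_wait if cooperative else p_wait


if __name__ == "__main__":
    # A child who waits still loses out if the partner gives in.
    print(gets_second_cookie(True, False, cooperative=True))
    # With a hypothetical 50% chance of waiting, compare the two conditions.
    print(chance_of_second_cookie(0.5, cooperative=False))
    print(chance_of_second_cookie(0.5, cooperative=True))
```

Under that assumption the cooperative payoff can never be more likely than the solo one, which is what makes the longer waiting times in the cooperative condition notable.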
“For children, being in a cooperative context and knowing others rely on them boosts their motivation to invest effort in these kinds of tasks—even this early on in development,” says Sebastian Grueneisen, coauthor of the study.
Grueneisen says that the researchers don’t know why exactly cooperating helped. It could be that relying on a partner was just more fun and engaging to kids in some way, helping them to try harder. Or it could be that having an opportunity to help someone else motivated kids to hold out. After all, a similar study found that children are able to resist temptation better when they believe their efforts will benefit another child. Or perhaps feeling responsible for their partner and worrying about failing them mattered most.
Whatever the case, the results were the same for both cultures, even though the two cultures have different values around independence versus interdependence and very different parenting styles—the Kikuyu tend to be more collectivist and authoritarian, says Grueneisen. This points toward the possibility that cooperation is motivating to everyone.
“I would be careful about making a claim that this is a human universal. But our findings point in that direction, since they can’t be explained by culture-specific socialization,” he says.
This would be good news, as delaying gratification is important for society at large, says Grueneisen. Achieving many social goals requires us to be willing to forego short-term gain for long-term benefits. For example, preventing future climate devastation requires a populace that is willing to do with less and reduce their carbon footprint now.
Further testing is needed to see if setting up cooperative situations in other settings (like schools) might help kids resist temptations that keep them from succeeding—something that Grueneisen suspects could be the case, but hasn’t yet been studied. Or if emphasizing cooperation could motivate people to tackle social problems and work together toward a better future, that would be good to know, too.
“Cooperation is not just about material benefits; it has social value,” says Grueneisen. “In situations where individuals mutually rely on one another, they may be more willing to work harder in all kinds of social domains.”
Jill Suttie, Psy.D., is Greater Good’s former book review editor and now serves as a staff writer and contributing editor for the magazine. She received her doctorate of psychology from the University of San Francisco in 1998 and was a psychologist in private practice before coming to Greater Good.
Here’s a psychological challenge for anyone over 30 who thinks “kids these days” can’t delay their personal gratification: Before you judge, wait a minute.
It turns out that a generation of Americans now working their way through middle school, high school and college are quite able to resist the prospect of an immediate reward in order to get a bigger one later. Not only that, they can wait a minute longer than their parents’ generation, and two minutes longer than their grandparents’ generation could.
It may not sound like much, but being able to hold out for an extra minute or two at a young age may serve them well in the long run. Research suggests that superior results on a delayed-gratification task during the toddler years are associated with better performance in school and in jobs, healthier relationships, and even fewer chronic diseases.
Those findings emerge from a new effort to understand how children’s ability to hold out for the promise of more has changed over time. The study, published this week in the journal Developmental Psychology, resurrected an experiment that’s become a developmental psychology classic: the so-called marshmallow test.
Pioneered in the 1960s by a young Stanford psychology professor named Walter Mischel, the marshmallow test left a child between the ages of 3 and 5 alone in a room with two identical plates, each containing different quantities of marshmallows, pretzels, cookies or another delicious treat. Before leaving the room “to do some work,” the adult researcher instructed the child that the single treat on one plate could be eaten at any time. But if the child could wait for him to return before eating it, the researcher added, she could have the second, bigger treat instead.
After the experimenter closed the door on the subject, researchers on the other side of a two-way mirror monitored the child’s bout with temptation and recorded how long he or she could hold out before licking or eating the treat.
Replicated many times and followed up by a wide range of researchers, the marshmallow test has earned recognition as a powerful predictor of future performance — at least among the white children of well-educated parents. Compared to kids who lunged for the early reward, those who held out for a bigger prize did better in school, got higher SAT scores, had higher self-esteem and better emotional coping skills, and were less likely to abuse drugs.
Other studies found that children unable to defer gratification were more likely to become overweight or obese 30 years later and were in worse general health in adulthood.
The results focused psychologists, early-childhood educators and parents on the key role that self-regulation and executive function can play in a child’s prospects, and on the need to nurture those skills well before kindergarten.
The new study, conducted by Mischel (now at Columbia University) and colleagues around the country, suggests that focus has paid off.
Among the 165 children who participated in the first round of experiments at Stanford from 1965 to 1969, the task tended to be either very hard or pretty easy: close to 30% gobbled up the single treat within 30 seconds of the researchers’ departure from the room, while just over 30% were able to wait the 10 minutes that was the outer limit of the researcher’s absence. Most of the children who did not hold out for 10 minutes ate the treat within six minutes.
These original subjects are now between 52 and 58 years old.
When the marshmallow experiment was replicated in a group of 135 New York City preschoolers from 1985 to 1989, changes seemed to be afoot. About 16% of the kids held out for just 30 seconds or less before snarfing the treat, and about 38% held out for 10 minutes. In between, the trend was for longer holdouts.
These subjects are now between 32 and 38 years old.
By the time University of Minnesota psychologist Stephanie M. Carlson and colleagues at the University of Washington in Seattle ran the exact same experiment with 540 kids from 2002 to 2012, the changes appeared to be real. Close to 60% of the children tested held out the full 10 minutes for a bigger reward. And only about 12% claimed their reward in the first half-minute.
These kids — like the two earlier cohorts, overwhelmingly white from families with relatively high incomes and educational attainment — are now between 11 and 21 years old.
On average, they waited two minutes longer (during a 10-minute period) than those from the 1960s before seizing their reward. And they waited one minute longer than those tested in the 1980s.
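The cohort figures reported above are easier to compare side by side. The sketch below is illustrative only: it uses the approximate percentages quoted in the article, rounded to whole numbers ("close to 30%" and "just over 30%" both become 30), and the names `Cohort`, `approx_count` and `middle_share` are hypothetical helpers rather than anything taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    label: str
    n_children: int
    pct_quick: int  # approx. % who took the treat within 30 seconds
    pct_full: int   # approx. % who waited the full 10 minutes


# Figures as quoted in the article, rounded to whole percentages (approximate).
COHORTS = [
    Cohort("1965-1969", 165, 30, 30),
    Cohort("1985-1989", 135, 16, 38),
    Cohort("2002-2012", 540, 12, 60),
]


def approx_count(n_children: int, pct: int) -> int:
    """Rough headcount implied by a rounded percentage (integer rounding)."""
    return (n_children * pct + 50) // 100


def middle_share(cohort: Cohort) -> int:
    """Approx. % who waited longer than 30 seconds but less than 10 minutes."""
    return 100 - cohort.pct_quick - cohort.pct_full


if __name__ == "__main__":
    for c in COHORTS:
        print(
            f"{c.label}: n={c.n_children}, "
            f"quick ~{c.pct_quick}% (~{approx_count(c.n_children, c.pct_quick)} children), "
            f"full wait ~{c.pct_full}% (~{approx_count(c.n_children, c.pct_full)} children), "
            f"in between ~{middle_share(c)}%"
        )
```

Because the inputs are rounded, the derived headcounts and in-between shares are rough estimates and should not be read as figures from the published study.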
Surprised? You’re not alone.
In a survey conducted before performing the new analysis, the study authors found that adults in the United States “generally intuit” that children today are less tolerant of delayed gratification and less self-controlled than children were 50 years ago.
Roughly three-quarters of a representative sample of U.S. adults did not believe that children these days would show much self-restraint for a better reward. And parents — Latino parents especially — were overwhelmingly convinced their own kids would not delay gratification as long as they would have when they were 4 years old.
Carlson wasn’t so sure. On the one hand, she wondered how kids’ self-control would hold up under the influence of daily television and amid a dramatic rise in attention-deficit/hyperactivity disorder (ADHD) diagnoses.
On the other, she knew that research has chronicled a steady rise in kids’ IQ scores (the so-called Flynn effect), which correlates with executive function. And she knew that a growing portion of kids’ screen time, including video games and some social media, can help them learn to manipulate language and other abstractions to accumulate social approval and other rewards.
Higher preschool enrollment and changes in parenting styles, including the rise of the empowered child, also might contribute to generational improvements in kids’ ability to delay gratification, Carlson said. After all, only 15.7% of all 3- and 4-year-olds in the United States attended preschool in 1968. By the year 2000, more than half of kids that age attended schools that stressed social skills and self-control as cornerstones of educational readiness.
Plus, Carlson looked at her own daughters, now 19 and 22, and thought to herself, the kids just might be OK.
The findings “do make me hopeful,” she said. Not only have qualities like perseverance and self-control not disappeared; a simple and unchanged measure of those qualities — the marshmallow test — has withstood many trials, including the test of time.
“Delay of gratification is still a good bellwether of these self-regulation and executive function skills, and we’re learning more every day about how important they are for school readiness and achievement,” Carlson said.
The next challenge, she added, will be to take the marshmallow test into more diverse communities and understand better if it has the same predictive power in kids who are not white, affluent and from well-educated families.
Melissa Healy is a former health and science reporter with the Los Angeles Times who wrote from the Washington, D.C., area. She covered prescription drugs, obesity, nutrition and exercise, and neuroscience, mental health and human behavior. She was at The Times for more than 30 years, and has covered national security, environment, domestic social policy, Congress and the White House.
A new study finds that in a test of self-control, the perception of trustworthiness matters
Sarah Zielinski
A four-year-old girl reenacts the marshmallow test (Credit: J. Adam Fenster / University of Rochester)
When I wrote about the marshmallow test several years ago, it seemed so simple:
A child was given a marshmallow and told he could either ring a bell to summon the researcher and get to eat the marshmallow right away or wait a few minutes until the researcher returned, at which time the child would be given two marshmallows. It’s a simple test of self control, but only about a third of kids that age will wait for the second marshmallow. What’s more interesting, though, is that success on that test correlates pretty well with success later in life. The children who can’t wait grow up to have lower S.A.T. scores, higher body mass indexes, problems with drugs and trouble paying attention.
The initial finding hasn’t been overturned, but a new study in the journal Cognition is adding a layer of complexity to the test with the finding that whether the child perceives the researcher as trustworthy matters.
“Our results definitely temper the popular perception that marshmallow-like tasks are very powerful diagnostics for self-control capacity,” Celeste Kidd, a doctoral candidate in brain and cognitive sciences at the University of Rochester and the study’s lead author, said in a statement.
Kidd and her colleagues started their experiment by adding a step before giving their group of 28 three- to five-year-old children the marshmallow test. In a setup similar to the marshmallow test, the children were given an art task, with a researcher placing before each child either a well-worn set of crayons or a small sticker. The children were promised a better art supply (new crayons or better stickers) if they waited for the researcher to come back. With half of the children, though, the researcher didn’t follow through on that promise, telling the kid that better supplies were unavailable.
And then the researcher administered the marshmallow test.
Children who had been primed to believe that the researcher was reliable waited an average of 12 minutes before eating the marshmallow, but those in the “unreliable” group waited only three minutes. What’s more, nine out of 14 children in the “reliable” group were able to wait the full 15 minutes for the researcher to return, while only one kid in the unreliable group was able to wait that long.
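To get a feel for how lopsided the 9-of-14 versus 1-of-14 split is, the counts can be run through a Fisher exact test. This is only an illustrative sketch: the article does not say which statistical test the researchers used, and the function name `fisher_exact_two_sided` is a hypothetical helper written for this example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration; not the study authors' analysis.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that k of the col1 "waited" children
        # come from the first row (the "reliable" group).
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is at least as unlikely as the observed one.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )

# Counts reported in the article: 9 of 14 "reliable" children waited the
# full 15 minutes, versus 1 of 14 "unreliable" children.
p_value = fisher_exact_two_sided(9, 5, 1, 13)
print(f"two-sided p = {p_value:.4f}")
```

With only 28 children the estimate is rough, but a split this uneven would be unlikely to arise by chance if the two groups behaved alike.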
“Delaying gratification is only the rational choice if the child believes a second marshmallow is likely to be delivered after a reasonably short delay,” Kidd said. Self control isn’t so important, it seems, if you don’t think there’s anything worth controlling yourself for.
Kidd got interested in the test after volunteering at a homeless shelter. “There were lots of kids staying there with their families. Everyone shared one big area, so keeping personal possessions safe was difficult,” Kidd said. “When one child got a toy or treat, there was a real risk of a bigger, faster kid taking it away. I read about these studies and I thought, ‘All of these kids would eat the marshmallow right away.’ ”
The study doesn’t invalidate the marshmallow test (willpower is still important), but it does mean that people shouldn’t look at kids who fail the test as being instantly doomed to failure. Instead, parents of kids who appear to lack self-control might want to look more closely at why they would eat the marshmallow: Is it because they can’t wait or because they can’t trust that the next marshmallow will appear?
Sarah Zielinski is an award-winning science writer and editor. She is a contributing writer in science for Smithsonian.com and blogs at Wild Things, which appears on Science News.
When kids “pass” the marshmallow test, are they simply better at self-control or is something else going on? A new UC San Diego study revisits the classic psychology experiment and reports that part of what may be at work is that children care more deeply than previously known what authority figures think of them.
In the marshmallow test, young children are given one marshmallow and told they can eat it right away or, if they wait a while, while nobody is watching, they can have two marshmallows instead. The half-century-old test is quite well-known. It’s entered everyday speech, and you may have chuckled at an online video or two in which children struggle adorably on hidden camera with the temptation of an immediate treat.
But the real reason the test is famous (and infamous) is that researchers have shown that the ability to wait (to delay gratification in order to get a bigger reward later) is associated with a range of positive life outcomes far down the line, including better stress tolerance and higher SAT scores more than a decade later. Whether it’s just this ability to wait or a host of other socioeconomic and personality factors that are predictive is still up for debate, but the new study, published in the journal Psychological Science, shows that young children will wait nearly twice as long for a reward if they are told their teacher will find out how long they waited.
This is the first demonstration that what researchers call “reputation management” might be a factor.
“The classic marshmallow test has shaped the way researchers think about the development of self-control, which is an important skill,” said Gail Heyman, a University of California, San Diego professor of psychology and lead author on the study. “Our new research suggests that in addition to measuring self-control, the task may also be measuring another important skill: awareness of what other people value.”
In fact, she said, “one reason for the predictive power of delay-of-gratification tasks may be that the children who wait longer care more about what people around them value, or are better at figuring it out.”
For their study, Heyman and her colleagues from UC San Diego and Zhejiang Sci-Tech University conducted two experiments with a total of 273 preschool children in China, aged 3 to 4 years.
The researchers told the children that they could earn a small reward immediately or wait for a bigger one. (Instead of a marshmallow, the researchers used a sticker reward in one of the experiments and a cookie in the other.) Children were assigned to either a “teacher condition” in which they were told that their teacher would find out how long they waited, a “peer condition” in which they were told that a classmate would find out how long they waited, or a “standard condition” that had no special instructions.
Children waited longer in both the teacher and peer conditions than in the standard condition. The difference was about twice as great in the teacher condition as compared to the peer condition. The researchers interpret these results to mean that when children decide how long to wait, they make a cost-benefit analysis that takes into account the possibility of getting a social reward in the form of a boost to their reputation. These findings suggest that the desire to impress others is strong and can motivate human behavior starting at a very young age.
The researchers were surprised by their findings because the traditional view is that 3- and 4-year-olds are too young to care what other people think of them.
“The children waited longer in the teacher and peer conditions even though no one directly told them that it’s good to wait longer,” said Heyman. “We believe that children are good at making these kinds of inferences because they are constantly on the lookout for cues about what people around them value. This may take the form of carefully listening to the evaluative comments that parents and teachers make, or noticing what kinds of people and topics are getting attention in the media.”
The study’s other co-authors are Fengling Ma, Dan Zeng and Fen Xu of Zhejiang Sci-Tech University and Brian J. Compton of UC San Diego.
The contributions of Fengling Ma were supported by grants from the National Natural Science Foundation of China (31400892), from the Natural Science Foundation of Zhejiang Province (LY17C090010) and from the China Scholarship Council.
The “marshmallow test” – the famed psychological experiment designed to measure children’s self-control – may not predict life outcomes as much as previously thought, a team of scientists has concluded from results of what they call a “conceptual replication” of the classic research.
The findings are published in Psychological Science. The experiment, led by Tyler W. Watts of New York University, took a modified approach to the test created by APS Past President Walter Mischel in the 1960s.
Unlike direct replications that aim to reproduce the precise methods used in an original study, Watts and his colleagues used a different approach to test the same underlying relationship between delay of gratification and long-term outcomes. The researchers obtained data from a larger and more diverse sample of children. In their study, children were considered to have delayed gratification if they waited 7 minutes to eat the marshmallow, while the participants in Mischel’s sample had been asked to wait up to 20 minutes.
Mischel’s original studies were designed simply to study children’s self-control. In the early tests, preschool children sat alone in a room with a single marshmallow placed on a table in front of them. They were told that if they could resist the temptation to eat the marshmallow (or a cookie, pretzel, or other candy in subsequent versions) for a certain amount of time, they would receive two instead of one. Mischel and collaborators APS Fellow Yuichi Shoda and Philip Peake followed up with a subset of those children when they reached adolescence and found that those who had waited to get two marshmallows as children tended to have better academic achievement and life success compared to those who didn’t wait. Their work earned them a 2015 Golden Goose Award in recognition of their extensive contributions to the understanding of self-control.
Mischel and his colleagues have always said that tests with a larger sample of children might yield smaller effect sizes, and that the home environment could influence academic outcomes more than what the tests could show.
For the new study, Watts and colleagues examined longitudinal data from more than 900 children participating in the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development, a geographically diverse dataset widely used in the field of developmental research.
Much of their analysis focused on a subsample of children whose mothers had not completed college by the time the child was born. This subsample was more representative of the racial and economic makeup of the broader population of children in the United States compared with the original marshmallow experiments, though Hispanic children were still underrepresented, Watts and colleagues noted.
The results showed that, although children who were able to wait and resist temptation tended to have stronger math and reading skills in adolescence, the association was small and disappeared after the researchers controlled for characteristics of the child’s family and early environment. And there was no indication that the ability to delay gratification predicted later behaviors or measures of personality.
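The idea of an association that "disappears" once background characteristics are controlled can be illustrated with a small simulation. The sketch below uses entirely synthetic numbers, not the NICHD data, and a simple partial correlation rather than the regression models Watts and colleagues ran; the variable names (`background`, `delay`, `achievement`) are assumptions made up for the example.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

random.seed(0)
n = 5000

# Synthetic world (an assumption for illustration only): family background
# drives both waiting time and later achievement, and waiting time has no
# direct effect on achievement at all.
background = [random.gauss(0, 1) for _ in range(n)]
delay = [b + random.gauss(0, 1) for b in background]
achievement = [b + random.gauss(0, 1) for b in background]

raw = pearson(delay, achievement)
adjusted = partial_corr(delay, achievement, background)
print(f"raw correlation:      {raw:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

In this toy setup the raw correlation is sizable while the adjusted one is close to zero, which is the general pattern the researchers describe. It does not show that the real data behave this way.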
The authors concluded that interventions focused only on teaching young children to delay gratification are likely to be ineffective.
“Our findings suggest that an intervention that alters a child’s ability to delay, but fails to change more general cognitive and behavioral capacities, will probably have very small effects on later outcomes,” Watts explained. “If intervention developers hope to generate the kinds of improvements associated with the original marshmallow study, it is likely to be more fruitful to target the broader cognitive and behavioral abilities related to gratification delay.”
Watts, T.W., Duncan, G.J., and Quan, H. (2018). Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes. Psychological Science, doi.org/10.1177/0956797618761661
See Walter Mischel discuss the history of the marshmallow test and other aspects of his storied career in an interview for Inside the Psychologist’s Studio.
Interesting research. I would love to examine the data. Off the top of my head, culture, trauma, and experience have a lot to do with delayed gratification versus immediate gratification. Not to mention how the researcher might have influenced the research and how participants perceived researchers.
“Much of their analysis focused on a subsample of children whose ***mothers had not completed college by the time the child was born.***”
That bit of information more likely than not translates into disadvantaged families. And that translates into a lot of other things that affect the outcome of the research.
Social science studies are always going to have confounding variables that are impossible to control. In many ways, Mischel’s original study population, albeit a small sample size, was relatively homogeneous (white preschoolers enrolled at Stanford’s preschool in the 60’s). While the results may not be generalized, one could argue that the dependent variable is less likely to suffer from outside influences in this population and that the results apply within the group since the household factors for these children may have been very similar.
I like Rob’s comment. I would like to see a study that looks at different homogenous groups and explores whether more self-control within each homogenous group leads to better success later in life. For example, do kids from a disadvantaged background with more self-control do better later in life than other disadvantaged kids with less self-control?
My own experience says that self-control is likely to impact success in life.