2024-08-02

The Marshmallow Test does not reliably predict adult functioning

Scope and findings of the new study

Commenters note that the new longitudinal work (tracking participants to about age 26, and later in follow‑ups) finds:
- Only small correlations between marshmallow delay and adult BMI and education.
- Most predictive power disappears after controlling for other variables.
Some argue that a correlation of r ≈ 0.17 is “non‑zero but weak,” so the headline “does not reliably predict” is broadly fair, though the result is not a total null.
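
To make "non-zero but weak" concrete, a correlation can be squared to give the share of variance it accounts for. The short sketch below does that arithmetic for the quoted figure; the function name `variance_explained` is a hypothetical helper, not something taken from the study.

```python
# Share of outcome variance accounted for by a correlation r (r squared).
def variance_explained(r: float) -> float:
    return r * r

r = 0.17  # the correlation quoted in the discussion
print(f"r = {r:.2f} -> r^2 = {variance_explained(r):.3f}")
```

On this reading, a correlation of that size accounts for roughly 3% of the variance in the outcome, which fits the "weak but not null" description.
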
Others warn that “controlling for” variables such as socioeconomic status (SES) or IQ may remove part of the causal pathway if delay of gratification itself contributes to those outcomes.
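
The over-control concern can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not a model of the study's data: delay affects the outcome only through an intermediate variable (`mediator`), and the 0.5 effect sizes and sample size are arbitrary. The partial correlation uses the standard first-order formula.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000              # arbitrary sample size for the toy example

# Hypothetical causal chain: delay -> mediator -> outcome (no direct path).
delay = [rng.gauss(0, 1) for _ in range(n)]
mediator = [0.5 * d + rng.gauss(0, 1) for d in delay]
outcome = [0.5 * m + rng.gauss(0, 1) for m in mediator]

raw = pearson(delay, outcome)
adjusted = partial_corr(delay, outcome, mediator)
print(f"raw correlation:       {raw:.3f}")
print(f"controlling for mediator: {adjusted:.3f}")
```

In this setup delay really does cause the outcome, yet the adjusted correlation falls to about zero because the control variable sits on the causal path. Whether the study's covariates behave like this mediator is exactly what commenters dispute.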

What the marshmallow test might actually measure

Many suggest it measures:
- Trust in adults/researchers and perceived reliability of promises.
- Prior experience with scarcity and “take it now or lose it” environments.
- Desire to please authority or sensitivity to subtle experimenter cues.
- How much the child actually values marshmallows (some kids don’t care).
Several point out that one-off behavior in a contrived lab setup is a noisy proxy for a broad trait like impulse control or “future orientation.”

Socioeconomic, environmental, and genetic debates

A frequently cited 2018 conceptual replication found SES explained much of the original effect; commenters connect this to:
- Unpredictable access to resources making immediate payoff rational.
- Differences in parental support, schooling, nutrition, and stability.
Others argue traits like time preference, self‑control, and conscientiousness are heritable and may drive both income and delay behavior; back‑and‑forth disputes focus on:
- How to interpret twin and heritability studies.
- Whether controlling for household income is valid or a statistical mistake.
Several emphasize that behavior can be rational given local risks (e.g., broken promises, unstable environments, “counterparty risk”).
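
The "counterparty risk" argument amounts to a small expected-value comparison. The sketch below assumes a simplified setup that is not from the thread: one treat now versus two later, and a broken promise leaves the child with nothing. The function names and reward sizes are hypothetical.

```python
def expected_reward_if_waiting(p_kept: float, later_reward: float = 2.0) -> float:
    """Expected payoff from waiting when the promise is kept with probability p_kept."""
    return p_kept * later_reward

def waiting_is_better(p_kept: float, now_reward: float = 1.0,
                      later_reward: float = 2.0) -> bool:
    """True when waiting has a strictly higher expected payoff than taking the treat now."""
    return expected_reward_if_waiting(p_kept, later_reward) > now_reward

for p in (0.9, 0.5, 0.4):
    print(p, waiting_is_better(p))
```

Under these assumptions waiting only pays off when the child believes the promise will be kept more than half the time, so taking the treat immediately can reflect low trust and not weak self-control.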

Trust, ethics, and attitudes toward psychology

Numerous anecdotes describe not trusting experimenters, being deceived about rewards, or feeling “tricked,” reinforcing the idea that trust is central.
Broader skepticism about social psychology surfaces:
- Many famous effects (including Dunning–Kruger–style findings) and lab studies are said to replicate poorly or rely on fragile statistics and p‑hacking.
- Some see this as a symptom of incentives for flashy, media‑friendly results.
- Others reply that replication attempts and falsifications are science working as intended, though the process is slow and messy.

Cultural impact and lay intuitions

Commenters note the test has become pop‑culture “wisdom” (books, talks, religious sermons, YouTube), often overstating its predictive power.
Many still believe delayed gratification matters for success, but doubt that a single marshmallow test in preschool can meaningfully forecast adult outcomes.
