An entire Herculaneum scroll has been read for the first time

Technical approach & ML role

  • Scrolls are CT-scanned at a powerful synchrotron beamline; raw data can reach ~260 TB for a large scroll.
  • Workflow: segment and virtually unwrap layers → render surfaces → detect ink using ML and physically based rendering.
  • Ink is usually carbon-based, so it shows little X-ray contrast against the carbonized papyrus and is not directly visible in the scans; ML models instead work on small 3D CT chunks and detect subtle textural differences.
  • Team stresses that training data quality (manual annotations of papyrus surfaces and ink) matters more than model choice.
  • ML can “hallucinate” at the stroke level (slightly extended lines, filled gaps), but it does not invent whole passages of grammatical Greek or Latin; philologists remain essential.
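
To make the "small 3D CT chunks" idea concrete, here is a minimal Python sketch that tiles a volume into cubes and scores each cube's texture. It is an illustration only and not the team's pipeline: the function names, the toy volume, and the use of intensity standard deviation as a stand-in for a trained ink-detection model are all assumptions made for this example.

```python
from itertools import product
from statistics import pstdev


def iter_chunk_corners(shape, chunk, stride):
    """Yield (z, y, x) corners of cubic chunks that fit fully inside the volume."""
    starts = [range(0, dim - chunk + 1, stride) for dim in shape]
    return product(*starts)


def chunk_values(volume, corner, chunk):
    """Flatten the intensities of one cubic chunk into a list."""
    z0, y0, x0 = corner
    return [
        volume[z][y][x]
        for z in range(z0, z0 + chunk)
        for y in range(y0, y0 + chunk)
        for x in range(x0, x0 + chunk)
    ]


def texture_score(values):
    """Hypothetical stand-in for a trained model: spread of intensities."""
    return pstdev(values)


def score_volume(volume, chunk=4, stride=4):
    """Map each chunk corner to its texture score."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    return {
        corner: texture_score(chunk_values(volume, corner, chunk))
        for corner in iter_chunk_corners(shape, chunk, stride)
    }


def make_toy_volume():
    """Synthetic 4 x 4 x 8 volume: uniform for x < 4, finely textured for x >= 4."""
    return [
        [
            [100 if x < 4 else 100 + 10 * ((x + y + z) % 2) for x in range(8)]
            for y in range(4)
        ]
        for z in range(4)
    ]


# Score the toy volume; the textured half should stand out from the flat half.
scores = score_volume(make_toy_volume(), chunk=4, stride=4)
for corner, score in sorted(scores.items()):
    print(corner, score)
```

On this toy volume the uniform chunk scores 0.0 and the textured chunk scores 5.0. A real detector would replace `texture_score` with a model trained on manually annotated surfaces and ink, which is where the annotation quality stressed above comes in.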

Scope, progress & limitations

  • About 30 scrolls have been scanned so far. The collection is estimated at ~350 mostly intact scrolls, ~1000 damaged ones, plus many fragments; the vast majority remain in Italy.
  • Scanning a big scroll takes days; human refinement and annotation take months even with automation.
  • Funding and beamtime costs are major bottlenecks; current work is paid from donations and private funding, not government grants.
  • Estimates of how long reading the whole collection will take are highly uncertain and depend on the scroll.

What was actually “read”

  • The newly announced scroll (PHerc. 1667) is the compact inner core of a roll badly damaged by earlier physical opening attempts.
  • Some commenters argue “entire scroll” is misleading since outer layers are gone; others clarify the team means “the entire surviving core.”
  • The text is a philosophical treatise on ethics, probably Stoic (2nd c. BC), discussing human nature, impulse, moral progress, and naming Aristocreon.
  • Translation in the paper is partial and fragmentary; several columns are heavily damaged with only scattered phrases.

Potential impact & expected content

  • Many hope for lost works (e.g., early Greek philosophers, historians, technical treatises) or alternative political perspectives suppressed by later transmission.
  • Others think the main effect will be filling in details rather than radically overturning our picture of antiquity.
  • Early expectations that the library is “just Epicurean” are questioned; only one of three newly read texts appears clearly Epicurean so far.

Broader debates & reactions

  • Strong enthusiasm: seen as one of the clearest examples of ML/AI tangibly advancing scholarship, with overlap to medical imaging techniques.
  • Some skepticism about hype, title accuracy, and whether focus on Herculaneum crowds out attention to other large corpora (e.g., Mesopotamian tablets).
  • Extended side-discussion on translation philosophy, bias, and why bilingual editions matter; consensus that all translation involves interpretation.
  • Archaeology’s destructiveness and whether to delay excavation for future tech are debated; current practice often leaves portions unexcavated intentionally.