2026-06-25

An entire Herculaneum scroll has been read for the first time

Technical approach & ML role

Scrolls are CT-scanned at a powerful synchrotron beamline; raw data can reach ~260 TB for a large scroll.
Workflow: segment and virtually unwrap layers → render surfaces → detect ink using ML and physically based rendering.
Ink is usually carbon-based and not directly visible in X-ray; ML models work on small 3D CT chunks and detect subtle textural differences.
Team stresses that training data quality (manual annotations of papyrus surfaces and ink) matters more than model choice.
ML can “hallucinate” at the stroke level (slightly extended lines, filled gaps), but it is not expected to invent whole passages of grammatical Greek or Latin; philologists remain essential.
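The idea of scoring small 3D CT chunks for textural differences can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (iter_chunks, texture_score, ink_score_map), the chunk size, and the synthetic volume are hypothetical, and the standard-deviation score is only a stand-in for a trained model, not the team's method.

```python
import numpy as np

def iter_chunks(volume, size=(8, 16, 16)):
    """Yield (origin, chunk) pairs tiling a 3D volume with non-overlapping windows."""
    dz, dy, dx = size
    Z, Y, X = volume.shape
    for z in range(0, Z - dz + 1, dz):
        for y in range(0, Y - dy + 1, dy):
            for x in range(0, X - dx + 1, dx):
                yield (z, y, x), volume[z:z + dz, y:y + dy, x:x + dx]

def texture_score(chunk):
    """Hypothetical stand-in for a trained model: intensity standard deviation."""
    return float(chunk.std())

def ink_score_map(volume, size=(8, 16, 16)):
    """Collapse per-chunk scores into a 2D map, keeping the maximum along depth."""
    _, dy, dx = size
    rows, cols = volume.shape[1] // dy, volume.shape[2] // dx
    out = np.zeros((rows, cols))
    for (_, y, x), chunk in iter_chunks(volume, size):
        r, c = y // dy, x // dx
        out[r, c] = max(out[r, c], texture_score(chunk))
    return out

# Synthetic data (an assumption, not real scan data): a smooth slab with one
# rougher patch standing in for an inked region.
rng = np.random.default_rng(0)
volume = rng.normal(0.0, 0.01, size=(8, 64, 64))
volume[:, 16:32, 32:48] += rng.normal(0.0, 1.0, size=(8, 16, 16))

ink_map = ink_score_map(volume)
peak = np.unravel_index(ink_map.argmax(), ink_map.shape)
print(ink_map.shape, peak)
```

In a real pipeline the placeholder score would be replaced by a model trained on manually annotated surfaces and ink, which is where the thread says most of the effort goes.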

Scope, progress & limitations

About 30 scrolls have been scanned so far, out of an estimated ~350 mostly intact scrolls and ~1000 damaged ones, plus many fragments; the vast majority remain in Italy.
Scanning a big scroll takes days; human refinement and annotation take months even with automation.
Funding and beamtime costs are major bottlenecks; current work is funded by donations and private money, not government grants.
Estimates of how long it will take to read the collection are described as highly uncertain and scroll-dependent.

What was actually “read”

The newly announced scroll (PHerc. 1667) is the compact inner core of a roll badly damaged by earlier physical opening attempts.
Some commenters argue “entire scroll” is misleading since outer layers are gone; others clarify the team means “the entire surviving core.”
The text is a philosophical treatise on ethics, probably Stoic (2nd c. BC), discussing human nature, impulse, moral progress, and naming Aristocreon.
Translation in the paper is partial and fragmentary; several columns are heavily damaged with only scattered phrases.

Potential impact & expected content

Many hope for lost works (e.g., early Greek philosophers, historians, technical treatises) or alternative political perspectives suppressed by later transmission.
Others think the main effect will be filling in details rather than radically overturning our picture of antiquity.
Early expectations that the library would be “just Epicurean” are questioned; only one of the three newly read texts appears clearly Epicurean so far.

Broader debates & reactions

Strong enthusiasm: seen as one of the clearest examples of ML/AI tangibly advancing scholarship, with overlap to medical imaging techniques.
Some skepticism about hype, title accuracy, and whether focus on Herculaneum crowds out attention to other large corpora (e.g., Mesopotamian tablets).
Extended side-discussion on translation philosophy, bias, and why bilingual editions matter; consensus that all translation involves interpretation.
Archaeology’s destructiveness and whether to delay excavation for future tech are debated; current practice often leaves portions unexcavated intentionally.
