2026-04-21

Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica

Project & Goals

The creator rebuilt the 1911 Encyclopædia Britannica into a structured, navigable site with about 37,000 articles, section navigation, cross-references, a contributor index, page references, and links to the original scans.
The aim is to preserve the feel of the original while making it usable and searchable.

Licensing, Data Access, and Reuse

Underlying 1911 text is public domain.
The site’s structured reconstruction (parsing, linking, indexing) is new work; no formal license yet.
Casual/small-scale use is welcomed; for bulk use (datasets, training, redistribution), the creator prefers people get in touch.
Some commenters argue that “sweat of the brow” processing may not be copyrightable in the U.S., while others simply point to existing public-domain or Creative Commons sources (e.g., Gutenberg, Wikisource).

UX, Bugs, and Feature Requests

Reported issues:
- Search box on article pages not working in some browsers (later fixed).
- Escaping bugs (HTML entities).
- Broken tables.
- Glyphs unsupported by the font (℔).
- Zurich canton/city disambiguation bug.
- TOC encoding glitches.
Requested features:
- EPUB export and/or bulk download or mirror.
- Clearer entry points from the home page; logo/title linking to home.
- Side-by-side text + scan view or thumbnails.
- Wikipedia-style in-article links and “adjacent article” browsing.

Data Sources, Structure, and Fidelity

The creator did not OCR the whole work but started from the Wikisource text and built a pipeline to clean it, segment it, and re-link it to page images.
Some users note fidelity issues: missing math in at least one article and mis-attached footnotes compared to Wikisource.
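The thread does not describe the pipeline's internals, so the following is only a minimal sketch of what a segment-and-re-link step could look like. It assumes, hypothetically, that each article starts on a line beginning with an all-caps headword followed by a comma, and that the input arrives as (page number, page text) pairs. The names `HEADWORD`, `Article`, and `segment` are illustrative and not taken from the site.

```python
import re
from dataclasses import dataclass

# Assumption: an article opens with an all-caps headword and a comma,
# e.g. "ABACUS, a calculating frame". Real source text is likely messier.
HEADWORD = re.compile(r"^([A-ZÆŒ][A-ZÆŒ' \-]+),")


@dataclass
class Article:
    headword: str
    page: int   # page on which the article starts, for linking to the scan
    text: str


def segment(pages):
    """Split (page_number, page_text) pairs into articles.

    Each article remembers the page it started on, so it can later be
    linked back to the corresponding page image.
    """
    articles = []
    current = None
    for page_no, page_text in pages:
        for line in page_text.splitlines():
            line = line.strip()
            if not line:
                continue
            match = HEADWORD.match(line)
            if match:
                # A new headword starts a new article on this page.
                current = Article(match.group(1).strip(), page_no, line)
                articles.append(current)
            elif current is not None:
                # Continuation line, possibly carried over from the previous page.
                current.text += " " + line
    return articles
```

A heuristic like this also suggests how the fidelity problems mentioned above can arise: anything that does not fit the expected line pattern, such as math or footnotes, is simply appended to whichever article happens to be open.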

Comparisons & Related Projects

The thread references Wikisource’s EB1911, Project Gutenberg’s text, other historical dictionaries and encyclopedias, and parallel efforts on earlier and later Britannica editions and other classic reference works.

Historical Value and Problematic Content

Many appreciate the distinctive, opinionated prose and pre–World War I worldview.
Users highlight both delightful passages (literary enthusiasm, early atomic/fusion speculation, cosmology debates) and disturbing ones (racist claims, sexist medical advice, torture descriptions).
Several emphasize the value of old works for understanding past beliefs, including those now seen as immoral or incorrect.

LLMs, Research, and Use Cases

Some want the dataset to train models to mimic 1911 Britannica style.
Others propose loading structured data into XML/DB tools for large-scale queries.
There is debate over using LLMs to summarize and “modernize” dense historical prose vs. reading directly for intellectual exercise.
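As a rough illustration of the proposal to load the structured data into a database for large-scale queries, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout (`title`, `volume`, `page`, `body`) and the function names are assumptions made for the example, since the site has not published a schema or a bulk export.

```python
import sqlite3


def build_db(records):
    """Load (title, volume, page, body) tuples into an in-memory database.

    The schema is hypothetical; a real export may look quite different.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "title TEXT PRIMARY KEY, volume INTEGER, page INTEGER, body TEXT)"
    )
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def articles_mentioning(conn, term):
    """Return titles of articles whose body contains the term.

    SQLite's LIKE is case-insensitive for ASCII characters by default.
    """
    rows = conn.execute(
        "SELECT title FROM articles WHERE body LIKE ? ORDER BY title",
        (f"%{term}%",),
    )
    return [title for (title,) in rows]


def article_count_by_volume(conn):
    """Return a {volume: number_of_articles} mapping."""
    return dict(
        conn.execute("SELECT volume, COUNT(*) FROM articles GROUP BY volume")
    )
```

With roughly 37,000 articles, a plain `LIKE` scan should be fast enough for exploratory queries; a full-text index would be the natural next step if searches became a bottleneck.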
