Project Gutenberg – keeps getting better

Site Improvements & Design

  • Recent major redesign praised for better mobile styling and EPUB3 support; improved book pages are coming soon.
  • Users appreciate the “no-frills,” fast, JS-optional design and want that preserved.
  • Some miss the ultra-simple old layout on e-ink devices; requests for a “lite”/minimal version.
  • Reported UI bugs: odd scrolling of front-page book lists on mobile, Android Chrome menu not closing, tiny scrollbars clipping text.
  • Suggestions: better pagination, line length control, notes, and easier search/filtering by original publication date.

Formats, Typesetting, and Quality

  • Historically text-heavy; most titles now come in EPUB3, HTML, and plain text. PDFs are “in the works”: some users want them, others warn of poor e-reader support.
  • OCR errors remain a concern; Distributed Proofreaders is recommended for higher quality text.
  • Handling of illustrations depends on upstream scan quality; public-domain constraints apply.
  • Each book has an internal Git history; users request public version histories and clearer errata workflows.

Access, Scraping, and Infrastructure

  • Official recommendation: use RDF/XML catalog dumps, tarballs, /cache/epub/feeds, OPDS, ZIMs instead of crawling.
  • Heavy bot/AI crawler traffic is degrading performance; patterns resemble DDoS from many single-request IPs.
  • Mitigations debated: IP blocking, AS blocking, captchas, proof-of-work, third-party anti-bot tools; concerns about usability, battery drain, and misclassifying real users.
  • Idea of feeding scrapers bogus data is strongly criticized as dangerous if humans are misclassified.
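
To make the proof-of-work mitigation concrete, here is a minimal hashcash-style sketch. It is not anything PG is known to run; the function names, the challenge string, and the difficulty value are all illustrative assumptions. The client must search for a nonce, while the server verifies it with a single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a nonzero byte, 8 - bit_length() is its number of leading zero bits.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce whose hash has `difficulty` leading zero bits."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    # Hypothetical challenge; a real server would issue a fresh random one per visitor.
    nonce = solve("example-challenge", difficulty=12)
    print(nonce, verify("example-challenge", nonce, difficulty=12))
```

Each extra bit of difficulty roughly doubles the client's expected work, which is where the battery-drain and slow-device concerns in the discussion come from.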

Licensing, Ecosystem, and E-Readers

  • The PG license requires a 20% royalty on profits from commercial redistributions that retain the PG license text; with that text removed, the underlying public-domain text can be used freely.
  • E-reader vendors rarely expose PG as a “store,” likely due to incentives to push paid content.
  • Workarounds: built-in browsers, KOReader, Calibre, Standard Ebooks, LibriVox integrations, and various third-party apps.
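
As a rough illustration of the 20% royalty point above, the arithmetic might look like this. The function name, the boolean flag, and the rounding to cents are assumptions for the sketch; the license itself defines how profit is actually calculated.

```python
from decimal import Decimal

ROYALTY_RATE = Decimal("0.20")  # the 20% figure cited in the discussion


def royalty_due(profit: Decimal, retains_pg_license: bool) -> Decimal:
    """Return the royalty owed on a commercial redistribution (illustrative only)."""
    # No royalty if the PG license text was stripped or there was no profit.
    if not retains_pg_license or profit <= 0:
        return Decimal("0.00")
    return (profit * ROYALTY_RATE).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(royalty_due(Decimal("1500.00"), retains_pg_license=True))
    print(royalty_due(Decimal("1500.00"), retains_pg_license=False))
```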

Legal and Geographic Restrictions

  • Past and present blocks in Germany and Italy discussed.
  • The Italian block stems from a criminal piracy case in which PG domains were included; its interaction with national copyright (especially translators’ rights) is contentious and unresolved.
  • Some argue for clearer legal status messages (e.g., HTTP 451) instead of generic 404s.
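
The HTTP 451 suggestion could be sketched as below. This is only an illustration of the status-code choice, not how any actual block is implemented: the region code, the notice URL, and the function name are hypothetical. The `blocked-by` link relation comes from RFC 7725, which defines status 451.

```python
from http import HTTPStatus

BLOCKED_REGIONS = {"XX"}  # hypothetical placeholder region code


def status_for(region: str, resource_exists: bool) -> tuple[int, dict[str, str]]:
    """Pick a status code that tells the visitor why a page is unavailable."""
    if region in BLOCKED_REGIONS:
        # 451 states that the block is legal, not technical; the Link header
        # names the entity implementing the block (placeholder URL).
        headers = {"Link": '<https://example.org/legal-notice>; rel="blocked-by"'}
        return HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS.value, headers
    if not resource_exists:
        return HTTPStatus.NOT_FOUND.value, {}
    return HTTPStatus.OK.value, {}


if __name__ == "__main__":
    print(status_for("XX", resource_exists=True))
    print(status_for("YY", resource_exists=False))
```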

Community Sentiment

  • Strong, repeated appreciation for PG as a public-good, long-lived, volunteer-driven project.
  • Users share personal stories of learning, accessibility, and lifelong reading enabled by PG.