Project Gutenberg – keeps getting better
Site Improvements & Design
- The recent major redesign is praised for better mobile styling and EPUB3 support; improved book pages are said to be coming soon.
- Users appreciate the “no-frills,” fast, JS-optional design and want that preserved.
- Some miss the ultra-simple old layout on e-ink devices; requests for a “lite”/minimal version.
- Reported UI bugs: odd scrolling of front-page book lists on mobile, Android Chrome menu not closing, tiny scrollbars clipping text.
- Suggestions: better pagination, line length control, notes, and easier search/filtering by original publication date.
Formats, Typesetting, and Quality
- Historically text-heavy; most titles now come as EPUB3, HTML, and plain text. PDFs are “in the works”: some users want them, others warn that PDFs are poorly supported on e-readers.
- OCR errors remain a concern; Distributed Proofreaders is recommended for higher quality text.
- Handling of illustrations depends on upstream scan quality; public-domain constraints apply.
- Internal git histories per book exist; users request public version histories and clearer errata workflows.
Access, Scraping, and Infrastructure
- Official recommendation: use the RDF/XML catalog dumps, tarballs, /cache/epub/feeds, OPDS, or ZIM files instead of crawling the site.
- Heavy bot/AI crawler traffic is degrading performance; the pattern resembles a DDoS, with many IPs each making a single request.
- Mitigations debated: IP blocking, AS blocking, captchas, proof-of-work, third-party anti-bot tools; concerns about usability, battery drain, and misclassifying real users.
- The idea of feeding scrapers bogus data is strongly criticized: it becomes dangerous whenever real human readers are misclassified as bots.
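The proof-of-work option raised in the mitigation debate can be sketched roughly as follows. This is a minimal hashcash-style illustration, not anything Project Gutenberg is known to run: the function names, the `challenge:nonce` string format, and the difficulty value are all assumptions made for the example. It shows why the approach is asymmetric (the server verifies with one hash, the client must search) and why battery drain comes up as a concern.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single SHA-256 call checks the client's answer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce; expected work is about 2**difficulty hashes."""
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    # Hypothetical challenge string; a real server would issue a fresh random one.
    nonce = solve("demo-challenge", difficulty=12)
    print("nonce:", nonce, "valid:", verify("demo-challenge", nonce, 12))
```

Each extra bit of difficulty doubles the client's expected work, so a setting that is negligible on a desktop can be noticeable on a phone or e-reader. That is the usability trade-off raised in the discussion.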
Licensing, Ecosystem, and E-Readers
- The PG license requires a 20% royalty on profits from commercial redistributions that retain the PG license text; with that text removed, the underlying public-domain text can be used freely.
- E-reader vendors rarely expose PG as a “store,” likely due to incentives to push paid content.
- Workarounds: built-in browsers, KOReader, Calibre, Standard Ebooks, LibriVox integrations, and various third-party apps.
Legal and Geographic Restrictions
- Past and present blocks of PG in Germany and Italy are discussed.
- The Italian block stems from a criminal piracy case in which PG domains were included; how it interacts with national copyright law (especially translators’ rights) is contentious and unresolved.
- Some argue for clearer legal status messages (e.g., HTTP 451) instead of generic 404s.
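To make the HTTP 451 suggestion concrete, here is a small sketch of how a server might choose between 404 and 451. The region code `"XX"`, the `BLOCKED_REGIONS` set, and the `status_for` helper are hypothetical and do not describe how PG or any ISP actually handles these blocks. Only the status code itself, "451 Unavailable For Legal Reasons" from RFC 7725, is standard.

```python
from http import HTTPStatus

# Hypothetical placeholder: region codes where a title is legally restricted.
BLOCKED_REGIONS = {"XX"}


def status_for(region: str, exists: bool) -> HTTPStatus:
    """Pick a response status that distinguishes 'missing' from 'legally blocked'."""
    if not exists:
        return HTTPStatus.NOT_FOUND  # 404: the resource really is absent
    if region.upper() in BLOCKED_REGIONS:
        return HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS  # 451: withheld for legal reasons
    return HTTPStatus.OK


if __name__ == "__main__":
    for region, exists in [("XX", True), ("US", True), ("XX", False)]:
        status = status_for(region, exists)
        print(region, exists, int(status), status.phrase)
```

Returning 451 tells the reader that the book exists but is being withheld, which a generic 404 hides.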
Community Sentiment
- Strong, repeated appreciation for PG as a long-lived, volunteer-driven public good.
- Users share personal stories of learning, accessibility, and lifelong reading enabled by PG.