Show HN: Kage – Shadow any website to a single binary for offline viewing

Purpose and core idea

  • Kage mirrors entire websites (not just single pages) into bundles that can be browsed offline.
  • Captures the rendered DOM after JS execution, strips scripts, and rewrites resources so snapshots work without network access.
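The strip-and-rewrite step described above can be illustrated with a minimal sketch. This is not Kage's code: it assumes the rendered HTML has already been captured as a string, and `SnapshotCleaner` and `clean_snapshot` are hypothetical names. Same-site `src`/`href` values become root-relative paths, which assumes the snapshot is later served from the root of a local server.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SnapshotCleaner(HTMLParser):
    """Drop <script> elements and point same-site URLs at local paths."""

    def __init__(self, page_url):
        # convert_charrefs=False keeps entities in text as-is (&amp; stays &amp;).
        super().__init__(convert_charrefs=False)
        self.page_url = page_url
        self.site = urlparse(page_url).netloc
        self.out = []
        self.in_script = False

    def localize(self, value):
        if value.startswith("#"):
            return value  # in-page anchor
        target = urlparse(urljoin(self.page_url, value))
        if target.netloc != self.site:
            return value  # external, mailto:, data: and similar are left alone
        # Simplification: query strings are dropped.
        fragment = "#" + target.fragment if target.fragment else ""
        return (target.path or "/") + fragment

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            return
        parts = [tag]
        for name, value in attrs:
            if value is None:  # boolean attribute such as "hidden"
                parts.append(name)
                continue
            if name in ("src", "href"):
                value = self.localize(value)
            # The parser unescapes attribute values, so escape them again.
            parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        elif tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def clean_snapshot(html, page_url):
    """Return html without scripts and with same-site URLs made local."""
    cleaner = SnapshotCleaner(page_url)
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)
```

A real crawler would also have to handle inline event handlers (`onclick` and friends), `srcset`, CSS `url(...)` references, and nested page paths; the sketch leaves all of those untouched.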

Comparisons to existing tools

  • Seen as a modern alternative to HTTrack/wget for JS-heavy sites (e.g., SPAs, Next.js).
  • Compared frequently with SingleFile/SingleFileZ and archiveweb.page:
    • SingleFile is praised for high-fidelity single-page saves and base64 bundling.
    • SingleFile already supports scoped recursive crawls; some suggest borrowing its techniques for the harder cases (shadow DOM, iframes, WebSockets, deduplication).
  • Also mentioned alongside grab-site, browsertrix, Kiwix/ZIM, gwtar, mhtml, Teleport Pro, curl, and “Print to PDF.”
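For readers unfamiliar with base64 bundling, the general idea is to replace a resource URL with a `data:` URI that carries the file's bytes inline, so the page needs no separate files. The sketch below shows only that generic technique, not SingleFile's actual implementation; `to_data_uri` is a hypothetical helper name.

```python
import base64
import mimetypes


def to_data_uri(filename, payload):
    """Encode raw bytes as a data: URI, guessing the MIME type from the name."""
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"  # fallback for unknown types
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be placed directly in an `src` or `href` attribute. Base64 inflates the payload by roughly a third, which is one reason deduplication comes up in the thread.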

Output formats & distribution

  • Currently outputs:
    • Static HTML folder.
    • ZIM files (for Kiwix readers on multiple platforms).
    • Self-contained executables that embed a webview and archive.
  • Suggestions:
    • Single self-contained HTML with client-side routing.
    • Self-extracting archive that opens in the user’s browser (CHM-like).
    • Markdown folder for use with tools like Obsidian/Logseq; EPUB export.

Use cases

  • Offline reading of blogs, docs, company wikis, Confluence, and dev docs (e.g., Apple docs, Snowflake).
  • Travel/offline scenarios (flights, trains, locations without connectivity).
  • Long-term personal archives and research access.

Technical behavior & limitations

  • Uses Chrome/Chromium; Docker recommended for easier setup.
  • Current gaps and requests: throttling to reduce load on the target site, excluding images/videos, partial crawls, cookie/auth support, and handling of paywalls and embedded videos.
  • Serving the snapshot from a local HTTP server avoids file:// CORS/origin quirks; commenters disagree on how restrictive browsers actually are for file:// pages.
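The local-server approach needs very little code. A minimal sketch using only Python's standard library, assuming the snapshot is a plain folder of static files; `serve_snapshot` is a hypothetical helper and is unrelated to how Kage itself serves archives:

```python
import functools
import http.server
import threading


def serve_snapshot(directory, port=0):
    """Serve a static snapshot folder on 127.0.0.1 in a background thread.

    port=0 lets the OS pick a free port; read it from server.server_address.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to loopback only so the archive is not exposed to the network.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `server = serve_snapshot("./snapshot")` and opening `http://127.0.0.1:<port>/`, with the port taken from `server.server_address[1]`, gives pages a normal http:// origin; `server.shutdown()` stops it. The folder name here is only an example.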

Security & trust concerns

  • Concern raised about running Chrome with --no-sandbox in Docker; the flag was later restricted to Docker runs via env/flags.
  • Some distrust archives packaged as binaries and prefer HTML/ZIM for sharing.
  • Antivirus false positive reported for a related binary.
  • A few readers strongly dislike the README style, calling it “LLM slop,” and question whether code quality matches.

Archival ecosystem & long-term concerns

  • Discussion of WARC vs mitmproxy captures, HTTP/2/WebSockets, and potential new archival formats.
  • Interest in compatibility with established formats (WARC, ZIM), and questions about whether “keep it for a decade” holds up over time (left unresolved).