Anyone can access deleted and private repository data on GitHub
Scope of the Behavior
- Not new: multiple commenters say they reported or noticed this behavior years ago; GitHub classified it as “known, low risk” and documented it later.
- Affects GitHub fork networks: objects (commits/blobs) are shared across forks; refs are per-fork.
- Key edge cases discussed:
  - Deleted forks of public repos: commits remain reachable via the upstream by hash.
  - Private forks of a private repo whose upstream later becomes public: commits made in the forks before that point can be accessed via the now‑public upstream, even if the forks stay private or are deleted.
  - Purely private repos and forks that never become public appear unaffected.
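The shared-objects, per-fork-refs split described above can be illustrated with a toy model. This is a sketch for intuition only, not GitHub's implementation: the `ForkNetwork` class, its method names, and the visibility check are all hypothetical, and the hashing is simplified.

```python
import hashlib


class ForkNetwork:
    """Toy model (hypothetical): one shared object store, one ref table per fork."""

    def __init__(self):
        self.objects = {}     # object id -> content, shared by every fork
        self.refs = {}        # fork name -> {branch name -> object id}
        self.visibility = {}  # fork name -> "public" or "private"

    def add_fork(self, name, visibility):
        self.refs[name] = {}
        self.visibility[name] = visibility

    def commit(self, fork, branch, content):
        # Simplified content addressing: the id is a hash of the content.
        oid = hashlib.sha1(content.encode()).hexdigest()
        self.objects[oid] = content      # stored once, for the whole network
        self.refs[fork][branch] = oid    # only this fork's ref points at it
        return oid

    def delete_fork(self, name):
        # Deleting a fork drops its refs; the shared objects are left alone
        # (this model never garbage-collects).
        del self.refs[name]
        del self.visibility[name]

    def fetch_by_hash(self, via_fork, oid):
        # Access is checked against the repo being queried, not the repo
        # that originally created the object.
        if self.visibility.get(via_fork) != "public":
            raise PermissionError(f"{via_fork} is not public")
        return self.objects.get(oid)


if __name__ == "__main__":
    net = ForkNetwork()
    net.add_fork("upstream", "public")
    net.add_fork("alice/fork", "public")
    oid = net.commit("alice/fork", "main", "api_key=hunter2")
    net.delete_fork("alice/fork")
    # The fork is gone, but the object is still served via the upstream.
    print(net.fetch_by_hash("upstream", oid))
```

In this model, deleting a fork or keeping it private changes nothing about the object store, which is the pattern the thread's edge cases describe.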
Mechanics: Why Data Sticks Around
- Git’s content‑addressable storage and GitHub’s shared storage for forks mean objects are retained until garbage‑collected.
- Several users assert GitHub effectively never garbage‑collects unreachable commits within fork networks.
- Short SHA support (as few as 4 hex characters, i.e. only 65,536 possible prefixes) makes brute‑forcing commit IDs feasible.
- Public events archives (including third‑party GH event mirrors) leak commit hashes, enabling targeted retrieval.
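The two mechanics above, content addressing and short-prefix lookup, can be sketched in a few lines. The blob ID formula (SHA-1 over a `blob <size>\0` header plus the content) is standard Git behavior; the 4-character minimum is taken from the thread and is treated here as an assumption. The function names are illustrative.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content.

    Git hashes "<type> <size>\\0" followed by the body; commits and trees
    use the same scheme with a different type word.
    """
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def prefix_space(hex_chars: int) -> int:
    """Number of distinct hex prefixes of the given length."""
    return 16 ** hex_chars


if __name__ == "__main__":
    # Same content always yields the same ID, in any repository.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

    # Assumption from the thread: prefixes as short as 4 hex chars resolve.
    print(prefix_space(4))   # 65536 candidates to enumerate
    print(prefix_space(40))  # full SHA-1 space, infeasible to enumerate
```

Because an ID depends only on content, an object has the same name wherever it is stored, and a 4-character prefix space is small enough to enumerate exhaustively, whereas the full 40-character space is not.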
How Serious Is It?
- One camp: only the “private fork becomes de facto public when upstream is made public” case is a real vulnerability; anything that was ever public should be assumed permanently public anyway.
- Other camp: this is a major violation of the Principle of Least Astonishment; “private” and “delete” imply stronger guarantees than users actually get.
- Several note real‑world impact: leaked API keys, proprietary algorithms, console SDKs, and internal forks unknowingly exposed.
GitHub’s Handling and Bug Bounties
- Multiple reports through HackerOne were closed as “working as intended,” with no bounty paid.
- Debate over bug bounty ethics: companies don’t pay for known or architectural issues; researchers argue this discourages reporting systemic problems.
- Some see GitHub’s documentation as an insufficient answer to a UX problem; burying a surprising security property in help docs is viewed as user‑hostile.
Mitigations and Alternatives
- Practical advice from the thread:
  - Don’t use GitHub “private forks” for sensitive work; make a separate private repo (clone/template) instead.
  - Never open‑source an existing private repo; create a fresh public repo and copy selected code.
  - Always rotate secrets once committed, regardless of later deletion.
  - For stricter control, consider other hosts or self‑hosted Git (GitLab, Gitea, etc.), though similar patterns may exist elsewhere.
- Some note GitHub offers a manual path for full data removal via support for legal/privacy reasons, but this is not automatic.
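The "separate private repo instead of a fork" advice can be sketched as a bare clone followed by a mirror push into a new, empty repository created outside the fork network. The helper name, URLs, and working directory below are placeholders; `git clone --bare`, `git -C`, and `git push --mirror` are standard Git options. The sketch only builds the commands and does not run them.

```python
def mirror_commands(source_url, target_url, workdir="repo-mirror.git"):
    """Build the git commands that copy a repo into an independent one.

    Assumes target_url points at an empty repository that was created
    directly (not with the Fork button), so it joins no fork network.
    """
    return [
        # Bare clone: all refs and objects, no working tree.
        ["git", "clone", "--bare", source_url, workdir],
        # Push every ref to the new repository.
        ["git", "-C", workdir, "push", "--mirror", target_url],
    ]


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    cmds = mirror_commands(
        "https://example.com/upstream/project.git",
        "https://example.com/me/project-private.git",
    )
    for cmd in cmds:
        print(" ".join(cmd))
```

A mirror push carries the full history, so it suits the private-copy case; for open‑sourcing, the thread's advice is a fresh repo containing only the selected code rather than the existing history.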