We hacked Gemini's Python sandbox and leaked its source code (at least some)

Scope of the “Hack” and Title Controversy

  • Many commenters argue the title (“hacked Gemini and leaked its source”) is misleading or clickbait.
  • They stress this was about the Python sandbox infrastructure, not the Gemini model or its training data.
  • Some say that running the strings utility on a binary and exploring a container is routine reverse engineering, not a major “hack.”
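The “routine” technique commenters refer to can be sketched in a few lines: the classic strings utility just scans a binary for runs of printable bytes. A minimal illustration (the embedded file path and function name below are invented, not from the actual leak):

```python
import re

def extract_strings(data: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable ASCII of at least min_len bytes,
    roughly what the classic `strings` utility prints."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# A toy "binary": non-printable bytes surrounding two embedded strings,
# similar in spirit to the proto paths spotted in the sandbox binary.
blob = b"\x00\x01\xffgoogle/internal/example.proto\x7f\x02set_timer\x00"
print(extract_strings(blob))
```

Anything a build step bakes into the binary as literal text, such as proto file paths or message names, falls out of a scan like this, which is why commenters consider it routine rather than an exploit.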

What Was Actually Exposed

  • The main “leak” was internal protobuf definitions bundled into the sandbox binary by an automated build step.
  • Debate on sensitivity:
    • Some say proto definitions are like a schema and not inherently secret, noting that similar files had already leaked years ago.
    • Others note these particular protos touch internal authn/authz and data-classification systems, so their structure could aid attackers or reveal architecture.
  • No model weights, training corpus, or broader internal systems were accessed.
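The sensitivity argument hinges on how much a bare schema reveals. Even with no data, field and message names expose how internal systems connect. A purely hypothetical illustration (every name below is invented, not from the leaked protos):

```python
# A schema with no payloads still maps out architecture: these invented
# message/field names alone would tell an attacker that authorization
# decisions consult a data-classification attribute.
schema = {
    "AuthzCheckRequest": ["principal", "resource", "data_classification"],
    "AuthzCheckResponse": ["allowed", "policy_id"],
}
for message, fields in schema.items():
    print(message, "->", ", ".join(fields))
```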

Sandbox Architecture and Creation

  • The sandbox runs in gVisor; a Google engineer confirms they use checkpoint/restore plus a copy‑on‑write (CoW) overlay filesystem for very fast startup.
  • Commenters compare this to alternative approaches (ZFS or LVM snapshots, unikernels), discussing copy‑on‑write performance and caching benefits.
  • The same engineer says the sandbox is general-purpose for running untrusted code (data analysis, extensions), not just a one-off feature.
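The copy-on-write idea behind such an overlay can be shown with a dict analogy: reads fall through to a shared, read-only base layer, while writes land only in a per-sandbox top layer. A conceptual sketch, not how gVisor actually implements it (all paths invented):

```python
from collections import ChainMap

# Shared, read-only base image, analogous to the snapshotted filesystem
# every sandbox instance starts from.
base_image = {"/usr/bin/python3": "interpreter", "/etc/hosts": "localhost"}

# ChainMap gives read-through/write-to-top semantics: lookups search the
# top dict first, then the base; assignments go only to the top dict.
overlay = ChainMap({}, base_image)

overlay["/tmp/user_script.py"] = "print('hi')"  # write: top layer only
print(overlay["/etc/hosts"])                    # read: falls through to base
print(base_image)                               # base image is untouched
```

Because the base layer is never mutated, many sandbox instances can share one cached snapshot, which is the caching and startup-speed benefit the commenters compare against ZFS/LVM snapshots.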

Security Posture and Significance

  • Several people view this as a minor but valid issue that mainly exposes a gap in security review and build automation.
  • Others argue the incident shows Google’s overall robustness: the sandbox largely did what it should, and the work was done in collaboration with Google’s security team.

Prompt Injection and Agent Security

  • One subthread uses this as a springboard to discuss how local/agentic AIs will face prompt-injection risks when browsing the web.
  • Commenters compare this to humans catching “mind‑viruses” from internet content, worrying that future personal agents could be subverted the same way.

Gemini, Assistant, and Product Perception

  • Long side discussion about Gemini replacing Assistant:
    • Some users report Gemini can’t reliably set timers, play music, or integrate with device apps; others say it works fine for them.
    • Complaints about declining Google UX, “overhyped” AI, and underwhelming product execution despite strong research.
  • A Googler describes internal mood as a mix of frustration over slow launches, excitement about strong models, and indifference from those who see LLMs as overhyped.
  • Several commenters claim Gemini models (e.g., Flash, 2.5 Pro, Gemma) are highly capable and cost-effective for developers, despite weaker consumer perception.

Documentation, Transparency, and Developer Experience

  • Parallel is drawn to scraping ChatGPT Code Interpreter’s environment to discover available packages; people lament that such basic capability lists aren’t officially documented.
  • One Googler says they’ll raise the idea internally, reinforcing that missing documentation is more likely neglect than deliberate secrecy.
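The scraping approach people used for Code Interpreter amounts to asking the interpreter itself what it can import. A minimal sketch of how such an undocumented capability list gets reconstructed from inside a sandbox:

```python
import pkgutil

# Enumerate the top-level modules visible on sys.path; this is the kind
# of one-liner people run inside a sandbox to recover the package list
# that official docs do not publish.
available = sorted({m.name for m in pkgutil.iter_modules()})
print(len(available), "importable top-level modules")
print([name for name in available if name in ("json", "numpy", "pandas")])
```

This only shows what is importable, not version pins or resource limits, which is part of why commenters want an official, maintained list instead.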