How we’re responding to The NYT’s data demands in order to protect user privacy
Scope and purpose of the court order
- Many commenters see the order to preserve all ChatGPT logs (including deleted and “temporary” chats) as standard US evidence-preservation practice in a copyright case: NYT wants to quantify how often verbatim or near-verbatim NYT text is generated to calculate damages.
- Others argue this goes far beyond normal proportionality, sweeps in huge amounts of unrelated, highly personal data from uninvolved users, and sets a bad precedent for privacy-focused services.
Privacy, logging, and “legal hold”
- Strong skepticism that OpenAI meaningfully protects privacy: many users assume everything sent to a hosted API is logged indefinitely, regardless of marketing claims or in-app toggles.
- Several point out that a “legal hold” is only a preservation requirement: it compels OpenAI to retain the data, but does not itself bar OpenAI from accessing or using that data for other purposes unless separate policies or laws do.
- Some say data is a “toxic asset” and the only secure option is not retaining it at all; being forced to keep it inherently increases risk.
Zero Data Retention (ZDR) and product behavior
- Commenters note that Zero Data Retention APIs exist but are hard to actually obtain; access requests are allegedly ignored, leading to accusations that ZDR is more marketing than reality.
- OpenAI’s own post says ZDR API endpoints and Enterprise are excluded from the order, but people question why privacy is a paid/approved feature rather than a universal option.
- There is confusion and criticism around the mismatch between the in-app “Improve the model for everyone” toggle and the separate privacy portal, which some see as a dark pattern.
GDPR and non-US users
- Debate over whether complying with the US order violates GDPR:
  - Some say GDPR has allowances for court-ordered retention, and it is only a problem if data is kept beyond the life of the case.
  - Others cite GDPR limits on honoring third-country orders absent specific international agreements and argue an EU court might bar such retention for EU residents.
NYT vs OpenAI copyright dispute
- Several think NYT’s underlying claim is strong, pointing to examples where ChatGPT allegedly regurgitates NYT text and arguing per-infringement damages justify broad discovery.
- Others view OpenAI’s training as fair use and call NYT’s demand overbroad or abusive of US discovery rules.
- OpenAI’s public framing of the lawsuit as “baseless” and as a privacy attack is widely characterized as spin; critics say OpenAI’s own copyright decisions created this situation.
Government and surveillance concerns
- A long subthread debates whether US intelligence agencies likely access such data:
  - Some assert it is almost certainly tapped and easily searchable with modern methods.
  - Others call this unfalsifiable conspiracy thinking, noting legal and technical barriers, but concede that metadata alone is highly revealing.
Sensitivity of LLM chat histories
- Many emphasize that LLM conversations can be more revealing than browser history: people use them for emotional processing, relationship issues, work drafts, and “raw” inner thoughts, making the retention order feel especially invasive.