Ministry of Justice orders deletion of the UK's largest court reporting database
Role and Value of Courtdesk
- The service provided near‑real‑time streams of court listings and events (a claimed ~12,000 updates per day) that could be filtered and searched.
- Commenters say the underlying data is technically public but effectively “hidden”: you must already know a case exists or navigate clunky systems (e.g. legacy Windows apps).
- Courtdesk’s aggregation was seen as crucial for:
  - Journalists to discover cases in time to attend.
  - Research and statistics on charging, sentencing, and “weekend” cases with no press presence.
- Several see shutting it down as materially reducing practical transparency, even if the “source of truth” remains elsewhere.
Government Rationale vs Company Rebuttal
- Official line: Courtdesk breached conditions by sharing sensitive personal data on ~700+ cases with an AI company, contrary to its agreement.
- Company response (as summarized in comments): they hired a specialist ML contractor under a sub‑processor agreement to build a “sandboxed” safety tool; there was no resale and no OpenAI-style ingestion, and money flowed from Courtdesk to the contractor.
- Dispute over whether this counts as “sharing with a third party” or normal outsourcing, and whether the government has mischaracterized events.
- Some note the issue was not referred to the data regulator, which they find suspicious.
Transparency, Politics, and “Cover‑Up” Claims
- A segment of commenters connects the deletion order to broader worries about:
  - Grooming gang scandals and alleged past cover‑ups.
  - Immigration and crime debates.
  - Upcoming or sensitive trials (including those involving senior politicians).
- Others push back, calling this opportunistic use of anti‑immigrant sentiment and stressing that similar child‑protection failures occurred irrespective of ethnicity.
- There is disagreement over whether this is bureaucratic risk‑aversion, contract enforcement, or an intentional attempt to reduce scrutiny of the justice system.
Public Records, Privacy, and AI
- Big split over principle:
  - One side: if it’s public record, it should be public cheaply, digitally, and in bulk; AI scraping is just a fact of life.
  - Other side: “publicly accessible” ≠ “free to mass‑harvest, republish, and monetize indefinitely,” especially for minors, acquitted defendants, and expunged cases.
- Fears that AI corpora will create “forever convictions” and make rehabilitation impossible; others argue that past crime is legitimately relevant information.
- Many suggest middle‑ground models:
  - Redacting PII in bulk datasets, but allowing detailed access under tighter controls.
  - Certificates or filtered checks (e.g. “fit to work with children/finance”) instead of raw criminal histories.
  - Maintaining friction (in‑person requests, rate limits, or logged access) to prevent industrial scraping while preserving open justice.
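
The “friction” option above (rate limits plus logged access) can be illustrated with a minimal sketch. Everything in it is hypothetical: the class names `TokenBucket` and `LoggedGate`, the requester IDs, and the case references are invented for illustration, and nothing in the thread describes how any real court system does this. The idea is that each requester gets a small burst allowance that refills slowly, and every attempt, allowed or refused, is written to an audit log.

```python
import time
from typing import Callable, Dict, List, Tuple


class TokenBucket:
    """Permits bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = capacity      # start with a full bucket
        self.last = clock()         # time of the last refill

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class LoggedGate:
    """Hypothetical access gate: one bucket per requester, every attempt logged."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}
        # Each entry: (timestamp, requester, case reference, allowed?)
        self.audit_log: List[Tuple[float, str, str, bool]] = []

    def request(self, requester: str, case_ref: str) -> bool:
        if requester not in self.buckets:
            self.buckets[requester] = TokenBucket(self.capacity, self.rate, self.clock)
        allowed = self.buckets[requester].allow()
        self.audit_log.append((self.clock(), requester, case_ref, allowed))
        return allowed
```

With a small burst size and a slow refill, a reporter looking up a handful of cases would rarely notice the limit, while a scraper issuing thousands of requests is throttled and leaves a trail in `audit_log`. Whether that trade-off is acceptable is exactly what the commenters disagree about.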
Technical and Structural Issues
- Recognition that ease of aggregation fundamentally changes the impact of “open” data; bots can do in hours what no human could in a lifetime.
- Debate over whether paywalls, rate limits, or robots.txt are legitimate tools to curb abuse or just pseudo‑openness.
- Some argue the government should run a modern, well‑documented API or at least a torrentable archive; others think restricting machine access is appropriate.
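
For context on the robots.txt point: the file is an advisory convention that well-behaved crawlers check voluntarily, and it does not technically block anything. The sketch below uses Python’s standard `urllib.robotparser` to show what such a check looks like. The policy text, the bot name `ExampleBot`, and the host `courts.example` are all made up for illustration and do not describe any real court service.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: everything is crawlable except the bulk export path,
# and crawlers are asked to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: *
Disallow: /listings/bulk/
Crawl-delay: 10
"""


def load_policy(text: str) -> RobotFileParser:
    """Parse robots.txt content supplied as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rp = load_policy(ROBOTS_TXT)

# A polite crawler asks before fetching; nothing forces it to ask or to obey.
bulk_ok = rp.can_fetch("ExampleBot", "https://courts.example/listings/bulk/2024.csv")
daily_ok = rp.can_fetch("ExampleBot", "https://courts.example/listings/today")
delay = rp.crawl_delay("ExampleBot")
```

Under this made-up policy, `bulk_ok` is `False`, `daily_ok` is `True`, and `delay` is `10`. The check runs entirely on the crawler’s side, which is why some commenters regard robots.txt as a courtesy signal rather than an access control.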
Legal/Contractual Framing and Next Steps
- Some frame this primarily as a straightforward breach‑of‑contract/data‑protection issue: the conditions explicitly restricted onward sharing and non‑journalist uses.
- Others think the punishment (full shutdown and deletion of the historical archive) is disproportionate and harms public oversight more than it protects data subjects.
- Hints that the Ministry intends a new licensing framework or replacement system, but commenters are skeptical it will match Courtdesk’s utility.
- A few propose offshore mirrors (e.g. US‑hosted servers or torrent archives) to place court data beyond the reach of UK government takedowns.