1 Key Risks
The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Risk concentrates at the data-ingestion boundary where untrusted workspace content flows into RAG context without sufficient isolation from the inference path.
2 AIRQ Scores
The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Slack AI scores 1.59, reflecting a constrained blast radius and moderate defense investment against a demonstrated input-processing attack surface.
The agent lands in Tight Operators: limited autonomous capabilities and the absence of code-execution surfaces keep blast radius below the threshold despite moderate attack-surface exposure. At X = 4.80, the agent sits near the boundary of Tight Operators.
Four metrics summarize the risk-adjusted capability profile for this agent on its documented default configuration.
| Metric | Score | Comments |
|---|---|---|
| AIRQ Score | 2.13 | Low overall risk-adjusted capability reflecting constrained blast radius against moderate attack surface. |
| Blast Radius | 2.13 / 10 | Limited damage potential constrained by absent code execution and deployment capabilities. |
| Attack Surface | 4.8 / 10 | Moderate exposure driven by demonstrated indirect prompt injection via workspace message ingestion. |
| Defense Controls | 7 / 15 | Partial defense coverage with execution isolation at moderate strength but input guardrails historically bypassed. |
3 Attack Surface
Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Attack surface concentrates at the data-ingestion boundary where workspace messages and uploaded documents flow into RAG context without per-message isolation from the inference path.
Scores reflect the documented default configuration with penalties applied only where agent-specific exploitation was demonstrated.
Ten canonical surfaces scored from the vendor documentation and the PromptArmor disclosure evidence base.
| Surface | Score | Comments |
|---|---|---|
| User Input | 3 / 4 | Attacker-crafted messages in accessible channels were followed as instructions by Slack AI when answering victim queries [1][2]. |
| External Data | 4 / 4 | Public channel messages, uploaded PDFs, and connected third-party data all serve as RAG context; the document-ingestion expansion widened this surface days before exploitation [1][4]. |
| Memory | 1 / 4 | Stateless RAG architecture with no cross-session persistence; each AI request is independent per vendor documentation [6]. |
| Reasoning | 2 / 4 | Third-party LLMs process queries inside a sealed network boundary with no chain-of-thought exposure documented [3][6][8]. |
| Planning | 1 / 4 | No autonomous planning capability; the agent responds to individual queries without multi-step goal decomposition [6]. |
| Tool Execution | 1 / 4 | No shell access, code execution sandbox, or browser automation; operates exclusively through Slack platform API calls [6]. |
| Orchestration | 1 / 4 | Single-agent architecture with no subagent delegation or background autonomous tasks per vendor documentation [8]. |
| Inter-Agent | 1 / 4 | MCP server exposes workspace data to external AI assistants but requires OAuth authentication and admin approval [8]. |
| Output Processing | 3 / 4 | AI responses rendered clickable hyperlinks, which were exploited pre-patch to exfiltrate data via markdown link injection with query-string encoding [1][10]. |
| Configuration | 1 / 4 | Admin-managed feature toggles with per-channel restrictions; no user-accessible configuration surface altering the security boundary [9]. |
The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session. Together, these three conditions turn an isolated prompt injection into full-chain exfiltration. All three are met here because workspace messages carry attacker-controlled bytes into the RAG context, the same context draws on private data, and responses render clickable egress links.
Slack AI exhibits all three of these conditions in its documented default configuration:
- Untrusted input — Any workspace member can post messages retrieved as RAG context for other users' queries without sanitization [2].
- Sensitive data — Slack AI reads private channel messages, DMs, uploaded files, and connected third-party data containing credentials and proprietary content [1].
- External egress — Rendered hyperlinks in AI responses provided the exfiltration channel exploited pre-patch; URL filtering mitigates but does not eliminate the path [1].
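The three-condition test above can be expressed as a small check. The following is a minimal sketch, not part of the scoring methodology: the `SessionProfile` type and its field names are hypothetical and simply mirror the three conditions listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionProfile:
    """Hypothetical summary of what an agent does within one session."""
    processes_untrusted_content: bool
    accesses_private_data: bool
    communicates_externally: bool


def missing_conditions(profile: SessionProfile) -> list[str]:
    """Return the names of the trifecta conditions that are NOT met."""
    conditions = {
        "untrusted input": profile.processes_untrusted_content,
        "sensitive data": profile.accesses_private_data,
        "external egress": profile.communicates_externally,
    }
    return [name for name, met in conditions.items() if not met]


def lethal_trifecta_triggered(profile: SessionProfile) -> bool:
    """The trifecta is triggered only when all three conditions hold."""
    return not missing_conditions(profile)


# Default configuration as described in this profile: all three hold.
default_config = SessionProfile(True, True, True)
```

Removing any one condition, for example by eliminating the egress path, breaks the chain, which is why the hardening tips later in the profile target each condition separately.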
4 Blast Radius
The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. Blast radius remains low because Slack AI lacks code execution, file system access, and deployment capabilities, confining damage to data exposure within the workspace boundary.
Scores reflect the maximum damage achievable through the documented default capabilities without requiring additional privilege escalation.
Six canonical factors scored against the agent's documented operational capabilities and demonstrated exploitation outcomes.
| Factor | Score | Comments |
|---|---|---|
| Code execution | 0 / 4 | No code execution capability exists in any configuration; the agent operates entirely through Slack platform API calls [6]. |
| File system access | 0 / 4 | No file system access exists; the agent cannot read from or write to any local or remote file system outside the platform data model [6]. |
| Network access | 2 / 4 | User clicks on AI-generated links enable data exfiltration to external servers as demonstrated pre-patch; the AI processing itself makes no outbound requests [6]. |
| Credential access | 2 / 4 | Workspace content accessed through RAG may contain API keys or credentials shared in messages but no direct credential-store or vault access exists [2]. |
| Autonomous action | 1 / 4 | Limited to sending messages and editing canvases within Slack; no autonomous actions fire without prior user query or workflow trigger [9]. |
| Deployment access | 0 / 4 | No deployment capability exists; the agent cannot modify infrastructure or trigger CI/CD pipelines [8]. |
5 Defense Controls
Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Execution isolation and access controls provide moderate structural protection, but the input and output guardrails have a demonstrated bypass history, which reduces confidence in them.
Scores reflect vendor-documented controls on the default configuration with confidence reduced where independent verification is absent.
Five canonical defense components scored against documented controls, compliance certifications [7], and known bypass history.
| Component | Score | Comments |
|---|---|---|
| Input Guardrails | 1 / 3 | Content thresholds and context engineering documented but bypassed by PromptArmor pre-patch; post-patch effectiveness not independently verified [5]. |
| Execution Isolation | 2 / 3 | All AI processing within Slack's secure cloud infrastructure with LLMs hosted inside the trust boundary without outbound network access [6]. |
| Action Controls | 2 / 3 | Admin controls restrict AI features per-user-group and per-channel on Enterprise plans with role-based access enforcement [9]. |
| Output Guardrails | 1 / 3 | URL filtering was deployed post-patch to block phishing URLs in AI responses, and native DLP scans AI-generated messages for configured patterns [1][5]. |
| Monitoring | 1 / 3 | Audit Logs API on Enterprise Grid with SIEM integration and Anomaly Event Response for automated interventions [11]. |
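As a quick arithmetic check, the five component scores in the table can be totalled against the Defense Controls figure of 7 / 15 reported in the AIRQ Scores section. The sketch below assumes the total is an unweighted sum; that is consistent with the reported numbers but is not stated in the profile, and the function name is illustrative.

```python
# Component scores copied from the Defense Controls table.
DEFENSE_COMPONENTS = {
    "Input Guardrails": 1,
    "Execution Isolation": 2,
    "Action Controls": 2,
    "Output Guardrails": 1,
    "Monitoring": 1,
}
MAX_PER_COMPONENT = 3  # each component is scored out of 3


def defense_total(scores: dict[str, int]) -> tuple[int, int]:
    """Return (achieved, maximum), assuming a plain unweighted sum."""
    return sum(scores.values()), MAX_PER_COMPONENT * len(scores)


achieved, maximum = defense_total(DEFENSE_COMPONENTS)
print(f"Defense Controls: {achieved} / {maximum}")
```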
6 Hardening Tips
Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators can reduce residual risk by restricting the RAG retrieval surface and configuring active monitoring on Enterprise plans.
Input Guardrails
Input guardrails intercept adversarial content before it reaches the reasoning loop.
- Policy: Establish a workspace policy requiring security-sensitive information to be shared only in channels excluded from AI access to reduce the indirect prompt injection retrieval surface.
- Configuration: Restrict AI access to channels containing credentials or proprietary data using Enterprise+ per-channel AI restrictions to limit the RAG context boundary.
- Engineering: Deploy third-party prompt injection detection tooling at the workspace integration layer to supplement native content thresholds with additional pattern matching.
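To illustrate what "additional pattern matching" might look like, here is a deliberately naive sketch. It is not the detection logic of any real product: the pattern names and regular expressions are assumptions chosen for illustration, and a determined attacker can evade simple rules like these.

```python
import re

# Hypothetical patterns; real tooling would use far richer signals.
INJECTION_PATTERNS = {
    # Phrases that try to override earlier instructions.
    "instruction_override": re.compile(
        r"\bignore\s+(?:all\s+|any\s+)?(?:previous|prior|above)\s+instructions\b",
        re.IGNORECASE,
    ),
    # Markdown links whose URL carries a query string (a possible exfil channel).
    "markdown_link_with_query": re.compile(
        r"\[[^\]]*\]\(https?://[^)\s]+\?[^)\s]+\)"
    ),
}


def flag_message(text: str) -> list[str]:
    """Return the sorted names of every pattern that matches the message."""
    return sorted(
        name for name, pattern in INJECTION_PATTERNS.items() if pattern.search(text)
    )
```

A flagged message would typically be routed to review rather than blocked outright, since patterns this simple will also match some benign content.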
Execution Isolation
Execution isolation contains what a compromised agent can do on the host.
- Policy: Maintain an organizational policy prohibiting custom model endpoints or third-party LLM integrations to preserve the closed trust boundary architecture.
- Configuration: Review and restrict MCP server OAuth application approvals to ensure only vetted external AI assistants receive workspace data access.
- Engineering: Implement OAuth scope restrictions on MCP server application registrations to enforce least-privilege data access for approved external AI integrations.
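A least-privilege review of an application registration can be reduced to a set difference between requested and approved scopes. The sketch below assumes the operator maintains an allowlist; the allowlist contents and the function name are hypothetical, and the scope strings are illustrative rather than a recommendation.

```python
# Hypothetical allowlist maintained by the workspace operator.
APPROVED_SCOPES = frozenset({"channels:read", "users:read"})


def excess_scopes(requested, approved=APPROVED_SCOPES) -> list[str]:
    """Return requested scopes that fall outside the approved allowlist."""
    return sorted(set(requested) - set(approved))


def is_least_privilege(requested, approved=APPROVED_SCOPES) -> bool:
    """True when the registration asks for nothing beyond the allowlist."""
    return not excess_scopes(requested, approved)
```

An approval workflow could reject or escalate any registration for which `excess_scopes` returns a non-empty list.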
Action Controls
Action controls govern which tools and actions the agent can invoke autonomously.
- Policy: Require approval workflows for AI-driven canvas creation in channels connected to external services or automation triggers to prevent unintended downstream actions.
- Configuration: Configure per-channel AI restrictions on Enterprise+ to prevent AI-generated content in channels where automated message sending could trigger downstream integrations.
- Engineering: Implement Workflow Builder guardrails that require human approval steps before AI-generated outputs feed into automated workflow sequences with external side effects.
Output Guardrails
Output guardrails inspect what the agent sends to other systems and users.
- Policy: Train users to verify AI-generated links before clicking, particularly when responses reference private channel content or display unfamiliar URL patterns.
- Configuration: Enable and configure DLP scanning rules to detect potential data exfiltration patterns in AI-generated messages, including encoded content in URL parameters.
- Engineering: Implement workspace-level URL parameter stripping on AI-generated links to remove query-string data that could encode exfiltrated content before rendering to users.
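The URL parameter stripping described in the Engineering tip can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: it handles only inline markdown links of the form `[label](url)`, the function name is hypothetical, and where such a filter would actually be hooked into the rendering path is left open.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches inline markdown links: [label](http(s)://...)
MARKDOWN_LINK = re.compile(r"\[([^\]]*)\]\((https?://[^)\s]+)\)")


def strip_link_parameters(markdown: str) -> str:
    """Drop the query string and fragment from every inline markdown link."""

    def _clean(match: re.Match) -> str:
        label, url = match.group(1), match.group(2)
        parts = urlsplit(url)
        # Keep scheme, host, and path; discard query and fragment.
        cleaned = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        return f"[{label}]({cleaned})"

    return MARKDOWN_LINK.sub(_clean, markdown)
```

Stripping the query string narrows the exfiltration channel but does not close it, because data can also be encoded in the path or hostname; it complements rather than replaces URL filtering.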
Monitoring
Monitoring captures what the agent did and surfaces anomalies for review.
- Policy: Establish weekly review of DLP violation summaries filtered for URL-encoded content patterns in AI-generated messages to detect query-string exfiltration attempts.
- Configuration: Integrate Slack Audit Logs API with organizational SIEM to correlate AI feature usage patterns with anomaly detection baselines for the workspace.
- Engineering: Configure Anomaly Event Response alerts on Enterprise Grid to trigger on unusual AI query patterns that may indicate prompt injection reconnaissance or data harvesting.
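For the review of URL-encoded content patterns, a simple heuristic is to flag links whose query parameters carry unusually long values. The sketch below is an assumption-laden illustration: the threshold of 32 characters and the function name are hypothetical, and it operates on message text the operator has already exported rather than on any specific Slack API.

```python
import re
from urllib.parse import parse_qsl, urlsplit

URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")
MAX_PARAM_LENGTH = 32  # hypothetical threshold; tune against real traffic


def suspicious_links(message: str, max_len: int = MAX_PARAM_LENGTH) -> list[str]:
    """Return URLs in the message with any query value longer than max_len."""
    flagged = []
    for url in URL_PATTERN.findall(message):
        params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
        if any(len(value) > max_len for _, value in params):
            flagged.append(url)
    return flagged
```

Flagged URLs could be fed into the weekly review or forwarded to the SIEM as a low-severity signal for correlation with other events.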
7 References
The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [1][4]) refer to entries below, listed in citation order.
Selected Vulnerabilities
- Slack Security Update (August 2024): Vendor acknowledgment and patch for the indirect prompt injection vulnerability reported by PromptArmor, deployed August 20, 2024.
Selected Research
- Data Exfiltration from Slack AI via Indirect Prompt Injection: PromptArmor demonstrated that attacker-authored messages in public channels could cause Slack AI to exfiltrate private channel data through rendered markdown links.
- Architectural Implications of Slack AI Prompt Injection: Simon Willison analyzed the architectural implications of the PromptArmor disclosure for RAG systems processing untrusted workspace content.
- Slack Patches AI Bug That Exposed Private Channels: Dark Reading reported the patch deployment and noted the expanded attack surface from the August 14 document ingestion change.
Vendor Documentation
- Security for AI features in Slack: Vendor help article documenting Slack AI Guardrails, including content thresholds, context engineering, URL filtering, and output validation.
- How we built Slack AI to be secure and private: Engineering blog describing the RAG architecture, off-the-shelf LLM hosting within the trust boundary, and stateless processing.
- Slack Trust Center Security: Compliance certifications grounding the defense-controls confidence markers, including ISO 27001, ISO 42001, SOC 2 Type II, and FedRAMP Moderate.
- Securing the Agentic Enterprise: Blog post describing trust boundary enforcement, the zero training guarantee, permission inheritance, stateless processing, and the MCP server security model.
- Manage access to AI features in Slack: Help article documenting admin controls for per-user, per-group, and per-role AI feature access management across plan tiers.
Other Sources
- Slack Patches Prompt Injection Flaw in AI Tool Set: BankInfoSecurity coverage documenting that Slack initially characterized the PromptArmor finding as intended behavior before deploying the patch.
- Building Slack Anomaly Event Response: Engineering blog describing the Anomaly Event Response system, launched February 2025, for automated real-time security interventions on Enterprise Grid.