1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Risk concentrates at the data-ingestion boundary where untrusted workspace content flows into RAG context without sufficient isolation from the inference path.

Key Input Risks

Workspace messages from any member and uploaded documents serve as RAG context, creating an indirect prompt injection vector confirmed by the August 2024 PromptArmor disclosure [1][2]. The retrieval boundary spans all channels accessible to the querying user without per-message sanitization.

Key Execution Risks

Slack AI performs stateless LLM inference within a closed trust boundary with no shell, code sandbox, or browser execution surface [6]. No independent red-team verification of the isolation boundary has been published.

Key Action Risks

The agent sends messages and edits canvases without per-action operator approval on the default configuration for users with AI feature access [9]. Channel-level restriction requires Enterprise+ and explicit admin opt-in.

Key Output Risks

AI responses render clickable hyperlinks that enabled pre-patch data exfiltration through markdown link injection with encoded query-string parameters [1][5]. URL filtering now blocks known-bad patterns but the rendered-link channel persists architecturally.

Key Monitoring Risks

Enterprise Grid provides audit logs with SIEM forwarding and the Anomaly Event Response system for automated real-time interventions [11]. Lower-tier plans lack audit-log API access and DLP rule configuration is admin-initiated not default.

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Slack AI scores 1.59 reflecting constrained blast radius and moderate defense investment against a demonstrated input-processing attack surface.

AIRQ Metrics

AIRQ Score2.13

Blast Radius2.13

Attack Surface4.8

Defense Controls7

The agent lands in Tight Operators where limited autonomous capabilities and absent code-execution surfaces keep blast radius below the threshold despite moderate attack-surface exposure; X = 4.80 places this agent borderline to Tight Operators.

Four metrics summarize the risk-adjusted capability profile for this agent on its documented default configuration.

Metric	Score	Comments
AIRQ Score	2.13	Low overall risk-adjusted capability reflecting constrained blast radius against moderate attack surface.
Blast Radius	2.13 / 10	Limited damage potential constrained by absent code execution and deployment capabilities.
Attack Surface	4.8 / 10	Moderate exposure driven by demonstrated indirect prompt injection via workspace message ingestion.
Defense Controls	7 / 15	Partial defense coverage with execution isolation at moderate strength but input guardrails historically bypassed.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Attack surface concentrates at the data-ingestion boundary where workspace messages and uploaded documents flow into RAG context without per-message isolation from the inference path.

Attack Surface Metrics

User Input3

Tool Execution1

External Data4

Orchestration1

Memory1

Inter-Agent1

Reasoning2

Output Processing3

Planning1

Configuration1

Scores reflect the documented default configuration with penalties applied only where agent-specific exploitation was demonstrated.

Ten canonical surfaces scored from the vendor documentation and the PromptArmor disclosure evidence base.

Surface	Score	Comments
User Input	3 / 4	Attacker-crafted messages in accessible channels were followed as instructions by Slack AI when answering victim queries [1][2].
External Data	4 / 4	Public channel messages, uploaded PDFs, and connected third-party data all serve as RAG context with the document ingestion expansion increasing surface days before exploitation [1][4].
Memory	1 / 4	Stateless RAG architecture with no cross-session persistence; each AI request is independent per vendor documentation [6].
Reasoning	2 / 4	Third-party LLMs process queries inside a sealed network boundary with no chain-of-thought exposure documented [3][6][8].
Planning	1 / 4	No autonomous planning capability; the agent responds to individual queries without multi-step goal decomposition [6].
Tool Execution	1 / 4	No shell access, code execution sandbox, or browser automation; operates exclusively through Slack platform API calls [6].
Orchestration	1 / 4	Single-agent architecture with no subagent delegation or background autonomous tasks per vendor documentation [8].
Inter-Agent	1 / 4	MCP server exposes workspace data to external AI assistants but requires OAuth authentication and admin approval [8].
Output Processing	3 / 4	AI responses rendered clickable hyperlinks exploited pre-patch for data exfiltration via markdown link injection with query-string encoding [1][10].
Configuration	1 / 4	Admin-managed feature toggles with per-channel restrictions; no user-accessible configuration surface altering the security boundary [9].

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. All three conditions are triggered because workspace messages carry attacker-controlled bytes into RAG context that processes private data and renders clickable egress links.

Lethal Trifecta · Complete (3 of 3)

Slack AI exhibits all three of these conditions in its documented default configuration:

Untrusted input — Any workspace member can post messages retrieved as RAG context for other users' queries without sanitization [2].
Sensitive data — Slack AI reads private channel messages, DMs, uploaded files, and connected third-party data containing credentials and proprietary content [1].
External egress — Rendered hyperlinks in AI responses provided the exfiltration channel exploited pre-patch; URL filtering mitigates but does not eliminate the path [1].

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. Blast radius remains low because Slack AI lacks code execution, file system access, and deployment capabilities, confining damage to data exposure within the workspace boundary.

Blast Radius Metrics

Code execution0

Credential access2

File system access0

Autonomous action1

Network access2

Deployment access0

Scores reflect the maximum damage achievable through the documented default capabilities without requiring additional privilege escalation.

Six canonical factors scored against the agent's documented operational capabilities and demonstrated exploitation outcomes.

Factor	Score	Comments
Code execution	0 / 4	No code execution capability exists in any configuration; the agent operates entirely through Slack platform API calls [6].
File system access	0 / 4	No file system access exists; the agent cannot read from or write to any local or remote file system outside the platform data model [6].
Network access	2 / 4	User clicks on AI-generated links enable data exfiltration to external servers as demonstrated pre-patch; the AI processing itself makes no outbound requests [6].
Credential access	2 / 4	Workspace content accessed through RAG may contain API keys or credentials shared in messages but no direct credential-store or vault access exists [2].
Autonomous action	1 / 4	Limited to sending messages and editing canvases within Slack; no autonomous actions fire without prior user query or workflow trigger [9].
Deployment access	0 / 4	No deployment capability exists; the agent cannot modify infrastructure or trigger CI/CD pipelines [8].

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Defense controls provide moderate structural protection through execution isolation and access controls but input and output guardrails carry demonstrated bypass history reducing confidence.

Defense Controls Metrics

Input Guardrails1

Execution Isolation2

Action Controls2

Output Guardrails1

Monitoring1

Scores reflect vendor-documented controls on the default configuration with confidence reduced where independent verification is absent.

Five canonical defense components scored against documented controls, compliance certifications [7], and known bypass history.

Component	Score	Comments
Input Guardrails	1 / 3	Content thresholds and context engineering documented but bypassed by PromptArmor pre-patch; post-patch effectiveness not independently verified [5].
Execution Isolation	2 / 3	All AI processing within Slack's secure cloud infrastructure with LLMs hosted inside the trust boundary without outbound network access [6].
Action Controls	2 / 3	Admin controls restrict AI features per-user-group and per-channel on Enterprise plans with role-based access enforcement [9].
Output Guardrails	1 / 3	URL filtering deployed post-patch to block phishing URLs in AI responses and native DLP scans AI-generated messages for configured patterns [1][5].
Monitoring	1 / 3	Audit Logs API on Enterprise Grid with SIEM integration and Anomaly Event Response for automated interventions [11].

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators can reduce residual risk by restricting the RAG retrieval surface and configuring active monitoring on Enterprise plans.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Input Guardrails

Policy Establish a workspace policy requiring security-sensitive information to be shared only in channels excluded from AI access to reduce the indirect prompt injection retrieval surface.
Configuration Restrict AI access to channels containing credentials or proprietary data using Enterprise+ per-channel AI restrictions to limit the RAG context boundary.
Engineering Deploy third-party prompt injection detection tooling at the workspace integration layer to supplement native content thresholds with additional pattern matching.

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Execution Isolation

Policy Maintain an organizational policy prohibiting custom model endpoints or third-party LLM integrations to preserve the closed trust boundary architecture.
Configuration Review and restrict MCP server OAuth application approvals to ensure only vetted external AI assistants receive workspace data access.
Engineering Implement OAuth scope restrictions on MCP server application registrations to enforce least-privilege data access for approved external AI integrations.

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Action Controls

Policy Require approval workflows for AI-driven canvas creation in channels connected to external services or automation triggers to prevent unintended downstream actions.
Configuration Configure per-channel AI restrictions on Enterprise+ to prevent AI-generated content in channels where automated message sending could trigger downstream integrations.
Engineering Implement Workflow Builder guardrails that require human approval steps before AI-generated outputs feed into automated workflow sequences with external side effects.

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Output Guardrails

Policy Train users to verify AI-generated links before clicking, particularly when responses reference private channel content or display unfamiliar URL patterns.
Configuration Enable and configure DLP scanning rules to detect potential data exfiltration patterns in AI-generated messages including encoded content in URL parameters.
Engineering Implement workspace-level URL parameter stripping on AI-generated links to remove query-string data that could encode exfiltrated content before rendering to users.

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Monitoring

Policy Establish weekly review of DLP violation summaries filtered for URL-encoded content patterns in AI-generated messages to detect query-string exfiltration attempts.
Configuration Integrate Slack Audit Logs API with organizational SIEM to correlate AI feature usage patterns with anomaly detection baselines for the workspace.
Engineering Configure Anomaly Event Response alerts on Enterprise Grid to trigger on unusual AI query patterns that may indicate prompt injection reconnaissance or data harvesting.

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

Slack Security Update (August 2024) Vendor acknowledgment and patch for the indirect prompt injection vulnerability reported by PromptArmor deployed August 20 2024

Selected Research

Data Exfiltration from Slack AI via Indirect Prompt Injection PromptArmor demonstrated that attacker-authored messages in public channels could cause Slack AI to exfiltrate private channel data through rendered markdown links
Architectural Implications of Slack AI Prompt Injection Simon Willison analyzed the architectural implications of the PromptArmor disclosure for RAG systems processing untrusted workspace content
Slack Patches AI Bug That Exposed Private Channels Dark Reading reported the patch deployment and noted the expanded attack surface from the August 14 document ingestion change

Vendor Documentation

Security for AI features in Slack Vendor help article documenting Slack AI Guardrails including content thresholds and context engineering and URL filtering and output validation
How we built Slack AI to be secure and private Engineering blog describing RAG architecture and off-the-shelf LLM hosting within trust boundary and stateless processing
Slack Trust Center Security Compliance certifications grounding the defense-controls confidence markers including ISO 27001 and ISO 42001 and SOC 2 Type II and FedRAMP Moderate
Securing the Agentic Enterprise Blog post describing trust boundary enforcement and zero training guarantee and permission inheritance and stateless processing and MCP server security model
Manage access to AI features in Slack Help article documenting admin controls for per-user and per-group and per-role AI feature access management across plan tiers

Other Sources

Slack Patches Prompt Injection Flaw in AI Tool Set BankInfoSecurity coverage documenting that Slack initially characterized the PromptArmor finding as intended behavior before deploying the patch
Building Slack Anomaly Event Response Engineering blog describing the Anomaly Event Response system launched February 2025 for automated real-time security interventions on Enterprise Grid