1 Key Risks
The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Notion AI presents demonstrated input injection and output exfiltration risks on its default configuration, with defense controls limited by progressive tool allowlisting and opt-in output filtering.
2 AIRQ Scores
The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Notion AI scores as a moderate-risk agent whose constrained blast radius offsets its demonstrated attack surface vulnerabilities.
Notion AI places in the Exposed Giants quadrant, with attack surface at 6.66 out of 10, blast radius at 2.88 out of 10, and defense controls at 6 out of 15. Operators deploying the agent on sensitive workspace content should prioritize closing the demonstrated exfiltration channels before expanding agent access scope.
Each axis measures a distinct risk dimension: attack surface out of 10, blast radius out of 10, defense controls out of 15, and AIRQ composite out of 15.
| Metric | Score | Comments |
|---|---|---|
| AIRQ Score | 2.64 / 15 | A low composite indicates that the operator's hardening priority is reducing demonstrated attack surface exposure rather than containing blast radius. |
| Blast Radius | 2.88 / 10 | No code execution or file system access; blast concentrates in autonomous actions and network egress via web search. |
| Attack Surface | 6.66 / 10 | Three surfaces carry demonstrated exploitation penalties; all three trifecta dimensions are triggered. |
| Defense Controls | 6 / 15 | Permission model and audit logging are documented; input filtering and output guardrails lack independent verification. |
3 Attack Surface
Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Notion AI's reasoning loop accepts first-class input from workspace content, uploaded documents, connected app data, MCP server outputs, and web search results.
Higher scores indicate surfaces where attacker-controlled content reaches the reasoning loop with fewer validation gates between ingestion and action.
Each row maps one attack surface to its adjusted score and a comment citing the evidence that grounds the assessment.
| Surface | Score | Comments |
|---|---|---|
| User Input | 4 / 4 | Multiple input channels accept attacker-controlled content; PromptArmor demonstrated indirect prompt injection via uploaded PDF causing data exfiltration. [1][8] |
| External Data | 4 / 4 | Ingests content from AI Connectors, MCP servers, and web search; CodeIntegrity demonstrated hidden PDF text injection exfiltrating private pages. [2][3] |
| Memory | 3 / 4 | Workspace pages and databases persist as cross-session memory with embeddings in Turbopuffer; no integrity verification on content feeding agent context. [7][13] |
| Reasoning | 3 / 4 | Model-agnostic architecture delegates reasoning to interchangeable external LLMs; vendor acknowledges prompt injection remains an unsolved problem. [6][8][9] |
| Planning | 3 / 4 | Custom Agents perform autonomous task decomposition with trigger-based scheduling and multi-step execution exceeding 20 minutes. [11][14] |
| Tool Execution | 3 / 4 | No shell or code execution; tools include Notion API operations and web search; CodeIntegrity demonstrated web search tool abuse for data exfiltration. [2][8] |
| Orchestration | 3 / 4 | Custom Agents execute autonomously on scheduled, event-driven, and Slack-triggered workflows without continuous operator supervision. [11][14] |
| Inter-Agent | 3 / 4 | Connects to external MCP servers and the External Agents API without documented inter-agent message authentication or integrity verification. [10][12] |
| Output Processing | 4 / 4 | Rich markdown output with demonstrated pre-approval data exfiltration via markdown image URL construction; external link confirmation added post-remediation. [1][8] |
| Configuration | 3 / 4 | MCP servers require admin enablement; the official notion-mcp-server has a confirmed path traversal vulnerability in its file upload handler. [4] |
The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Notion AI reads untrusted workspace content and connected app data, accesses proprietary documents and credentials, and sends bytes externally via web search, MCP, and Slack integration.
Notion AI exhibits all three of these conditions in its documented default configuration:
- Untrusted input: Uploaded PDFs, shared workspace pages, and AI Connector content from Slack and GitHub pull untrusted bytes into the reasoning loop. Restrict AI Connector scope and scan uploaded documents before granting agent access. [2][5]
- Sensitive data: Workspace documents, database records, and connected app content routinely contain proprietary data, API keys, and internal communications. [7][15]
- External egress: Web search, MCP tool calls, markdown image rendering, and Slack message sending provide default channels for sending data outside the workspace. [1][2]
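The three-condition rule above can be expressed as a small check. This is a minimal sketch, not part of the AIRQ methodology: the `SessionCapabilities` type and its field names are hypothetical, chosen only to mirror the three conditions listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionCapabilities:
    """Hypothetical summary of what a single agent session can do."""
    reads_untrusted_content: bool   # e.g. uploaded PDFs, connector data
    accesses_private_data: bool     # e.g. workspace pages, database records
    communicates_externally: bool   # e.g. web search, MCP calls, Slack messages


def lethal_trifecta_triggered(caps: SessionCapabilities) -> bool:
    """Return True only when all three conditions hold in the same session."""
    return (
        caps.reads_untrusted_content
        and caps.accesses_private_data
        and caps.communicates_externally
    )


# Default configuration as described in this section: all three conditions hold.
default_session = SessionCapabilities(True, True, True)
# Same session with external egress removed: the chain is broken.
no_egress_session = SessionCapabilities(True, True, False)
```

Removing any one condition is enough to make the check return `False`, which is why the hardening tips focus on restricting input sources and egress channels rather than on all three at once.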
4 Blast Radius
The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Notion AI agent reaches workspace content and connected app data but lacks code execution, file system access, or deployment capabilities.
Higher blast scores indicate factors where the agent's compromise reaches further into the operator's infrastructure or data.
Each row connects a blast radius factor to the damage scope an attacker reaches if the agent is compromised.
| Factor | Score | Comments |
|---|---|---|
| Code execution | 0 / 4 | No shell, code interpreter, or browser execution capability; the agent operates through Notion API operations and MCP tool calls only. [11] |
| File system access | 0 / 4 | Cloud-hosted SaaS with no access to the operator's file system; workspace content is managed through Notion's API, not file operations. [7] |
| Network access | 2 / 4 | Outbound access mediated through web search, MCP connections, and Slack integration; web search demonstrated as a data exfiltration channel. [2] |
| Credential access | 2 / 4 | Workspace content and connected app data may contain credentials; AI Connectors access GitHub code and Slack messages carrying sensitive tokens. [15] |
| Autonomous action | 3 / 4 | No documented single-button emergency stop for all running agents; Custom Agents fire on triggers and schedules without per-action approval, and MCP write tools are configurable to always-allow. [11][14] |
| Deployment access | 0 / 4 | No deployment, infrastructure modification, or package publishing capability on the default configuration. [11] |
5 Defense Controls
Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Notion AI documents a permission model and audit logging but leaves input filtering and output guardrails at the vendor-documented tier without independent verification.
Higher defense scores indicate stronger vendor-implemented safeguards that reduce the operator's hardening burden on the default configuration.
Each component scores vendor-implemented controls against the default configuration, with confidence reflecting the evidence tier.
| Component | Score | Comments |
|---|---|---|
| Input Guardrails | 1 / 3 | Vendor documents prompt injection detection with layered controls but acknowledges the problem is unsolved; no independent adversarial testing published. [8] |
| Execution Isolation | 1 / 3 | Cloud-hosted SaaS with SOC 2 Type 2 certified infrastructure; no agent-specific runtime isolation documentation beyond the workspace permission model. [9] |
| Action Controls | 1 / 3 | Build-from-nothing permission model with page-level granularity; capped by progressive allowlisting via per-MCP-tool always-allow with no expiration. [9][10] |
| Output Guardrails | 1 / 3 | External link confirmation added post-remediation; DLP available on Enterprise plan only as a third-party add-on, not enabled by default. [1][7] |
| Monitoring | 2 / 3 | Comprehensive audit log with SIEM integration on Enterprise plan and AI analytics for Custom Agents; SOC 2 Type 2 certified. [7][13] |
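
The headline defense score of 6 / 15 is consistent with adding up the five component rows in the table. The sketch below reproduces that arithmetic; treating the headline as a plain sum is an inference from the table, not a documented AIRQ formula.

```python
# Component scores copied from the Defense Controls table (each out of 3).
DEFENSE_COMPONENTS = {
    "Input Guardrails": 1,
    "Execution Isolation": 1,
    "Action Controls": 1,
    "Output Guardrails": 1,
    "Monitoring": 2,
}
MAX_PER_COMPONENT = 3


def defense_total(components: dict[str, int]) -> tuple[int, int]:
    """Return (achieved score, maximum possible score) for the components."""
    total = sum(components.values())
    maximum = MAX_PER_COMPONENT * len(components)
    return total, maximum


total, maximum = defense_total(DEFENSE_COMPONENTS)
print(f"Defense Controls: {total} / {maximum}")  # Defense Controls: 6 / 15
```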
6 Hardening Tips
Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize closing demonstrated exfiltration channels and restricting autonomous action scope before expanding agent access to sensitive workspace content.
Input Guardrails
Input guardrails intercept adversarial content before it reaches the reasoning loop.
- Policy: Require workspace admins to review and approve all external content sources before granting agent access to minimize untrusted input channels.
- Configuration: Restrict AI Connector access to specific Slack channels and GitHub repositories rather than granting broad organizational access.
- Engineering: Deploy a prompt injection detection proxy between uploaded documents and the agent context window to filter adversarial payloads.
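As a rough illustration of the engineering tip, the sketch below screens text extracted from an uploaded document before it is admitted to the agent context. The phrase patterns and function names are hypothetical, and a fixed pattern list is easy to evade, so this shows the shape of the gate rather than an adequate detector.

```python
import re

# Hypothetical phrase patterns for instruction-like text hidden in documents.
# A real deployment would need a maintained detector, not a static list.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"do not (tell|inform|mention)\b.{0,40}\buser\b", re.IGNORECASE),
]


def scan_document_text(text: str) -> list[str]:
    """Return every snippet of the text that matches a suspicious pattern."""
    findings = []
    for pattern in SUSPICIOUS_PATTERNS:
        for match in pattern.finditer(text):
            findings.append(match.group(0))
    return findings


def admit_to_context(text: str) -> bool:
    """Admit the text to the agent context only when nothing was flagged."""
    return not scan_document_text(text)
```

Documents that fail the check would be quarantined for human review rather than silently dropped, so that false positives remain visible to the operator.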
Execution Isolation
Execution isolation contains what a compromised agent can do on the host.
- Policy: Mandate that Custom Agents operate only within designated workspace partitions to contain the scope of any compromised agent session.
- Configuration: Limit each Custom Agent's page-level access grants to the minimum set required for its specific workflow.
- Engineering: Enforce least-privilege network policies on Custom Agent backend endpoints to restrict lateral movement if the cloud infrastructure is compromised.
Action Controls
Action controls govern which tools and actions the agent can invoke autonomously.
- Policy: Require periodic review of MCP tool always-allow settings and revoke permanent exemptions on a quarterly schedule.
- Configuration: Disable the always-allow option for MCP write tools and require per-invocation confirmation for external system modifications.
- Engineering: Build an automated audit pipeline that flags Custom Agents with always-allow settings on write-capable MCP tools.
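The audit pipeline in the engineering tip reduces to one filter over the agents' tool grants. The sketch below assumes the grants have already been exported into a simple record type; `ToolGrant`, its fields, and the approval values are hypothetical and do not correspond to a documented Notion export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolGrant:
    """Hypothetical record of one MCP tool granted to one Custom Agent."""
    agent: str
    tool: str
    can_write: bool   # True when the tool can modify an external system
    approval: str     # assumed values: "always_allow" or "ask_each_time"


def flag_risky_grants(grants: list[ToolGrant]) -> list[ToolGrant]:
    """Return grants that combine write capability with always-allow approval."""
    return [g for g in grants if g.can_write and g.approval == "always_allow"]


sample_grants = [
    ToolGrant("triage-bot", "create_issue", can_write=True, approval="always_allow"),
    ToolGrant("triage-bot", "search_issues", can_write=False, approval="always_allow"),
    ToolGrant("digest-bot", "send_message", can_write=True, approval="ask_each_time"),
]
```

With the sample data, only the `create_issue` grant is flagged, which matches the policy tip of reviewing permanent exemptions on write tools first.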
Output Guardrails
Output guardrails inspect what the agent sends to other systems and users.
- Policy: Require Enterprise plan DLP integration for all workspaces deploying Custom Agents with access to sensitive data.
- Configuration: Enable the strictest external link handling mode and configure DLP policies to cover AI prompts and agent-generated content.
- Engineering: Deploy an egress proxy that inspects markdown image URLs for encoded workspace data and web search queries for attacker-controlled domain parameters, targeting the two demonstrated exfiltration vectors. [1][2]
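One way to approach the markdown image half of the egress check is sketched below. It extracts image URLs from agent output and flags any whose host is not allowlisted or whose query string carries an unusually long value. The allowlist, the length threshold, and the function name are assumptions for illustration, and web search query inspection is not covered.

```python
import re
from urllib.parse import parse_qsl, urlparse

# Captures the URL of a markdown image: ![alt](url "optional title")
IMAGE_URL_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")

ALLOWED_IMAGE_HOSTS = {"images.example.com"}  # hypothetical allowlist
MAX_QUERY_VALUE_LENGTH = 64                   # hypothetical threshold


def suspicious_image_urls(markdown: str) -> list[str]:
    """Return image URLs that point off-allowlist or carry long query values."""
    flagged = []
    for url in IMAGE_URL_PATTERN.findall(markdown):
        parsed = urlparse(url)
        host = parsed.hostname or ""
        has_long_value = any(
            len(value) > MAX_QUERY_VALUE_LENGTH
            for _, value in parse_qsl(parsed.query)
        )
        if host not in ALLOWED_IMAGE_HOSTS or has_long_value:
            flagged.append(url)
    return flagged
```

A proxy built on this idea would block or strip flagged images before rendering, since the exfiltration completes as soon as the client fetches the URL.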
Monitoring
Monitoring captures what the agent did and surfaces anomalies for review.
- Policy: Require SIEM integration for all workspaces running Custom Agents and establish alerting rules for anomalous agent activity patterns.
- Configuration: Enable audit log forwarding to your SIEM for all Custom Agent events and configure retention exceeding regulatory minimums.
- Engineering: Build behavioral anomaly detection rules flagging agents reading more workspace content than their declared scope requires.
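The scope-based rule in the engineering tip can be prototyped as a comparison between read events and each agent's declared page set. The sketch assumes audit events have been reduced to `(agent, page_id)` pairs and that a declared scope exists per agent; both shapes are hypothetical rather than a documented audit log schema.

```python
from collections import defaultdict


def out_of_scope_reads(
    events: list[tuple[str, str]],
    declared_scope: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Map each agent to the pages it read outside its declared scope.

    events: (agent, page_id) pairs taken from forwarded audit logs.
    declared_scope: the page ids each agent is expected to read.
    Agents with no declared scope have every read reported.
    """
    violations: dict[str, set[str]] = defaultdict(set)
    for agent, page_id in events:
        if page_id not in declared_scope.get(agent, set()):
            violations[agent].add(page_id)
    return dict(violations)


sample_events = [("digest-bot", "p1"), ("digest-bot", "p9"), ("triage-bot", "p2")]
sample_scope = {"digest-bot": {"p1"}, "triage-bot": {"p2"}}
```

With the sample data, only `digest-bot` reading `p9` is reported. A SIEM rule would typically alert on the first such read rather than waiting for a volume threshold.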
7 References
The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7][13]) refer to entries below, listed in citation order.
Selected Vulnerabilities
- [1] PromptArmor: Notion AI data exfiltration via markdown image rendering. Demonstrated pre-approval exfiltration via PDF injection; remediated Jan 2026.
- [2] CodeIntegrity: Notion 3.0 web search tool data exfiltration. Demonstrated exfiltration via web search URL queries from hidden PDF text.
- [3] notion-mcp-server #238: prompt injection via shared page content. Confused deputy attack via unsanitized page content; open.
- [4] notion-mcp-server #237: path traversal in file upload (CWE-22). CVSS 7.7 est.; arbitrary file read via unsanitized paths; open.
Selected Research
- [5] Simon Willison: Notion 3.0 data exfiltration vulnerability analysis. Amplified CodeIntegrity research; named trifecta conditions explicitly.
- [6] Bruce Schneier: Notion AI data theft via prompt injection. Covered CodeIntegrity research; noted LLM instruction/data confusion.
Vendor Documentation
- [7] Notion AI security and privacy practices. Documents zero retention, Turbopuffer embedding storage, DLP availability.
- [8] Notion prompt injection protection documentation. Documents detection layers, model picker warnings, admin controls.
- [9] How we built security into Custom Agents. Engineering blog on permission model, multilayer injection approach.
- [10] Custom Agents security features. Documents independent permissions, admin Agent Directory controls.
- [11] Custom Agents documentation. Documents triggers, schedules, model selection, Tools and Access config.
- [12] MCP connections for Custom Agents. Documents external MCP server connections and tool confirmation settings.
- [13] Enterprise Search security and privacy practices. Documents embedding persistence, permission validation, anomaly detection.
Other Sources
- [14] Notion 3.0 release notes. Documents AI Agent launch, 20+ minute multi-step autonomous actions.
- [15] Notion AI Connectors documentation. Lists available connectors: Slack, GitHub, Jira, Drive, Teams, Mail.