Notion AI Agent Security Risks

Work Copilot Agents notion.com Exposed Giants
AI RISK QUADRANT POSITION DEFENSE CONTROLS (6) ATTACK SURFACE (6.66) EXPOSED GIANTS FORTIFIED LEADERS HUMBLE PROVIDERS TIGHT OPERATORS
AIRQ Score
2.64
Critical
Attack Surface
6.66
High
Blast Radius
2.88
Low
Defense Controls
6
High
About The Agent

Notion AI is a cloud-hosted productivity copilot embedded in the Notion workspace suite, delivering autonomous agent capabilities through Custom Agents that execute trigger-based workflows across workspace content, connected apps, and external MCP servers. The agent operates as a SaaS service with no self-hosted option, ingesting content from multiple channels including uploaded documents, Slack messages, GitHub code, and Jira tickets. Independent security researchers have demonstrated prompt injection and data exfiltration attacks against the agent's default configuration, establishing concrete risk surfaces.

About the AI Risk Quadrant

Exposed Giants reflects Notion AI's combination of moderate attack surface exposure and low blast radius. The attack surface scores above the midpoint at 6.66 driven by demonstrated prompt injection and data exfiltration vulnerabilities across input, output, and tool execution surfaces. Blast radius remains constrained at 2.88 because the agent lacks code execution, file system access, and deployment capabilities. Operators should prioritize closing the demonstrated exfiltration channels and restricting autonomous action scope before expanding agent access.

1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Notion AI presents demonstrated input injection and output exfiltration risks on its default configuration, with defense controls limited by progressive tool allowlisting and opt-in output filtering.

Key Input Risks
Notion AI ingests attacker-controlled content from workspace pages, uploaded PDFs, AI Connectors, MCP server outputs, and web search results on its default configuration. Independent researchers demonstrated indirect prompt injection via hidden PDF text triggering data exfiltration before user approval. [1][5]
Key Execution Risks
The agent executes Notion API operations, web search queries, and MCP tool calls without shell or code execution capability. No independent adversarial testing of the cloud-hosted execution boundary has been published, and agent-specific runtime isolation is not documented beyond workspace permissions. [7][8]
Key Action Risks
Custom Agents fire on scheduled and event-driven triggers without per-action operator approval. Disable the per-MCP-tool always-allow setting for write tools and require per-invocation confirmation to prevent permanent confirmation bypasses from accumulating. [10][11]
Key Output Risks
The agent emits rich markdown including images and links without default DLP or exfiltration channel blocking on the standard configuration. Markdown image URL construction was demonstrated as a pre-approval exfiltration channel reaching external servers before the operator sees the output. [1][7]
Key Monitoring Risks
Audit logging covers workspace events with SIEM integration on Enterprise plans, and AI analytics track Custom Agent activity. Operators on Business plans lack SIEM integration entirely; Enterprise plan operators must explicitly enable audit log forwarding and configure DLP for AI outputs. [7][13]

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Notion AI scores as a moderate-risk agent whose constrained blast radius offsets its demonstrated attack surface vulnerabilities.

AIRQ Metrics

Notion AI places in the Exposed Giants quadrant with attack surface at 6.66, blast radius at 2.88, and defense controls at 6 — operators deploying the agent on sensitive workspace content should prioritize closing the demonstrated exfiltration channels before expanding agent access scope.

Each axis measures a distinct risk dimension: attack surface out of 10, blast radius out of 10, defense controls out of 15, and AIRQ composite out of 15.

Metric Score Comments
AIRQ Score 2.64 Low composite indicates the operator's hardening priority is reducing demonstrated attack surface exposure rather than containing blast.
Blast Radius 2.88 / 10 No code execution or file system access; blast concentrates in autonomous actions and network egress via web search.
Attack Surface 6.66 / 10 Three surfaces carry demonstrated exploitation penalties; all three trifecta dimensions are triggered.
Defense Controls 6 / 15 Permission model and audit logging are documented; input filtering and output guardrails lack independent verification.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Notion AI's reasoning loop accepts first-class input from workspace content, uploaded documents, connected app data, MCP server outputs, and web search results.

Attack Surface Metrics

Higher scores indicate surfaces where attacker-controlled content reaches the reasoning loop with fewer validation gates between ingestion and action.

Each row maps one attack surface to its adjusted score and a comment citing the evidence that grounds the assessment.

Surface Score Comments
User Input 4 / 4 Multiple input channels accept attacker-controlled content; PromptArmor demonstrated indirect prompt injection via uploaded PDF causing data exfiltration. [1][8]
External Data 4 / 4 Ingests content from AI Connectors, MCP servers, and web search; CodeIntegrity demonstrated hidden PDF text injection exfiltrating private pages. [2][3]
Memory 3 / 4 Workspace pages and databases persist as cross-session memory with embeddings in Turbopuffer; no integrity verification on content feeding agent context. [7][13]
Reasoning 3 / 4 Model-agnostic architecture delegates reasoning to interchangeable external LLMs; vendor acknowledges prompt injection remains an unsolved problem. [6][8][9]
Planning 3 / 4 Custom Agents perform autonomous task decomposition with trigger-based scheduling and multi-step execution exceeding 20 minutes. [11][14]
Tool Execution 3 / 4 No shell or code execution; tools include Notion API operations and web search; CodeIntegrity demonstrated web search tool abuse for data exfiltration. [2][8]
Orchestration 3 / 4 Custom Agents execute autonomously on scheduled, event-driven, and Slack-triggered workflows without continuous operator supervision. [11][14]
Inter-Agent 3 / 4 Connects to external MCP servers and the External Agents API without documented inter-agent message authentication or integrity verification. [12][10]
Output Processing 4 / 4 Rich markdown output with demonstrated pre-approval data exfiltration via markdown image URL construction; external link confirmation added post-remediation. [1][8]
Configuration 3 / 4 MCP servers require admin enablement; the official notion-mcp-server has a confirmed path traversal vulnerability in its file upload handler. [4]

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Notion AI reads untrusted workspace content and connected app data, accesses proprietary documents and credentials, and sends bytes externally via web search, MCP, and Slack integration.

Lethal Trifecta · Complete (3 of 3)

Notion AI exhibits all three of these conditions in its documented default configuration:

  • Untrusted input — Uploaded PDFs, shared workspace pages, and AI Connector content from Slack and GitHub pull untrusted bytes into the reasoning loop — restrict AI Connector scope and scan uploaded documents before granting agent access. [2][5]
  • Sensitive data — Workspace documents, database records, and connected app content routinely contain proprietary data, API keys, and internal communications. [15][7]
  • External egress — Web search, MCP tool calls, markdown image rendering, and Slack message sending provide default channels to exfiltrate data externally. [1][2]

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Notion AI agent reaches workspace content and connected app data but lacks code execution, file system access, or deployment capabilities.

Blast Radius Metrics

Higher blast scores indicate factors where the agent's compromise reaches further into the operator's infrastructure or data.

Each row connects a blast radius factor to the damage scope an attacker reaches if the agent is compromised.

Factor Score Comments
Code execution 0 / 4 No shell, code interpreter, or browser execution capability; the agent operates through Notion API operations and MCP tool calls only. [11]
File system access 0 / 4 Cloud-hosted SaaS with no access to the operator's file system; workspace content is managed through Notion's API, not file operations. [7]
Network access 2 / 4 Outbound access mediated through web search, MCP connections, and Slack integration; web search demonstrated as a data exfiltration channel. [2]
Credential access 2 / 4 Workspace content and connected app data may contain credentials; AI Connectors access GitHub code and Slack messages carrying sensitive tokens. [15]
Autonomous action 3 / 4 No documented single-button emergency stop for all running agents; Custom Agents fire on triggers and schedules without per-action approval, and MCP write tools are configurable to always-allow. [11][14]
Deployment access 0 / 4 No deployment, infrastructure modification, or package publishing capability on the default configuration. [11]

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Notion AI documents a permission model and audit logging but leaves input filtering and output guardrails at the vendor-documented tier without independent verification.

Defense Controls Metrics

Higher defense scores indicate stronger vendor-implemented safeguards that reduce the operator's hardening burden on the default configuration.

Each component scores vendor-implemented controls against the default configuration, with confidence reflecting the evidence tier.

Component Score Comments
Input Guardrails 1 / 3 Vendor documents prompt injection detection with layered controls but acknowledges the problem is unsolved; no independent adversarial testing published. [8]
Execution Isolation 1 / 3 Cloud-hosted SaaS with SOC 2 Type 2 certified infrastructure; no agent-specific runtime isolation documentation beyond the workspace permission model. [9]
Action Controls 1 / 3 Build-from-nothing permission model with page-level granularity; capped by progressive allowlisting via per-MCP-tool always-allow with no expiration. [10][9]
Output Guardrails 1 / 3 External link confirmation added post-remediation; DLP available on Enterprise plan only as a third-party add-on, not enabled by default. [7][1]
Monitoring 2 / 3 Comprehensive audit log with SIEM integration on Enterprise plan and AI analytics for Custom Agents; SOC 2 Type 2 certified. [7][13]

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize closing demonstrated exfiltration channels and restricting autonomous action scope before expanding agent access to sensitive workspace content.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Input Guardrails
  • Policy Require workspace admins to review and approve all external content sources before granting agent access to minimize untrusted input channels.
  • Configuration Restrict AI Connector access to specific Slack channels and GitHub repositories rather than granting broad organizational access.
  • Engineering Deploy a prompt injection detection proxy between uploaded documents and the agent context window to filter adversarial payloads.

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Execution Isolation
  • Policy Mandate that Custom Agents operate only within designated workspace partitions to contain the scope of any compromised agent session.
  • Configuration Limit each Custom Agent's page-level access grants to the minimum set required for its specific workflow.
  • Engineering Enforce least-privilege network policies on Custom Agent backend endpoints to restrict lateral movement if the cloud infrastructure is compromised.

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Action Controls
  • Policy Require periodic review of MCP tool always-allow settings and revoke permanent exemptions on a quarterly schedule.
  • Configuration Disable the always-allow option for MCP write tools and require per-invocation confirmation for external system modifications.
  • Engineering Build an automated audit pipeline that flags Custom Agents with always-allow settings on write-capable MCP tools.

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Output Guardrails
  • Policy Require Enterprise plan DLP integration for all workspaces deploying Custom Agents with access to sensitive data.
  • Configuration Enable the strictest external link handling mode and configure DLP policies to cover AI prompts and agent-generated content.
  • Engineering Deploy an egress proxy that inspects markdown image URLs for encoded workspace data and web search queries for attacker-controlled domain parameters, targeting the two demonstrated exfiltration vectors. [1][2]

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Monitoring
  • Policy Require SIEM integration for all workspaces running Custom Agents and establish alerting rules for anomalous agent activity patterns.
  • Configuration Enable audit log forwarding to your SIEM for all Custom Agent events and configure retention exceeding regulatory minimums.
  • Engineering Build behavioral anomaly detection rules flagging agents reading more workspace content than their declared scope requires.

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

  1. PromptArmor — Notion AI data exfiltration via markdown image rendering Demonstrated pre-approval exfiltration via PDF injection; remediated Jan 2026
  2. CodeIntegrity — Notion 3.0 web search tool data exfiltration Demonstrated exfiltration via web search URL queries from hidden PDF text
  3. notion-mcp-server #238 — prompt injection via shared page content Confused deputy attack via unsanitized page content; open
  4. notion-mcp-server #237 — path traversal in file upload (CWE-22) CVSS 7.7 est.; arbitrary file read via unsanitized paths; open

Selected Research

  1. Simon Willison — Notion 3.0 data exfiltration vulnerability analysis Amplified CodeIntegrity research; named trifecta conditions explicitly
  2. Bruce Schneier — Notion AI data theft via prompt injection Covered CodeIntegrity research; noted LLM instruction/data confusion

Vendor Documentation

  1. Notion AI security and privacy practices Documents zero retention, Turbopuffer embedding storage, DLP availability
  2. Notion prompt injection protection documentation Documents detection layers, model picker warnings, admin controls
  3. How we built security into Custom Agents Engineering blog on permission model, multilayer injection approach
  4. Custom Agents security features Documents independent permissions, admin Agent Directory controls
  5. Custom Agents documentation Documents triggers, schedules, model selection, Tools and Access config
  6. MCP connections for Custom Agents Documents external MCP server connections and tool confirmation settings
  7. Enterprise Search security and privacy practices Documents embedding persistence, permission validation, anomaly detection

Other Sources

  1. Notion 3.0 release notes Documents AI Agent launch, 20+ minute multi-step autonomous actions
  2. Notion AI Connectors documentation Lists available connectors: Slack, GitHub, Jira, Drive, Teams, Mail