Cursor Agent Security Risks

Coding Agents · cursor.com · Exposed Giants
[Figure: AI Risk Quadrant position. Axes: Defense Controls (4) and Attack Surface (8.29). Quadrants: Exposed Giants, Fortified Leaders, Humble Providers, Tight Operators.]
AIRQ Score: 4.31 (High)
Attack Surface: 8.29 (Critical)
Blast Radius: 5.88 (High)
Defense Controls: 4 (High)
About The Agent

Cursor is an AI-powered code editor that operates as a desktop application on developer workstations, providing full shell access, multi-file editing, and MCP server connectivity as its primary tool surface. The same operator-scoped runtime drives terminal commands, file system operations, and external tool integrations. It accepts instructions from project rules files and a community MCP marketplace, and auto-loads context from the working directory into every prompt without dedicated input filtering.

About the AI Risk Quadrant

Exposed Giants placement means the agent processes untrusted inputs with confirmed egress paths — procurement teams should require container-level isolation and network-layer DLP before deployment in environments handling proprietary code or credentials. The high attack surface reflects multiple critical-severity prompt injection chains while the moderate blast radius is constrained only by the default sandbox boundary, which has documented bypass paths available to determined adversaries.

1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Cursor's primary risks concentrate in its unfiltered ingestion of untrusted project content combined with full shell access and a bypassable approval model on the documented default configuration.

Key Input Risks
Untrusted content from project files, cloned repositories, and community MCP servers reaches the reasoning loop without prompt filtering or injection detection. Independent research confirmed up to 84% command execution success rate through poisoned project resources. [9][10]
Key Execution Risks
The terminal tool provides full shell access with user-level privileges. Once an attacker achieves a workspace file write (via prompt injection or malicious project content), the sandbox boundary can be escaped through .git hook manipulation without further user interaction. Sandbox isolation is enabled by default, but multiple CVE-grade bypasses have been demonstrated. [1][6]
Key Action Risks
Approval gates cover sensitive operations, but a single CLI flag (--yolo) bypasses all safety checks, and the progressive allowlist accumulates permanent exemptions without expiration or re-authentication. File writes within the workspace proceed without approval on the default configuration. [15][16]
Key Output Risks
Rich markdown rendering previously allowed Mermaid-based data exfiltration of stored memories and API keys to attacker-controlled servers without user confirmation. Shell-based exfiltration via curl remains available outside the sandbox boundary with no output-layer DLP. [3][12]
Key Monitoring Risks
No runtime anomaly detection exists to flag prompt injection attempts or unusual tool invocation patterns during agent sessions. Enterprise audit logs track AI feature usage and SOC 2 Type II covers organizational controls, but neither provides real-time attack visibility — deploy SIEM alerting on terminal tool invocations to close this gap. [14]

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Cursor carries confirmed code-execution exploitation across multiple attack surfaces with meaningful but bypassable sandbox isolation providing partial mitigation. The Exposed Giants quadrant placement reflects high egress risk despite the dominant threat being unauthorized shell execution rather than passive data leakage.

AIRQ Metrics

Cursor sits in the Exposed Giants quadrant where a high attack surface meets a moderate blast radius with partial defense controls that cannot fully contain the demonstrated attack chains.

Each axis measures a different dimension: Attack Surface exposure out of ten, Blast Radius capability out of ten, Defense Controls protection out of fifteen, and AIRQ composite resilience out of fifteen.

Metric | Score | Comments
AIRQ Score | 4.31 | Partial sandbox isolation offsets some attack surface exposure but cannot compensate for absent input and output controls.
Blast Radius | 5.88 / 10 | Default sandbox constrains blast radius to workspace scope, though bypass paths extend reach to home directory and network.
Attack Surface | 8.29 / 10 | Multiple critical-severity CVEs confirm exploitation across input, tool execution, inter-agent, and configuration surfaces.
Defense Controls | 4 / 15 | Execution isolation provides meaningful containment while input guardrails and output monitoring remain absent by default.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The dominant exposures are unfiltered multi-channel input ingestion, the auto-loaded project rules and configuration files, and a community MCP marketplace where installed tools inherit shell authority.

Attack Surface Metrics

Six of ten surfaces score above the base maximum of 4 after adjustment for confirmed exploitation, reflecting a broad and well-researched attack surface with minimal chokepoints.

Each row ties one attack surface to its strongest evidence anchor, with the score reflecting both architectural exposure and confirmed exploitation history.

Surface | Score | Comments
User Input | 5 / 4 | Prompt injection via project files and web content bypasses the allowlist approval gate and achieves arbitrary command execution without user consent. [6][10]
External Data | 5 / 4 | Workspace files, cloned repositories, and MCP server outputs enter the context without content validation; demonstrated RCE via compromised workspace settings. [4][9]
Memory | 2 / 4 | Rules files provide persistent cross-session context loaded at startup without integrity verification, though no automated write loop exists. [18]
Reasoning | 3 / 4 | Multi-step reasoning delegates to interchangeable external LLMs with no independent verification of reasoning chain integrity. [15]
Planning | 3 / 4 | Autonomous task decomposition with subagent delegation operates under configurable but bypassable approval gates. [15]
Tool Execution | 5 / 4 | Sandbox escape via .git hook manipulation achieves out-of-sandbox RCE without user interaction on the documented default configuration. [1]
Orchestration | 3 / 4 | Subagent spawning and CLI headless operation run under the same approval model with no additional orchestration-level controls. [15]
Inter-Agent | 5 / 4 | MCP OAuth authentication flow enables command injection from malicious MCP servers during the connection handshake. [7][8][19]
Output Processing | 4.5 / 4 | Mermaid diagram rendering enabled silent data exfiltration of cached credentials and conversation history through image-fetch side channels without user confirmation. [5][12]
Configuration | 5 / 4 | Case-sensitivity bypass in sensitive file protections enabled modification of .cursor/mcp.json for RCE on case-insensitive filesystems. [2][20]

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Cursor ingests untrusted project content into the same context that accesses developer credentials and source code, while shell and network tools provide default egress channels that bypass the output layer entirely.

Lethal Trifecta · Complete (3 of 3)

Cursor exhibits all three of these conditions in its documented default configuration:

  • Untrusted input — Project files, cloned repositories, MCP server outputs, and .cursorrules from untrusted sources enter the prompt context without filtering. [9][11]
  • Sensitive data — The agent reads source code, configuration files, environment variables, and SSH keys across the developer workspace. [12][13]
  • External egress — Shell commands execute curl and wget for outbound HTTP; Mermaid rendering previously fetched attacker-controlled images carrying exfiltrated data. [5][9]
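The three conditions above can be expressed as a simple conjunction check. The sketch below is illustrative only: the `SessionCapabilities` type and its field names are hypothetical and are not part of Cursor or of the AIRQ methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionCapabilities:
    """Hypothetical flags describing what one agent session can do."""
    reads_untrusted_content: bool  # e.g. cloned repositories, MCP server output
    reads_sensitive_data: bool     # e.g. source code, environment variables, SSH keys
    has_external_egress: bool      # e.g. curl or wget from the shell tool


def trifecta_conditions_met(caps: SessionCapabilities) -> int:
    """Count how many of the three trifecta conditions hold."""
    return sum([caps.reads_untrusted_content,
                caps.reads_sensitive_data,
                caps.has_external_egress])


def is_lethal_trifecta(caps: SessionCapabilities) -> bool:
    """The trifecta is complete only when all three conditions hold together."""
    return trifecta_conditions_met(caps) == 3


# Default configuration as described in this report: all three conditions present.
cursor_default = SessionCapabilities(True, True, True)
# Same session with outbound network removed: the chain is broken.
egress_blocked = SessionCapabilities(True, True, False)
```

Removing any one condition breaks the full exfiltration chain, which is why the hardening tips later in the report focus on egress restriction and input review rather than on a single control.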

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. Compromise of the agent grants shell access scoped to user privileges within the sandbox, with demonstrated bypass paths reaching the full home directory and credential stores.

Blast Radius Metrics

Three of six factors sit at the upper band, reflecting full user-level capabilities constrained primarily by the sandbox boundary rather than architectural limitation.

Each row maps a capability factor to the documented scope the agent holds on the default configuration, with scores reflecting the worst-case reachable state.

Factor | Score | Comments
Code execution | 3 / 4 | Full shell with user privileges; sandbox escape via .git hooks demonstrated host-level code execution without interaction. [1]
File system access | 3 / 4 | Read-write access across home directory for non-sandboxed operations; dotfile creation bypass extended write scope beyond workspace. [4][16]
Network access | 2 / 4 | Network blocked-by-default in sandbox with configurable domain allowlist; non-sandboxed commands have unrestricted outbound access. [17]
Credential access | 3 / 4 | Demonstrated exfiltration of API keys and stored memories; SSH key access confirmed by independent research. [5][12][13]
Autonomous action | 2 / 4 | Approval gates on autonomous command execution can be bypassed via the --yolo flag or progressive allowlisting; mitigate by enforcing Use AllowList mode and clearing ~/.cursor/permissions.json per session. [15][16]
Deployment access | 1 / 4 | Shell access can trigger deployment commands but no dedicated deployment primitives or infrastructure modification tools exist. [16]
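Because the Code execution factor rests on .git hook manipulation, an operator may want to audit a checkout for active hooks before opening it in the agent. The following is a minimal sketch, not a vendor-provided mitigation: the function name `audit_git_hooks` is hypothetical, and it relies only on standard Git behavior (hooks live in .git/hooks, sample hooks end in .sample, and core.hooksPath can redirect hook lookup).

```python
import re
from pathlib import Path


def audit_git_hooks(repo_root: str) -> dict[str, list[str]]:
    """List non-sample hook files and any hooksPath overrides in .git/config."""
    git_dir = Path(repo_root) / ".git"
    hooks_dir = git_dir / "hooks"

    active_hooks: list[str] = []
    if hooks_dir.is_dir():
        # Git ships inert "*.sample" files; anything else may run on git operations.
        active_hooks = sorted(
            p.name for p in hooks_dir.iterdir()
            if p.is_file() and not p.name.endswith(".sample")
        )

    overrides: list[str] = []
    config_path = git_dir / "config"
    if config_path.is_file():
        for line in config_path.read_text(errors="replace").splitlines():
            # Git config keys are case-insensitive; a plain line scan is enough here.
            if re.match(r"\s*hooksPath\s*=", line, re.IGNORECASE):
                overrides.append(line.strip())

    return {"active_hooks": active_hooks, "hooks_path_overrides": overrides}
```

This sketch does not follow worktrees or submodules, where .git is a file rather than a directory, and it does not inspect global or system Git configuration.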

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor ships a meaningful execution sandbox by default but leaves input filtering and output monitoring entirely to the operator on the documented default configuration.

Defense Controls Metrics

Higher scores indicate stronger vendor-implemented safeguards; Cursor provides partial containment through its sandbox while leaving adjacent layers unprotected.

Each component is scored on what ships enabled by default, with opt-in hardening documented separately as operator-managed configuration.

Component | Score | Comments
Input Guardrails | 0 / 3 | No prompt shield, injection detection, or content scanning exists at the input layer; all content reaches the model unfiltered. [10][15]
Execution Isolation | 2 / 3 | Default sandbox restricts filesystem to workspace and blocks network, with meaningful access scoping and configurable domain allowlists. [16][17]
Action Controls | 1 / 3 | Approval gates exist for sensitive operations but --yolo flag provides single-step bypass and progressive allowlisting accumulates permanent exemptions. [15][16]
Output Guardrails | 0 / 3 | No DLP, credential redaction, or exfiltration blocking exists in the output layer beyond the patched Mermaid image fetch vulnerability. [5][12]
Monitoring | 1 / 3 | SOC 2 Type II covers organizational controls; enterprise audit logs available but no runtime anomaly detection for agent behavior. [14][22]
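The five component scores add up to the headline Defense Controls figure of 4 out of 15. The report does not state the aggregation rule, so the plain sum below is an assumption that happens to match the reported total.

```python
# Component scores as listed in the table above (each out of 3).
defense_components = {
    "Input Guardrails": 0,
    "Execution Isolation": 2,
    "Action Controls": 1,
    "Output Guardrails": 0,
    "Monitoring": 1,
}
MAX_PER_COMPONENT = 3

# Assumption: the headline score is a plain sum of the component scores.
total = sum(defense_components.values())
maximum = MAX_PER_COMPONENT * len(defense_components)
unprotected = [name for name, score in defense_components.items() if score == 0]

print(f"Defense Controls: {total} / {maximum}")       # Defense Controls: 4 / 15
print("Layers at zero:", ", ".join(unprotected))      # Input Guardrails, Output Guardrails
```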

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. The highest-leverage changes are deploying input-layer injection detection, restricting network egress scope, and disabling the progressive allowlist to preserve approval gate integrity.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

  • Policy: Require security review of all .cursorrules and .cursor/rules files before merging into shared repositories. This reduces the user_input adjusted score from 5.0 toward 3.0 by eliminating the primary injection vector for rule-poisoning attacks. [11][21]
  • Configuration: Enable workspace trust via security.workspace.trust.enabled in settings.json to prompt restricted mode for untrusted repositories. This counters External Data ingestion from cloned projects.
  • Engineering: Deploy a pre-context prompt shield or MCP-based injection scanner that filters project file content before it reaches the model. This counters the absent input guardrails at zero. [21]
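One narrow piece of the rules-file review can be automated: flagging invisible Unicode characters of the kind described in the Rules File Backdoor research [11]. The sketch below is a minimal illustration using only the Python standard library. The function names are hypothetical, and the check (Unicode general category Cf, "format" characters) is one reasonable heuristic rather than a complete injection detector.

```python
import unicodedata
from pathlib import Path


def find_hidden_characters(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, character name) for each Unicode format character."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Category "Cf" covers zero-width characters, bidi controls, and tag characters.
            if unicodedata.category(ch) == "Cf":
                name = unicodedata.name(ch, f"U+{ord(ch):04X}")
                findings.append((line_no, col, name))
    return findings


def scan_rules_files(repo_root: str) -> dict[str, list[tuple[int, int, str]]]:
    """Scan .cursorrules and everything under .cursor/rules for hidden characters."""
    root = Path(repo_root)
    candidates = [root / ".cursorrules"]
    rules_dir = root / ".cursor" / "rules"
    if rules_dir.is_dir():
        candidates += [p for p in rules_dir.rglob("*") if p.is_file()]

    report = {}
    for path in candidates:
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            hits = find_hidden_characters(text)
            if hits:
                report[str(path)] = hits
    return report
```

Some legitimate text contains format characters (for example, joiners inside emoji sequences), so findings are best treated as prompts for human review rather than automatic rejections.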

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

  • Policy: Require all developer workstations to run Cursor with sandbox enabled and prohibit --yolo flag usage in organizational policy. This counters the single-step sandbox bypass.
  • Configuration: Configure sandbox.json with networkPolicy.default set to deny and restrict additionalReadwritePaths to the minimum required directories. This counters network blast and file system reach.
  • Engineering: Wrap Cursor execution inside a container or VM boundary that constrains the host filesystem and network independently of the built-in sandbox. This reduces code_execution from 3 to 1 and file_system from 3 to 1 by eliminating host-level access even when .git hook escape is triggered.
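The Configuration tip can be backed by a small policy check that runs in CI or at workstation enrollment. The sketch below assumes a simplified sandbox.json layout built only from the two key names mentioned above (networkPolicy.default and additionalReadwritePaths, the latter assumed to sit at the top level); the real schema should be confirmed against the vendor's Sandbox Configuration Reference [17] before relying on it.

```python
def check_sandbox_policy(config: dict, allowed_write_paths: set[str]) -> list[str]:
    """Return a list of policy violations for a parsed sandbox.json document.

    Assumed layout (not confirmed against the vendor schema):
        {"networkPolicy": {"default": "deny"}, "additionalReadwritePaths": [...]}
    """
    problems = []

    default = config.get("networkPolicy", {}).get("default")
    if default != "deny":
        problems.append(f"networkPolicy.default is {default!r}, expected 'deny'")

    for path in config.get("additionalReadwritePaths", []):
        if path not in allowed_write_paths:
            problems.append(f"unexpected read-write path: {path}")

    return problems


# Example: a config that opens the network and grants the whole home directory.
risky = {"networkPolicy": {"default": "allow"},
         "additionalReadwritePaths": ["/home/dev"]}
for problem in check_sandbox_policy(risky, allowed_write_paths={"/workspace/build"}):
    print(problem)
```

A missing networkPolicy block is reported as a violation on purpose, so the check fails closed when the file is absent or incomplete.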

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

  • Policy: Prohibit Run Everything mode and progressive allowlisting in team policy to preserve approval gate integrity. This counters the permanent exemption accumulation pattern.
  • Configuration: Remove existing entries from ~/.cursor/permissions.json and disable the Add to allowlist option in team settings. This counters the non-expiring allowlist weakness.
  • Engineering: Build a startup hook (any tier, no Enterprise required) that clears ~/.cursor/permissions.json on each session start or applies 24-hour expiration to approved patterns. This raises action_controls from 1 toward 2 by preventing permanent allowlist accumulation.
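A startup hook of the kind described in the Engineering tip could look roughly like the sketch below. It treats the allowlist file as opaque, because its internal format is not described in this report, and therefore expires the whole file by modification time rather than per pattern. The function name `expire_allowlist` and the backup naming scheme are hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import Optional

MAX_AGE_SECONDS = 24 * 60 * 60  # 24-hour expiration, as suggested in the tip above


def expire_allowlist(path: Path,
                     max_age: float = MAX_AGE_SECONDS,
                     now: Optional[float] = None) -> bool:
    """Move the allowlist aside if it is older than max_age seconds.

    Returns True when the file was expired, False when it was kept or absent.
    Pass max_age=0 to clear the allowlist on every session start.
    """
    if not path.is_file():
        return False
    now = time.time() if now is None else now
    if now - path.stat().st_mtime < max_age:
        return False
    # Keep a timestamped backup so approvals can be audited after the reset.
    backup = path.with_name(f"{path.name}.{int(now)}.bak")
    shutil.move(str(path), str(backup))
    return True


if __name__ == "__main__":
    # Path taken from the tip above; adjust if the installation differs.
    expire_allowlist(Path.home() / ".cursor" / "permissions.json")
```

Moving the file instead of deleting it is a deliberate choice here: the backup preserves a record of what had been approved, which supports the monitoring goals discussed later.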

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

  • Policy: Require all outbound network requests from agent sessions to route through a corporate proxy with DLP inspection. This counters shell-based credential exfiltration via curl.
  • Configuration: Configure sandbox.json networkPolicy to deny all outbound except explicitly allowlisted domains needed for build tools. This counters unrestricted egress outside sandbox.
  • Engineering: Integrate an egress-filtering proxy or MCP security gateway that inspects and blocks sensitive data patterns in outbound requests. This counters absent output guardrails at zero.
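The pattern-matching core of such an egress filter can be sketched in a few lines. The example below is illustrative only: it covers two well-known credential formats (AWS access key IDs and PEM private key headers), and the names `SECRET_PATTERNS`, `find_secrets`, and `should_block` are hypothetical. A production DLP layer would need far broader coverage and would sit inside the proxy, not in a standalone script.

```python
import re

# Two widely recognized credential shapes; a real deployment would carry many more.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def find_secrets(payload: str) -> list[str]:
    """Return the names of all secret patterns found in an outbound payload."""
    return [name for name, pattern in SECRET_PATTERNS.items()
            if pattern.search(payload)]


def should_block(payload: str) -> bool:
    """Block the request if any sensitive pattern appears in the payload."""
    return bool(find_secrets(payload))
```

For example, `should_block("key=AKIAIOSFODNN7EXAMPLE")` returns True, while an ordinary request body with no credential-shaped strings passes through.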

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

  • Policy: Require forwarding of Cursor enterprise audit logs to a centralized SIEM with alerting rules for unusual tool invocation patterns. This counters the absent runtime anomaly detection.
  • Configuration: Enable AI code tracking API and audit logs in enterprise settings for visibility into agent actions across the organization. This counters the default silent operation mode.
  • Engineering: Build an OpenTelemetry-instrumented wrapper around terminal tool invocations that emits structured traces for each shell command and file operation. This counters the absent behavioral detection.
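The shape of such a wrapper can be shown with the standard library alone. The sketch below is a stand-in, not OpenTelemetry itself: it emits one JSON record per command where an instrumented version would open a span. The function name `run_traced` and the record fields are hypothetical, and how the wrapper would be placed in front of the agent's terminal tool is left open.

```python
import json
import subprocess
import sys
import time
import uuid


def run_traced(argv: list[str], sink=sys.stderr) -> subprocess.CompletedProcess:
    """Run a command and write one structured trace record describing it."""
    started = time.time()
    result = subprocess.run(argv, capture_output=True, text=True)
    record = {
        "trace_id": uuid.uuid4().hex,          # stand-in for a real span/trace ID
        "event": "terminal_tool_invocation",
        "argv": argv,
        "exit_code": result.returncode,
        "duration_ms": round((time.time() - started) * 1000, 1),
        "stdout_bytes": len(result.stdout.encode()),
    }
    # One JSON object per line, so a log shipper can forward it to a SIEM.
    sink.write(json.dumps(record) + "\n")
    return result
```

Recording the argument vector rather than a joined command string avoids ambiguity about quoting when alert rules later match on specific binaries such as curl.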

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

  1. CVE-2026-26268 Sandbox escape via .git hooks (CVSS 9.9). Prompt injection writes git configuration for out-of-sandbox remote code execution without user interaction. Patched in version 2.5.
  2. CVE-2025-59944 Case-sensitivity file protection bypass (CVSS 9.8). Prompt injection modifies .cursor/mcp.json on case-insensitive filesystems to achieve remote code execution. Patched in version 1.7.
  3. CVE-2025-54132 Mermaid diagram image exfiltration (CVSS 6.4). Attacker exfiltrates stored memories and API keys via image fetch to external server after prompt injection. Patched in version 1.3.
  4. CVE-2025-54130 Dotfile creation authorization bypass (CVSS 9.8). Indirect prompt injection chains to remote code execution via .vscode/settings.json write without approval. Patched in version 1.3.9.
  5. CVE-2025-61590 Remote code execution via VS Code workspaces. Compromised MCP server hijacks chat context to write workspace configuration enabling shell command execution. Patched in version 1.7.
  6. GHSA-hf2x-r83r-qw5q Arbitrary code execution via prompt injection and allowlist bypass (CVE-2026-31854). Malicious website content triggers automatic command execution even in Use AllowList mode. Patched in version 2.0.
  7. CVE-2025-61591 MCP OAuth2 command injection (CVSS 8.8). Malicious MCP server returns crafted commands during OAuth authentication flow achieving remote code execution. Patched in 2025.09.17.
  8. CVE-2025-54136 Persistent RCE via trusted MCP configuration modification. Attacker with repo write access silently swaps approved MCP command for malicious payload without re-prompt. Patched in version 1.3.

Selected Research

  9. Hidden Prompt Injections Can Hijack AI Code Assistants. HiddenLayer demonstrates full attack chain on Cursor: indirect prompt injection via README, denylist bypass, credential grep, and exfiltration via curl in Auto-Run mode.
  10. Your AI My Shell. Empirical analysis using AIShellJack framework with 314 payloads covering 70 MITRE ATT&CK techniques. Reports up to 84% attack success rate for command execution on Cursor in Auto Mode.
  11. Rules File Backdoor. Pillar Security demonstrates invisible Unicode character injection in .cursorrules files that silently influences AI code generation to produce backdoored output.
  12. Cursor Data Exfiltration Via Mermaid. Johann Rehberger demonstrates CVE-2025-54132 exploitation: Mermaid diagrams exfiltrate user memories and API keys to attacker-controlled servers via image rendering.
  13. CurseChain. Capsule Security demonstrates SSH key exfiltration via invisible HTML comments in referenced repositories, with cross-project contamination and zero-click variants.

Vendor Documentation

  14. Cursor Security Page. Vendor security overview documenting SOC 2 Type II attestation, Privacy Mode with zero data retention, annual penetration testing, and responsible disclosure policy.
  15. Agent Security Guardrails. Vendor documentation of default approval model, sandbox behavior, auto-run modes, MCP approval gates, and workspace trust configuration options.
  16. Terminal Tool Documentation. Vendor documentation of default sandbox behavior for terminal commands including filesystem scope, network blocking, allowlist modes, and sandbox bypass indicators.
  17. Sandbox Configuration Reference. Vendor reference for sandbox.json configuration: workspace_readwrite default type, network deny-by-default policy, configurable read-write paths, and enterprise admin overrides.
  18. Rules Documentation. Vendor documentation of persistent rules system: project rules in .cursor/rules/, user rules, team rules, auto-loading behavior, and CLAUDE.md compatibility.
  19. MCP Documentation. Vendor documentation of Model Context Protocol integration: stdio and SSE transports, tool approval model, security considerations for external server connections.

Other Sources

  20. Cursor Security and Enterprise Readiness Report. Endor Labs analysis of Cursor security posture covering prompt injection risks, rules file weaponization vectors, and recommended secure coding practices for AI-generated code.
  21. Secure Vibe Coding with Cursor Rules. Cloud Security Alliance blog discussing Cursor rules as both a productivity tool and a security risk vector, with the RAILGUARD framework for secure rule authoring.
  22. Cursor Compliance Documents. Vendor compliance page listing available security documents: SOC 2 Type II report, penetration test report, DPA, and MSA through the trust portal.