1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. The dominant risks concentrate in the gap between a well-sandboxed development environment and the absence of input filtering, session-level monitoring, and output validation before generated code ships to production.

Key Input Risks

Untrusted content from imported repositories, connected MCP tools, and project configuration files reaches the underlying model without a dedicated prompt shield or injection filter. Research confirms indirect prompt injection via tool descriptions achieves high success rates against coding agents, and MCP tool poisoning can exfiltrate credentials through metadata manipulation. [4][5][14]

Key Execution Risks

Deployed applications leave the browser-based WebContainer sandbox entirely and run on standard hosting infrastructure without the containment the development environment provides. Code generation occurs inside the sandbox, but the security boundary ends at the publish step where generated code ships to production unsandboxed. [8][9]

Key Action Risks

The agent writes, modifies, and deletes project files within a session without per-action operator approval, and creates databases automatically when the project requires one. Publish and deploy actions require explicit operator confirmation, but code-level changes accumulate unchecked between those gates. [6][10]

Key Output Risks

An automatic security audit runs at publish time to flag missing database policies and exposed secrets, but no output filtering or data-loss prevention operates on the agent's in-session outputs. Independent scans of exported projects report that a majority ship with hardcoded credentials and missing security headers. [1][2][7]

Key Monitoring Risks

No audit trail records which prompts were issued, which files were modified, or which tools were invoked during a session. Operators should export session transcripts via GitHub sync and forward deploy events to a centralized log to establish a minimum viable audit trail until native monitoring ships. [13]

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Bolt AI sits at the lower edge of its quadrant, where browser-based containment partially offsets the absence of input and monitoring controls across the default configuration.

AIRQ Metrics

AIRQ Score3.95

Blast Radius5

Attack Surface4.8

Defense Controls4

The agent lands in the Humble Providers quadrant with attack surface elevated by the trifecta floor, a moderate blast radius constrained by the browser sandbox, and a defense score reflecting execution isolation alongside gaps in input filtering and audit capabilities.

Lower AIRQ scores indicate stronger security posture; a defense score of four out of fifteen means the vendor's default controls cover roughly a quarter of the assessed surface, leaving the remainder to operator-managed hardening.

Metric	Score	Comments
AIRQ Score	3.95	The composite score reflects browser-sandboxed execution offsetting absent input guardrails and monitoring on the documented default.
Blast Radius	5 / 10	The browser sandbox contains all six blast factors to the scoped-execution band, preventing escalation to host-level privilege.
Attack Surface	4.8 / 10	Moderate per-surface scores are elevated by the trifecta floor after all three conditions are met on the default configuration.
Defense Controls	4 / 15	Execution isolation and partial output checking provide measurable defense, while input filtering and session monitoring remain absent.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The dominant exposures are the unfiltered input channels, MCP connector surface, and the project configuration files that feed the reasoning loop alongside user prompts.

Attack Surface Metrics

User Input2

Tool Execution2

External Data2

Orchestration2

Memory1

Inter-Agent1

Reasoning2

Output Processing1

Planning2

Configuration2

No surface reaches the architectural maximum on the default configuration, and the trifecta floor is the primary driver of the headline attack-surface score.

Each row ties a scored surface to the architectural condition observed in vendor documentation and the evidence anchoring that condition.

Surface	Score	Comments
User Input	2 / 4	Multiple input channels including web chat, imported repos, and MCP connector data feed the reasoning loop with no agent-level input validation or injection detection in place. [6][14]
External Data	2 / 4	GitHub repositories, Figma designs, and MCP connector payloads from user-specified sources enter the reasoning loop; processing occurs inside the WebContainer sandbox but without content validation. [6][11]
Memory	1 / 4	Session-level conversation history persists within a project but no cross-session memory or learning loop exists; claude.md project files provide read-only context loaded at session start. [6][7]
Reasoning	2 / 4	Multi-step reasoning with visible chain-of-thought in the chat interface; reasoning is constrained to the declared task scope but no documented boundary prevents imported repository content from steering the reasoning chain. [6][8]
Planning	2 / 4	An optional structured planning workflow lets operators review the agent's implementation plan before code generation begins; regular prompts execute multi-step plans within the session without intermediate approval. [6][10]
Tool Execution	2 / 4	The agent controls filesystem, Node.js server, npm, terminal, and browser console inside the WebContainer; execution is sandboxed in the browser with no host-level access. [8][9]
Orchestration	2 / 4	Multi-step task execution within a single user-supervised session; no background processes, scheduling, or daemon operation exists on the documented default. [6][10]
Inter-Agent	1 / 4	No inter-agent communication by default; MCP connectors are opt-in and require explicit authentication before external tool actions are enabled. [11]
Output Processing	1 / 4	Basic credential handling through environment variable protection from Viewers and an automatic security audit on publish, but no data-loss prevention on in-session agent outputs. [1][12]
Configuration	2 / 4	Project configuration via claude.md files loaded with the project, team prompts defined by administrators, and MCP connector setup through authenticated credentials. [6][11]

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Bolt AI on its default configuration ingests content from connected repositories and MCP tools, accesses project credentials and database configurations, and deploys generated applications to external hosting without crossing an agent-level input or output control.

Lethal Trifecta · Complete (3 of 3)

Bolt AI exhibits all three of these conditions in its documented default configuration:

Untrusted input — Imported GitHub repositories and MCP connector payloads carry content authored by parties outside the operator's direct control into the reasoning loop. [6][11]
Sensitive data — The agent accesses environment variables, database credentials, and private repository content within the project scope during execution. [12][13]
External egress — Generated applications deploy to external hosting via Bolt Cloud or Netlify, and code syncs to GitHub, sending project content outside the browser sandbox. [6][10]

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. The browser-based WebContainer holds every blast factor within the scoped-execution band, blocking any path to host-level privilege or unrestricted system access.

Blast Radius Metrics

Code execution2

Credential access2

File system access2

Autonomous action2

Network access2

Deployment access2

All six factors sit at the same moderate band because the WebContainer browser sandbox applies a single containment boundary that scopes code, files, and network identically within the browser tab.

The uniform moderate scores reflect the single browser sandbox boundary that contains all agent capabilities equally within the WebContainer tab.

Factor	Score	Comments
Code execution	2 / 4	Node.js execution scoped to the WebContainer with no access to host shell, root, or system-level processes; browser sandbox prevents privilege escalation. [8][9]
File system access	2 / 4	Read-write access within the project directory inside the WebContainer; no path traversal to host filesystem is possible from the browser sandbox. [8][9]
Network access	2 / 4	Outbound HTTP requests route through the browser ServiceWorker API within the WebContainer; no direct TCP, DNS rebinding, or metadata-service access is documented. [8][9]
Credential access	2 / 4	Environment variables and database secrets are accessible within the project scope; Viewer role is restricted from viewing them, but Editors and Co-owners have full access. [12][13]
Autonomous action	2 / 4	All actions are user-initiated via chat prompts; the agent auto-generates code and creates databases during a session but deploy and publish require explicit operator confirmation. [6][10]
Deployment access	2 / 4	Deploy capability to Bolt Cloud and Netlify with operator approval required; team administrators can opt in to restricting available deployment providers, but the restriction is not active by default. [6][10]

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor ships browser-based execution isolation and a publish-time security audit as defaults, while input filtering and session monitoring remain entirely operator-managed.

Defense Controls Metrics

Input Guardrails0

Execution Isolation2

Action Controls1

Output Guardrails1

Monitoring0

Higher scores indicate stronger vendor-provided safeguards; execution isolation contributes the bulk of the defense posture while three of five components sit at the floor.

Execution isolation carries the defense posture while input filtering, action gating, output scanning, and monitoring each require operator intervention to reach a defended state.

Component	Score	Comments
Input Guardrails	0 / 3	No agent-level prompt shield, content filter, or injection detection is documented; input validation relies entirely on the underlying Claude model's built-in safety. [6][14]
Execution Isolation	2 / 3	WebContainers run inside the browser security sandbox using WebAssembly with cross-origin isolation enforced via COEP and COOP headers, scoping all execution to the browser tab. [8][9]
Action Controls	1 / 3	Publish and deploy require explicit operator confirmation and team admins can restrict providers, but code generation and file modifications execute without per-action approval within a session. [10][12]
Output Guardrails	1 / 3	A publish-time audit catches unprotected database tables and leaked credentials in generated code, but no proactive output filter or credential redaction operates on in-session agent outputs. [1][3][7]
Monitoring	0 / 3	No session audit trail, structured logging, or agent-level telemetry is documented; third-party comparisons confirm the absence of any audit log capability in the default configuration. [13]

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize adding input validation before the reasoning loop, establishing session audit trails, and scanning generated code before every deployment.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Input Guardrails

Policy Require security review of MCP connector configurations checking allowed actions, data access scope, and authentication method before enabling in production — counters unfiltered input from connected external tools.
Configuration Restrict MCP connector actions to read-only where write access is not required — counters the default all-actions-enabled posture on authenticated connectors.
Engineering Deploy a prompt-injection detection layer between user input and the Claude model API — counters the absence of any agent-level input filtering.

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Execution Isolation

Policy Require all generated applications to pass a security scan before production deployment — counters the gap between sandboxed development and unsandboxed hosting.
Configuration Enable the built-in security audit for every publish action and do not dismiss flagged issues — counters the risk of deploying code with known vulnerabilities.
Engineering Integrate a container-based staging environment between the WebContainer and production hosting to test generated code under realistic isolation constraints — counters the sandbox-to-production gap.

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Action Controls

Policy Establish a team policy requiring code review of all agent-generated changes before merging to the main branch — counters unchecked file modifications within sessions.
Configuration Configure team admin deploy controls to restrict publishing to approved providers only — counters the risk of code shipping to unauthorized hosting targets.
Engineering Build a GitHub Actions workflow that blocks merges until security scans pass on agent-generated pull requests — counters the absence of per-action approval for code changes.

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Output Guardrails

Policy Mandate rotation of all API keys and database credentials after any session where secrets were accessible to the agent — counters potential credential exposure in generated code.
Configuration Move all sensitive values to server-side environment variables and disable source maps in production builds — counters the documented pattern of hardcoded secrets in generated code.
Engineering Add a pre-deploy hook that scans for exposed credentials, missing security headers, and insecure CORS configurations — counters the limited scope of the built-in publish-time audit.

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Monitoring

Policy Establish a log-retention policy that requires all agent session transcripts to be exported and archived for forensic review — counters the complete absence of audit trails.
Configuration Forward GitHub sync events and deploy actions to a centralized logging system to create a minimum viable audit trail — counters the lack of native session monitoring.
Engineering Build a webhook receiver for GitHub sync events and deployment callbacks to emit structured telemetry for every code push and deploy action — counters the absence of any built-in observability on the cloud-hosted platform.

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

Bolt.new Security Scan Results Independent scan of 30 bolt.new projects finding 27% with critical issues and an average score of 66.4/100, documenting hardcoded secrets and missing RLS in generated code.

Selected Research

Vibe-Coded Apps Shadow AI Crisis Escape.tech study scanning 5600 vibe-coded apps including bolt.new samples, finding over 2000 vulnerabilities and 400 exposed secrets.
Escape.tech Vibe-Coding Vulnerability Methodology Methodology behind the Escape.tech scan of vibe-coded apps with bolt.new in the expanded target set alongside Lovable and Base44.
QueryIPI Indirect Prompt Injection on Coding Agents Class-level research on query-agnostic indirect prompt injection against coding agents via tool descriptions achieving 87% success rate; not tested against Bolt AI specifically.
MCP Tool Poisoning via Metadata MCP tool poisoning attacks where malicious instructions in tool metadata exfiltrate credentials and hijack actions.

Vendor Documentation

Introduction to Bolt Vendor product overview of Bolt AI as an AI-powered builder with integrated hosting and databases.
Choose an Agent Vendor documentation of Standard and Max agent tiers with Claude model selection.
Introducing WebContainers StackBlitz WebContainer architecture providing browser-based Node.js execution and virtualized networking.
Bolt.new Trust Center Compliance certifications including SOC 2 Type I and II, GDPR, and CCPA.
Bolt Release Notes Vendor changelog with security reviews on publish, MCP connectors, and admin deploy controls.
Bolt Connectors via MCP MCP-based connectors providing integrations with Notion, Linear, GitHub, Miro, Sentry, and more.
Share Your Project Viewer, Editor, and Co-owner permission model with environment variable protection from Viewers.

Other Sources

Why Secure Vibe Coding Is Harder Than It Looks Comparative analysis rating Bolt.new as having minimal access controls, no audit logs, and no on-prem option.
OWASP LLM Prompt Injection Prevention Defense-in-depth patterns for prompt injection prevention including input screening and output filtering.