1 Key Risks
The most critical security risks an operator inherits when deploying this agent in its documented default configuration. The dominant risks concentrate in unfiltered multi-channel input ingestion, unsandboxed shell execution, and the absence of per-action approval gates on a persistent daemon runtime.
2 AIRQ Scores
The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Nanobot lands in the lower-defense band with a high attack surface driven by critical-severity bridge and email-channel CVEs, offset only by basic file-based logging.
The agent sits in the Exposed Giants quadrant where a high attack surface meets a moderate blast radius, and vendor-shipped defense controls contribute almost nothing to the composite score.
Attack Surface and Blast Radius are each scored out of ten, Defense Controls out of fifteen, and the AIRQ composite weights defense as a multiplier on capability per unit of risk.
| Metric | Score | Comments |
|---|---|---|
| AIRQ Score | 3.14 | The composite reflects a high-exposure runtime where vendor defenses contribute almost no risk reduction, leaving hardening entirely to the operator. |
| Blast Radius | 6.75 / 10 | Full shell and file system access on operator hardware drive the blast radius; the single-operator local runtime lacks dedicated deployment or infrastructure tools. [6] |
| Attack Surface | 8.28 / 10 | Multiple critical-severity CVEs against the bridge and email channel push the attack surface into the upper band with the trifecta fully triggered. [2][3] |
| Defense Controls | 1 / 15 | A single point of defense from basic file logging leaves the operator with no vendor-shipped detection, containment, or approval controls. [10] |
3 Attack Surface
Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The dominant exposures are unfiltered multi-channel input ingestion, an unsandboxed shell runtime with trivially bypassed command filtering, and communication bridges with critical-severity authentication flaws. [8]
Five of ten surfaces sit at the adjusted ceiling and every surface scores at least three, reflecting a broad ingestion fan-in with no single hardened chokepoint.
Each row maps an entry point to its base architectural band, any agent-specific evidence penalty, and a brief comment citing the strongest anchor.
| Surface | Score | Comments |
|---|---|---|
| User Input | 5 / 4 | Email polling, messaging platforms, WebSocket, and MCP inputs reach the agent with no prompt shield; CVE-2026-33654 proved zero-click injection via a crafted email. [2] |
| External Data | 5 / 4 | The email channel auto-ingests attacker-authored content as trusted input, bypassing channel isolation and enabling arbitrary tool invocation without operator interaction. [4] |
| Memory | 3 / 4 | Persistent SOUL, USER, and MEMORY markdown files survive restarts with automated Dream writes and no integrity verification or poisoning detection. [12] |
| Reasoning | 3 / 4 | Model-agnostic architecture delegates reasoning to interchangeable external LLMs with no chain-of-thought audit or alignment verification mechanism. [11] |
| Planning | 3 / 4 | Dream cron scheduling and gateway daemon mode enable autonomous multi-step planning without operator-visible plan review or approval gates. [11] |
| Tool Execution | 5 / 4 | ExecTool provides full shell via create_subprocess_shell with a regex blocklist that independent testing bypassed in every attempt. [6] |
| Orchestration | 3 / 4 | Gateway daemon manages all enabled channels and Dream cron tasks as a persistent background service with systemd auto-restart. [12] |
| Inter-Agent | 5 / 4 | The WhatsApp bridge WebSocket was bound to all interfaces without authentication, allowing remote session hijack and message interception. [1][7] |
| Output Processing | 3 / 4 | Agent output flows through messaging channels and shell without credential redaction, URL sanitization, or data-loss controls. [10] |
| Configuration | 5 / 4 | The WhatsApp bridge shipped with token authentication disabled by default and no Origin validation, enabling cross-site WebSocket hijacking. [3][5] |
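The summary counts stated before the table can be checked against its rows. The following is a minimal sketch with the scores transcribed by hand; the dictionary and constant names are illustrative, and treating 5 as the adjusted ceiling is an assumption based on it being the highest score shown.

```python
# Adjusted scores transcribed from the Attack Surface table above.
surface_scores = {
    "User Input": 5, "External Data": 5, "Memory": 3, "Reasoning": 3,
    "Planning": 3, "Tool Execution": 5, "Orchestration": 3,
    "Inter-Agent": 5, "Output Processing": 3, "Configuration": 5,
}

# Assumption: 5 is the adjusted ceiling (base band of 4 plus evidence penalty).
ADJUSTED_CEILING = 5

at_ceiling = sorted(
    name for name, score in surface_scores.items() if score == ADJUSTED_CEILING
)
lowest = min(surface_scores.values())

print(f"{len(at_ceiling)} of {len(surface_scores)} surfaces at ceiling: {at_ceiling}")
print(f"lowest surface score: {lowest}")
```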
The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Nanobot ingests attacker-crafted email and messaging content, reads the operator's file system and plaintext API keys, and transmits through messaging channels and shell-driven HTTP without crossing any vendor-shipped control.
Nanobot exhibits all three of these conditions in its documented default configuration:
- Untrusted input — Email polling, messaging platforms, and MCP tool outputs feed untrusted bytes directly into the agent loop. [2][4]
- Sensitive data — The agent reads the full home directory, plaintext API keys in config.json, and persistent memory files by default. [10][6]
- External egress — Messaging channels, shell-driven outbound HTTP, and web search send bytes outside the operator's trust boundary. [1][12]
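The trifecta condition is a plain conjunction, which is why removing any single leg breaks the exfiltration chain. A minimal sketch follows; the class and function names are hypothetical and are not part of Nanobot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionCapabilities:
    """Hypothetical description of what one agent session can do."""
    processes_untrusted_content: bool
    accesses_private_data: bool
    communicates_externally: bool


def lethal_trifecta_triggered(caps: SessionCapabilities) -> bool:
    """True only when all three conditions hold in the same session."""
    return (
        caps.processes_untrusted_content
        and caps.accesses_private_data
        and caps.communicates_externally
    )


# Default configuration as described in this profile: all three legs present.
default_session = SessionCapabilities(True, True, True)

# Example mitigation: cutting external egress removes one leg.
no_egress_session = SessionCapabilities(True, True, False)

print(lethal_trifecta_triggered(default_session))
print(lethal_trifecta_triggered(no_egress_session))
```

Any hardening step that reliably removes one of the three conditions is therefore enough to leave the trifecta untriggered, even if the other two remain.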
4 Blast Radius
The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. Compromise of the agent equates to operator-level access to the host shell, file system, network, and stored credentials, constrained only by the absence of dedicated deployment tooling.
Five of six factors score at the upper band and only deployment access remains low, reflecting a single-operator local runtime with broad host authority.
Each row ties a blast factor to the documented default capability the agent holds over the operator's environment.
| Factor | Score | Comments |
|---|---|---|
| Code execution | 3 / 4 | ExecTool provides full shell with operator-level privileges; the regex blocklist is trivially bypassed per independent testing. [6][13] |
| File system access | 3 / 4 | Path traversal confirmed with the majority of test operations succeeding; restrictToWorkspace is false by default. [6] |
| Network access | 3 / 4 | Unrestricted outbound HTTP via web search and MCP tools; basic SSRF protection documented but no domain allowlist. [11] |
| Credential access | 3 / 4 | API keys stored in plaintext config.json accessible to the agent; vendor acknowledges keys are in plain text by default. [10] |
| Autonomous action | 3 / 4 | Gateway daemon runs persistently with cron-scheduled Dream; no per-action operator approval is required or available. [12] |
| Deployment access | 1 / 4 | No dedicated deployment, publish, or infrastructure modification tools are built in; shell access is generic. [14] |
5 Defense Controls
Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor publishes security guidance but ships almost no default-on controls; sandboxing, workspace restriction, and bridge authentication are all opt-in. [10]
Unlike the attack surface and blast radius scales, this scale is inverted: higher scores mean stronger vendor safeguards. The near-zero total reflects a runtime where nearly every control is operator-managed.
Each component is scored on what the vendor implements by default versus what the operator must configure after deployment.
| Component | Score | Comments |
|---|---|---|
| Input Guardrails | 0 / 3 | No prompt shield, content filter, or injection detection ships by default; vendor documentation lists this as a known limitation. [10] |
| Execution Isolation | 0 / 3 | No sandbox by default; bubblewrap is opt-in and Linux-only, leaving the host exposed on the documented default configuration. [11] |
| Action Controls | 0 / 3 | No per-action approval gate; the regex command blocklist is the only barrier and independent testing bypassed it completely. [6] |
| Output Guardrails | 0 / 3 | No DLP, credential redaction, or exfiltration blocking; vendor acknowledges logs may contain sensitive information. [10] |
| Monitoring | 1 / 3 | Basic file-based logging and failed-auth logging exist but no SIEM integration, alerting, or anomaly detection ships by default. [10] |
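Because most controls are opt-in, an operator may want to check a deployment's configuration for the risky defaults named in this profile. The sketch below is illustrative only. The setting names restrictToWorkspace, tools.exec.enable, and allowFrom come from this report, but their nesting inside config.json, the `channels` and `enabled` keys, and the fallback values for missing keys are assumptions, and `audit_config` is a hypothetical helper.

```python
from typing import Any


def audit_config(config: dict[str, Any]) -> list[str]:
    """Return findings for risky settings in an ASSUMED config layout."""
    findings = []
    tools = config.get("tools", {})

    # Assumption: a missing key means the setting is off, matching the
    # report's statement that workspace restriction is false by default.
    if not tools.get("restrictToWorkspace", False):
        findings.append("restrictToWorkspace is off")

    # Assumption: shell execution is on unless explicitly disabled.
    if tools.get("exec", {}).get("enable", True):
        findings.append("tools.exec.enable is on")

    for name, channel in config.get("channels", {}).items():
        if channel.get("enabled") and not channel.get("allowFrom"):
            findings.append(f"channel '{name}' has no allowFrom allowlist")
    return findings


default_like = {
    "tools": {"restrictToWorkspace": False, "exec": {"enable": True}},
    "channels": {"email": {"enabled": True, "allowFrom": []}},
}
for finding in audit_config(default_like):
    print("FINDING:", finding)
```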
6 Hardening Tips
Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. The highest-leverage changes are enabling the bubblewrap sandbox, activating workspace restriction, and deploying an external prompt injection classifier ahead of the agent loop.
Input Guardrails
Input guardrails intercept adversarial content before it reaches the reasoning loop.
- Policy: Require all messaging channels to use the allowFrom allowlist and disable email polling unless the operator verifies SPF and DKIM on inbound messages. Counters User Input at the adjusted ceiling. [2]
- Configuration: Configure allowFrom on every enabled channel to restrict input to known operator identities. Counters the zero-click email injection demonstrated by CVE-2026-33654. [2]
- Engineering: Deploy an external prompt injection classifier upstream of the agent loop to filter adversarial payloads before they reach the reasoning engine. Counters the absence of any input guardrail. [9]
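The allowlist idea behind the tips above can be sketched as a gate that runs before any message reaches the agent loop. This is a generic illustration, not Nanobot's allowFrom implementation; `is_allowed_sender` and the example addresses are hypothetical. A From header is trivially spoofable, which is why the policy tip pairs the allowlist with SPF and DKIM verification.

```python
from email.utils import parseaddr


def is_allowed_sender(from_header: str, allow_from: set[str]) -> bool:
    """Accept a message only if its parsed sender address is allowlisted."""
    _, address = parseaddr(from_header)
    normalized_allowlist = {entry.lower() for entry in allow_from}
    # An empty address means the header could not be parsed: reject it.
    return bool(address) and address.lower() in normalized_allowlist


ALLOW_FROM = {"operator@example.com"}  # hypothetical operator identity

print(is_allowed_sender("Operator <Operator@Example.com>", ALLOW_FROM))
print(is_allowed_sender('"operator@example.com" <attacker@evil.test>', ALLOW_FROM))
```

The second call shows why the check uses the parsed address rather than a substring match: an attacker can place an allowlisted address in the display name.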
Execution Isolation
Execution isolation contains what a compromised agent can do on the host.
- Policy: Require all production deployments to run inside the Docker container with the non-root user and bubblewrap sandbox enabled. Counters Execution Isolation at zero. [13]
- Configuration: Enable bubblewrap sandboxing and workspace restriction in config.json to confine shell and file operations to the workspace directory. Counters the default unsandboxed runtime. [11]
- Engineering: Integrate OS-level sandboxing via Landlock or seccomp-bpf to enforce kernel-level filesystem and network boundaries per tool call. Counters regex-only command filtering. [13]
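To make the confinement goal concrete, the sketch below builds a bubblewrap command line that exposes a read-only /usr, a writable workspace, and no network. It illustrates the technique and does not describe how Nanobot invokes bubblewrap. `build_bwrap_argv` is a hypothetical helper, the symlinks assume a merged-/usr Linux layout, and the function only constructs the argument list without running it.

```python
from pathlib import Path


def build_bwrap_argv(workspace: str, command: list[str]) -> list[str]:
    """Build a bwrap argv that confines `command` to `workspace`."""
    if not command:
        raise ValueError("command must not be empty")
    ws = str(Path(workspace).resolve())
    return [
        "bwrap",
        "--ro-bind", "/usr", "/usr",       # system binaries, read-only
        "--symlink", "usr/bin", "/bin",    # assumes a merged-/usr distro
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",                 # private, empty /tmp
        "--bind", ws, ws,                  # the only writable host path
        "--unshare-all",                   # new namespaces, including network
        "--die-with-parent",               # sandbox exits with the agent
        "--chdir", ws,
        *command,
    ]


argv = build_bwrap_argv("/tmp/agent-workspace", ["python3", "task.py"])
print(" ".join(argv))
```

Because the home directory is never bound into the sandbox, files such as a plaintext config.json outside the workspace are not visible to the confined command.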
Action Controls
Action controls govern which tools and actions the agent can invoke autonomously.
- Policy: Establish an internal policy requiring operator review of all shell commands before execution in production environments. Counters the absence of any approval gate. [6]
- Configuration: Set tools.exec.enable to false for channels that do not require shell access, limiting the tool surface exposed to untrusted input. Counters Action Controls at zero. [11]
- Engineering: Build a per-command approval hook into the ExecTool that pauses execution and notifies the operator before running shell commands. Counters the complete absence of approval gates. [8]
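A per-command approval gate can be sketched as a wrapper that consults an operator-supplied callback before anything is executed. This is a generic illustration and not ExecTool's interface; `run_with_approval` and the `approve` callback are hypothetical. The sketch also avoids a shell entirely by splitting the command into an argument list, so metacharacters such as `;` or `|` are not interpreted.

```python
import shlex
import subprocess
import sys
from typing import Callable, Optional


def run_with_approval(
    command: str,
    approve: Callable[[str], bool],
    timeout: float = 30.0,
) -> Optional[subprocess.CompletedProcess]:
    """Run `command` only if the approval callback returns True."""
    argv = shlex.split(command)
    if not argv or not approve(command):
        return None  # denied or empty: nothing is executed
    # No shell=True: the argv list is passed straight to the OS.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )


# Example: a harmless command built from the current interpreter.
demo_command = shlex.join([sys.executable, "-c", "print('ok')"])

denied = run_with_approval(demo_command, approve=lambda cmd: False)
approved = run_with_approval(demo_command, approve=lambda cmd: True)
print(denied, approved.stdout.strip() if approved else None)
```

In a real deployment the callback would block on an out-of-band operator decision (for example a chat prompt or ticket) rather than a lambda.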
Output Guardrails
Output guardrails inspect what the agent sends to other systems and users.
- Policy: Define data classification rules that prohibit the agent from emitting API keys, tokens, or credentials through any output channel. Counters Output Guardrails at zero. [10]
- Configuration: Configure the WhatsApp bridge with BRIDGE_TOKEN authentication enabled and validate Origin headers on WebSocket connections. Counters the bridge hijack demonstrated by CVE-2026-2577. [5]
- Engineering: Wire a DLP scanner into the agent output path to redact credentials and block exfiltration through messaging channels and shell stdout. Counters the total absence of output filtering. [10]
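The redaction step of such a scanner can be sketched with a few regular expressions applied to outbound text. The patterns below are illustrative assumptions covering common token shapes and are far from exhaustive; `redact_secrets` is a hypothetical helper, not a Nanobot feature, and a production DLP layer would need broader detection than pattern matching.

```python
import re

# Illustrative patterns only; real deployments need a maintained rule set.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),                  # "sk-" style API keys
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                     # AWS access key IDs
    re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]{16,}=*"),  # bearer tokens
]


def redact_secrets(text: str) -> tuple[str, int]:
    """Replace credential-like substrings; return (clean_text, hit_count)."""
    hits = 0
    for pattern in SECRET_PATTERNS:
        text, count = pattern.subn("[REDACTED]", text)
        hits += count
    return text, hits


outbound = "config says key=sk-" + "a" * 24 + " and id AKIAABCDEFGHIJKLMNOP"
clean, hits = redact_secrets(outbound)
print(clean, hits)
```

A nonzero hit count can also be used as a signal to block the message outright instead of sending the redacted version.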
Monitoring
Monitoring captures what the agent did and surfaces anomalies for review.
- Policy: Require centralized log collection and periodic review of agent action logs as part of the operational security program. Counters the lack of active monitoring. [10]
- Configuration: Forward nanobot.log to a SIEM or log aggregation service and configure alerts for failed authentication attempts and unexpected tool invocations. Counters basic logging with no alerting. [10]
- Engineering: Instrument the agent loop with OpenTelemetry spans to capture per-tool-call latency, token usage, and anomalous command patterns. Counters the absence of anomaly detection. [14]
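Before a full SIEM integration exists, a threshold check over the log file is a simple stopgap. The sketch below counts marker strings and reports which categories exceed a limit. The format of nanobot.log is not described in this profile, so the marker strings, sample lines, and thresholds are all assumptions, and `summarize_log` and `alerts` are hypothetical helpers.

```python
from collections import Counter
from typing import Iterable

# Assumed marker substrings; adjust to the real log format.
ALERT_MARKERS = {
    "failed_auth": "authentication failed",
    "shell_exec": "exectool",
}


def summarize_log(lines: Iterable[str]) -> Counter:
    """Count log lines that contain each marker (case-insensitive)."""
    counts: Counter = Counter()
    for line in lines:
        lowered = line.lower()
        for label, marker in ALERT_MARKERS.items():
            if marker in lowered:
                counts[label] += 1
    return counts


def alerts(counts: Counter, thresholds: dict[str, int]) -> list[str]:
    """Return the labels whose count meets or exceeds its threshold."""
    return sorted(
        label for label, limit in thresholds.items() if counts[label] >= limit
    )


sample_lines = [
    "WARN bridge: authentication failed from 203.0.113.7",
    "WARN bridge: authentication failed from 203.0.113.7",
    "INFO ExecTool: running command",
    "INFO gateway: heartbeat",
]
counts = summarize_log(sample_lines)
print(alerts(counts, {"failed_auth": 2, "shell_exec": 5}))
```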
7 References
The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [1][7]) refer to entries below, listed in citation order.
Selected Vulnerabilities
- [1] CVE-2026-2577. WhatsApp bridge WebSocket session hijack (CVSS 10.0); patched in v0.1.3.post7.
- [2] CVE-2026-33654. Zero-click indirect prompt injection via email polling (CVSS 9.8); patched in v0.1.6.
- [3] CVE-2026-35589. Cross-Site WebSocket Hijacking in WhatsApp bridge (CVSS 9.3); patched in v0.1.5.
- [4] GHSA-4gmr-2vc8-7qh3. Email channel prompt injection exploit chain from IMAP polling to ExecTool invocation.
- [5] GHSA-v5j3-4q66-58cf. CSWSH vulnerability with incomplete remediation of CVE-2026-2577 in bridge WebSocket.
- [6] Security audit findings. Community security audit documenting shell injection bypasses, path traversal, and sandbox limitations.
Selected Research
- [7] Tenable TRA-2026-09. Independent advisory on unauthenticated WhatsApp bridge vulnerability with disclosure timeline.
- [8] AgentSentry: Mitigating Indirect Prompt Injection. Class-level defense framework for tool-augmented LLM agents facing indirect prompt injection.
- [9] VPI-Bench: Visual Prompt Injection for Computer-Use Agents. Class-level benchmark of visual prompt injection attacks against computer-use agents.
Vendor Documentation
- [10] Nanobot Security Policy. Vendor security overview documenting disclosure process and known security limitations.
- [11] Nanobot Configuration Reference. Vendor documentation covering tool settings, MCP integration, sandbox opt-in, and channel access control.
- [12] Nanobot Memory and Dream. Vendor documentation for persistent memory, Dream consolidation, and automated skill discovery.
Other Sources
- [13] OS-level sandboxing proposal. Community proposal documenting bypassable regex filtering and proposing Landlock/seccomp-bpf enforcement.
- [14] Nanobot Security and Sandboxing Analysis. Third-party analysis of defense-in-depth architecture covering sandboxing and command filtering.