1 Key Risks

This section summarizes the most critical security risks an operator inherits when deploying this agent in its documented default configuration. The dominant risks are unfiltered multi-channel input ingestion, unsandboxed shell execution, and the absence of per-action approval gates on a persistent daemon runtime.

Key Input Risks

Untrusted content from email polling, messaging platforms, WebSocket, and MCP tool outputs reaches the reasoning loop without a prompt shield or content filter. [2] CVE-2026-33654 demonstrated zero-click indirect prompt injection via a crafted email that the agent auto-processed as trusted input, enabling arbitrary tool invocation. [4]

Key Execution Risks

ExecTool grants full shell access through create_subprocess_shell with a pattern-based command filter that a community audit defeated on every test case. [6] No sandbox ships by default; bubblewrap isolation is opt-in and Linux-only, leaving the host file system and process table exposed. [13]
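To illustrate why a pattern-based filter in front of a shell is fragile, here is a minimal sketch. The deny patterns and the `is_blocked` helper are hypothetical and are not the agent's actual blocklist; they only show the general weakness of matching raw command strings that a shell will later reinterpret.

```python
import re

# Hypothetical deny patterns for illustration; NOT the agent's real blocklist.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-rf\b"),
    re.compile(r"\bcurl\b"),
]


def is_blocked(command: str) -> bool:
    """Return True if any deny pattern matches the raw command string."""
    return any(pattern.search(command) for pattern in DENY_PATTERNS)


# The literal spelling is caught by the first pattern.
direct = "rm -rf /tmp/data"

# Equivalent shell spellings never match either pattern.
reordered = "rm -r -f /tmp/data"          # same flags, split apart
quoted = "c'u'rl http://example.invalid"  # the shell removes the quotes before running curl

for candidate in (direct, reordered, quoted):
    print(candidate, "->", "blocked" if is_blocked(candidate) else "allowed")
```

The filter sees text while the shell sees a parsed command, so any rewriting the shell performs (quoting, flag reordering, variable expansion) happens after the check. This is the general reason kernel-level or sandbox-level enforcement tends to be preferred over string matching.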

Key Action Risks

No per-action approval gate exists in the default configuration; the agent executes shell commands, file operations, and MCP tool calls without operator confirmation. [12] The gateway daemon runs autonomously around the clock, and Dream cron tasks fire without human review.

Key Output Risks

No outbound data-loss prevention, credential scrubbing, or URL filtering exists on any default output channel including messaging, shell, and WebSocket. [10] CVE-2026-2577 demonstrated that the WhatsApp bridge session could be hijacked to intercept and send messages on behalf of the operator. [1]

Key Monitoring Risks

Basic file-based logging exists but vendor documentation lists no audit trail, no active monitoring, and limited security event logging as known limitations. [10] No SIEM integration, no anomaly detection, and no compliance-grade observability ship with the default configuration. [14]

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Nanobot lands in the lower-defense band with a high attack surface driven by critical-severity bridge and email-channel CVEs, offset only by basic file-based logging.

AIRQ Metrics

AIRQ Score: 3.14
Blast Radius: 6.75
Attack Surface: 8.28
Defense Controls: 1

The agent sits in the Exposed Giants quadrant where a high attack surface meets a moderate blast radius, and vendor-shipped defense controls contribute almost nothing to the composite score.

Attack Surface and Blast Radius are each scored out of ten, Defense Controls out of fifteen, and the AIRQ composite weights defense as a multiplier on capability per unit of risk.

Metric	Score	Comments
AIRQ Score	3.14	The composite reflects a high-exposure runtime where vendor defenses contribute almost no risk reduction, leaving hardening entirely to the operator.
Blast Radius	6.75 / 10	Full shell and file system access on operator hardware drive the blast radius; the single-operator local runtime lacks dedicated deployment or infrastructure tools. [6]
Attack Surface	8.28 / 10	Multiple critical-severity CVEs against the bridge and email channel push the attack surface into the upper band with the trifecta fully triggered. [2][3]
Defense Controls	1 / 15	A single point of defense from basic file logging leaves the operator with no vendor-shipped detection, containment, or approval controls. [10]

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The dominant exposures are unfiltered multi-channel input ingestion, an unsandboxed shell runtime with trivially bypassed command filtering, and communication bridges with critical-severity authentication flaws. [8]

Attack Surface Metrics

User Input: 5
Tool Execution: 5
External Data: 5
Orchestration: 3
Memory: 3
Inter-Agent: 5
Reasoning: 3
Output Processing: 3
Planning: 3
Configuration: 5

Five of ten surfaces sit at the adjusted ceiling and every surface scores at least three, reflecting a broad ingestion fan-in with no single hardened chokepoint.

Each row maps an entry point to its base architectural band, any agent-specific evidence penalty, and a brief comment citing the strongest anchor.

Surface	Score	Comments
User Input	5 / 4	Email polling, messaging platforms, WebSocket, and MCP inputs reach the agent with no prompt shield; CVE-2026-33654 proved zero-click injection via a crafted email. [2]
External Data	5 / 4	The email channel auto-ingests attacker-authored content as trusted input, bypassing channel isolation and enabling arbitrary tool invocation without operator interaction. [4]
Memory	3 / 4	Persistent SOUL, USER, and MEMORY markdown files survive restarts with automated Dream writes and no integrity verification or poisoning detection. [12]
Reasoning	3 / 4	Model-agnostic architecture delegates reasoning to interchangeable external LLMs with no chain-of-thought audit or alignment verification mechanism. [11]
Planning	3 / 4	Dream cron scheduling and gateway daemon mode enable autonomous multi-step planning without operator-visible plan review or approval gates. [11]
Tool Execution	5 / 4	ExecTool provides full shell via create_subprocess_shell with a regex blocklist that independent testing bypassed in every attempt. [6]
Orchestration	3 / 4	Gateway daemon manages all enabled channels and Dream cron tasks as a persistent background service with systemd auto-restart. [12]
Inter-Agent	5 / 4	The WhatsApp bridge WebSocket bound to all interfaces without authentication, allowing remote session hijack and message interception. [1][7]
Output Processing	3 / 4	Agent output flows through messaging channels and shell without credential redaction, URL sanitization, or data-loss controls. [10]
Configuration	5 / 4	The WhatsApp bridge shipped with token authentication disabled by default and no Origin validation, enabling cross-site WebSocket hijacking. [3][5]

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Nanobot ingests attacker-crafted email and messaging content, reads the operator's file system and plaintext API keys, and transmits through messaging channels and shell-driven HTTP without crossing any vendor-shipped control.

Lethal Trifecta · Complete (3 of 3)

Nanobot exhibits all three of these conditions in its documented default configuration:

Untrusted input — Email polling, messaging platforms, and MCP tool outputs feed untrusted bytes directly into the agent loop. [2][4]
Sensitive data — The agent reads the full home directory, plaintext API keys in config.json, and persistent memory files by default. [10][6]
External egress — Messaging channels, shell-driven outbound HTTP, and web search send bytes outside the operator's trust boundary. [1][12]
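The three conditions can be expressed as a simple check that an operator might run against a deployment inventory. This is a minimal sketch: the `SessionCapabilities` type and its field names are hypothetical and exist only to make the rule concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionCapabilities:
    """Hypothetical summary of what one agent session can do."""
    reads_untrusted_input: bool   # e.g. email polling, messaging, MCP tool outputs
    reads_sensitive_data: bool    # e.g. home directory, plaintext API keys
    can_egress: bool              # e.g. messaging channels, outbound HTTP


def trifecta_count(caps: SessionCapabilities) -> int:
    """Count how many of the three trifecta conditions hold."""
    return sum([caps.reads_untrusted_input, caps.reads_sensitive_data, caps.can_egress])


def trifecta_complete(caps: SessionCapabilities) -> bool:
    """The trifecta is complete only when all three conditions hold together."""
    return trifecta_count(caps) == 3


# The documented default configuration, as described in this profile.
default_profile = SessionCapabilities(True, True, True)

# A hardened variant in which outbound channels are disabled.
no_egress = SessionCapabilities(True, True, False)

print(trifecta_count(default_profile), trifecta_complete(default_profile))
print(trifecta_count(no_egress), trifecta_complete(no_egress))
```

Removing any one condition breaks the full chain, which is why the hardening tips target input filtering, data access, and egress separately.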

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. Compromise of the agent equates to operator-level access to the host shell, file system, network, and stored credentials, constrained only by the absence of dedicated deployment tooling.

Blast Radius Metrics

Code execution: 3
Credential access: 3
File system access: 3
Autonomous action: 3
Network access: 3
Deployment access: 1

Five of six factors score at the upper band and only deployment access remains low, reflecting a single-operator local runtime with broad host authority.

Each row ties a blast factor to the documented default capability the agent holds over the operator's environment.

Factor	Score	Comments
Code execution	3 / 4	ExecTool provides full shell with operator-level privileges; the regex blocklist is trivially bypassed per independent testing. [6][13]
File system access	3 / 4	Path traversal confirmed with the majority of test operations succeeding; restrictToWorkspace is false by default. [6]
Network access	3 / 4	Unrestricted outbound HTTP via web search and MCP tools; basic SSRF protection documented but no domain allowlist. [11]
Credential access	3 / 4	API keys stored in plaintext config.json accessible to the agent; vendor acknowledges keys are in plain text by default. [10]
Autonomous action	3 / 4	Gateway daemon runs persistently with cron-scheduled Dream; no per-action operator approval is required or available. [12]
Deployment access	1 / 4	No dedicated deployment, publish, or infrastructure modification tools are built in; shell access is generic. [14]
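For readers unfamiliar with what a workspace restriction has to do to stop path traversal, the sketch below shows one common approach: resolve the requested path and confirm it still sits under the workspace root. The function name and example paths are hypothetical, and this is not a description of how restrictToWorkspace is implemented.

```python
from pathlib import Path


def is_inside_workspace(workspace: str, requested: str) -> bool:
    """Return True only if `requested` resolves to a location under `workspace`.

    resolve() collapses '..' segments and follows symlinks, so traversal
    attempts are judged by where they actually land.
    """
    root = Path(workspace).resolve()
    # Joining with an absolute path discards the root, so absolute inputs
    # are evaluated as-is and rejected unless they fall inside the workspace.
    target = (root / requested).resolve()
    return target == root or root in target.parents


# Hypothetical workspace location used only for this example.
WORKSPACE = "/srv/agent/workspace"

print(is_inside_workspace(WORKSPACE, "notes/todo.md"))
print(is_inside_workspace(WORKSPACE, "../../etc/passwd"))
print(is_inside_workspace(WORKSPACE, "/etc/passwd"))
```

A check like this only constrains file tools that call it; a shell tool running on the same host can still reach any path, which is why file restriction and execution isolation are treated as separate controls.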

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor publishes security guidance but ships almost no default-on controls; sandboxing, workspace restriction, and bridge authentication are all opt-in. [10]

Defense Controls Metrics

Input Guardrails: 0
Execution Isolation: 0
Action Controls: 0
Output Guardrails: 0
Monitoring: 1

This scale is inverted relative to the risk metrics: higher scores mean stronger vendor safeguards. The near-zero total reflects a runtime where nearly every control is operator-managed.

Each component is scored on what the vendor implements by default versus what the operator must configure after deployment.

Component	Score	Comments
Input Guardrails	0 / 3	No prompt shield, content filter, or injection detection ships by default; vendor documentation lists this as a known limitation. [10]
Execution Isolation	0 / 3	No sandbox by default; bubblewrap is opt-in and Linux-only, leaving the host exposed on the documented default configuration. [11]
Action Controls	0 / 3	No per-action approval gate; the regex command blocklist is the only barrier and independent testing bypassed it completely. [6]
Output Guardrails	0 / 3	No DLP, credential redaction, or exfiltration blocking; vendor acknowledges logs may contain sensitive information. [10]
Monitoring	1 / 3	Basic file-based logging and failed-auth logging exist but no SIEM integration, alerting, or anomaly detection ships by default. [10]
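The Defense Controls total can be reproduced from the component rows. The sketch below assumes the total is a plain sum of the five components, which matches the reported 1 out of 15; the variable names are illustrative.

```python
# Component scores copied from the table above (each scored out of 3).
DEFENSE_SCORES = {
    "Input Guardrails": 0,
    "Execution Isolation": 0,
    "Action Controls": 0,
    "Output Guardrails": 0,
    "Monitoring": 1,
}
MAX_PER_COMPONENT = 3

# Assumption: the headline figure is an unweighted sum of the components.
total = sum(DEFENSE_SCORES.values())
maximum = MAX_PER_COMPONENT * len(DEFENSE_SCORES)

print(f"Defense Controls: {total} / {maximum}")  # Defense Controls: 1 / 15
```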

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. The highest-leverage changes are enabling the bubblewrap sandbox, activating workspace restriction, and deploying an external prompt injection classifier ahead of the agent loop.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Policy: Require all messaging channels to use the allowFrom allowlist and disable email polling unless the operator verifies SPF and DKIM on inbound messages (counters User Input at adjusted ceiling). [2]
Configuration: Configure allowFrom on every enabled channel to restrict input to known operator identities (counters the zero-click email injection demonstrated by CVE-2026-33654). [2]
Engineering: Deploy an external prompt injection classifier upstream of the agent loop to filter adversarial payloads before they reach the reasoning engine (counters the absence of any input guardrail). [9]
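The allowlist idea behind these tips can be sketched as a gate that drops messages from unknown senders before anything reaches the agent loop. The message shape (a dict with a `sender` key) and the helper names are assumptions made for this example and do not reflect the agent's internal types.

```python
from typing import Dict, Iterable, List


def normalize_sender(sender: str) -> str:
    """Compare identities case-insensitively and ignore stray whitespace."""
    return sender.strip().lower()


def is_allowed(sender: str, allow_from: Iterable[str]) -> bool:
    """Return True only when the sender is on the operator's allowlist."""
    allowed = {normalize_sender(entry) for entry in allow_from}
    return normalize_sender(sender) in allowed


def filter_inbound(messages: List[Dict[str, str]], allow_from: Iterable[str]) -> List[Dict[str, str]]:
    """Keep only messages whose sender is explicitly allowed (default deny)."""
    return [m for m in messages if is_allowed(m["sender"], allow_from)]


# Hypothetical operator identity and inbound queue.
ALLOW_FROM = ["operator@example.com"]
inbox = [
    {"sender": "Operator@Example.com ", "body": "summarize today's notes"},
    {"sender": "attacker@example.net", "body": "ignore previous instructions"},
]

accepted = filter_inbound(inbox, ALLOW_FROM)
print([m["sender"] for m in accepted])
```

A sender string on its own can be spoofed on channels such as email, so an allowlist like this is only meaningful when paired with sender verification such as the SPF and DKIM checks named in the policy tip.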

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Policy: Require all production deployments to run inside the Docker container with the non-root user and bubblewrap sandbox enabled (counters Execution Isolation at zero). [13]
Configuration: Enable bubblewrap sandboxing and workspace restriction in config.json to confine shell and file operations to the workspace directory (counters the default unsandboxed runtime). [11]
Engineering: Integrate OS-level sandboxing via Landlock or seccomp-bpf to enforce kernel-level filesystem and network boundaries per tool call (counters regex-only command filtering). [13]
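As a rough picture of what bubblewrap confinement involves, the sketch below builds a `bwrap` command line that mounts the host read-only, makes only the workspace writable, and drops network access unless requested. It is an operator-side illustration with a hypothetical helper name, not the agent's built-in sandbox integration, and the mount layout would need tuning for a real host.

```python
from typing import List


def build_bwrap_argv(workspace: str, command: List[str], allow_network: bool = False) -> List[str]:
    """Build an argv that runs `command` under bubblewrap (Linux only)."""
    argv = [
        "bwrap",
        "--ro-bind", "/", "/",            # host filesystem visible but read-only
        "--dev", "/dev",                  # minimal device nodes
        "--proc", "/proc",                # fresh procfs for the new PID namespace
        "--tmpfs", "/tmp",                # private scratch space
        "--bind", workspace, workspace,   # the only writable host path
        "--chdir", workspace,
        "--unshare-all",                  # new user, PID, IPC, UTS, cgroup and net namespaces
        "--die-with-parent",              # kill the sandbox if the agent exits
    ]
    if allow_network:
        argv.append("--share-net")        # re-enable networking explicitly
    return argv + command


# Hypothetical workspace path used only for this example.
argv = build_bwrap_argv("/srv/agent/workspace", ["ls", "-la"])
print(" ".join(argv))
```

Note that a read-only bind of `/` still lets the sandboxed process read files such as a plaintext config.json; a stricter profile would bind only the directories the tool needs.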

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Policy: Establish an internal policy requiring operator review of all shell commands before execution in production environments (counters the absence of any approval gate). [6]
Configuration: Set tools.exec.enable to false for channels that do not require shell access, limiting the tool surface exposed to untrusted input (counters Action Controls at zero). [11]
Engineering: Build a per-command approval hook into the ExecTool that pauses execution and notifies the operator before running shell commands (counters the complete absence of approval gates). [8]
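One way to picture a per-command approval hook is a wrapper that refuses to run anything until a callback says yes. The sketch below is a generic pattern, not ExecTool's code: `run_with_approval` and the `approve` callback are hypothetical names, and a real hook would notify the operator out of band rather than call a local function.

```python
import shlex
import subprocess
from typing import Callable, Optional


def run_with_approval(
    command: str,
    approve: Callable[[str], bool],
    timeout: float = 30.0,
) -> Optional[subprocess.CompletedProcess]:
    """Run `command` only if `approve(command)` returns True.

    Returns None when the operator denies the request, so callers can
    distinguish "not run" from "ran and failed".
    """
    if not approve(command):
        return None
    # Split into argv and avoid shell=True so the approved text is what runs,
    # without a second round of shell interpretation.
    argv = shlex.split(command)
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout, check=False)


# Example: a deny-by-default approver that records what it was asked.
requests = []


def deny_all(command: str) -> bool:
    requests.append(command)
    return False


result = run_with_approval("rm -r -f /tmp/data", deny_all)
print(result, requests)
```

Passing the approver in as a parameter keeps the gate testable and lets the same wrapper serve an interactive prompt, a chat confirmation, or a ticketing workflow.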

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Policy: Define data classification rules that prohibit the agent from emitting API keys, tokens, or credentials through any output channel (counters Output Guardrails at zero). [10]
Configuration: Configure the WhatsApp bridge with BRIDGE_TOKEN authentication enabled and validate Origin headers on WebSocket connections (counters the bridge hijack demonstrated by CVE-2026-2577). [5]
Engineering: Wire a DLP scanner into the agent output path to redact credentials and block exfiltration through messaging channels and shell stdout (counters the total absence of output filtering). [10]
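A very small version of the credential-scrubbing step might look like the sketch below. The patterns are illustrative assumptions covering a few common token shapes; they are far from exhaustive, and a production DLP scanner would add entropy checks and provider-specific detectors.

```python
import re

# Illustrative patterns only; real deployments need a broader, maintained set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),                   # "sk-" prefixed API keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                        # AWS access key ID shape
    re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"),     # HTTP bearer tokens
]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything matching a secret pattern before text leaves the agent."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


sample = "Authorization: Bearer abc.def and id AKIAABCDEFGHIJKLMNOP"
print(redact(sample))
```

Applying the filter at a single choke point on the output path, rather than per channel, avoids the situation where a newly added channel ships without scrubbing.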

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Policy: Require centralized log collection and periodic review of agent action logs as part of the operational security program (counters the lack of active monitoring). [10]
Configuration: Forward nanobot.log to a SIEM or log aggregation service and configure alerts for failed authentication attempts and unexpected tool invocations (counters basic logging with no alerting). [10]
Engineering: Instrument the agent loop with OpenTelemetry spans to capture per-tool-call latency, token usage, and anomalous command patterns (counters the absence of anomaly detection). [14]
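Until a SIEM is in place, the two alert conditions named above can be approximated with a small log scanner. The log line format here (a `tool=<name>` field and the phrase "authentication failed") is an assumption made for the example; the actual layout of nanobot.log is not documented in this profile and the patterns would need adjusting to match it.

```python
import re
from collections import Counter
from typing import Iterable, List, Set

# Assumed line shapes; adjust to the real log format before use.
FAILED_AUTH = re.compile(r"auth(entication)? failed", re.IGNORECASE)
TOOL_CALL = re.compile(r"tool=(\w+)")


def scan_log(lines: Iterable[str], expected_tools: Set[str], auth_threshold: int = 3) -> List[str]:
    """Return alert strings for repeated auth failures and unexpected tools."""
    failed = 0
    unexpected: Counter = Counter()
    for line in lines:
        if FAILED_AUTH.search(line):
            failed += 1
        match = TOOL_CALL.search(line)
        if match and match.group(1) not in expected_tools:
            unexpected[match.group(1)] += 1

    alerts = []
    if failed >= auth_threshold:
        alerts.append(f"{failed} failed authentication attempts")
    for tool, count in sorted(unexpected.items()):
        alerts.append(f"unexpected tool '{tool}' invoked {count} time(s)")
    return alerts


# Hypothetical log excerpt.
sample_lines = [
    "INFO tool=read_file path=notes.md",
    "WARN authentication failed from 10.0.0.5",
    "WARN auth failed",
    "WARN Authentication failed",
    "INFO tool=exec cmd=ls",
]
alerts = scan_log(sample_lines, expected_tools={"read_file"})
print(alerts)
```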

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

[1] CVE-2026-2577: WhatsApp bridge WebSocket session hijack (CVSS 10.0); patched in v0.1.3.post7
[2] CVE-2026-33654: Zero-click indirect prompt injection via email polling (CVSS 9.8); patched in v0.1.6
[3] CVE-2026-35589: Cross-Site WebSocket Hijacking in WhatsApp bridge (CVSS 9.3); patched in v0.1.5
[4] GHSA-4gmr-2vc8-7qh3: Email channel prompt injection exploit chain from IMAP polling to ExecTool invocation
[5] GHSA-v5j3-4q66-58cf: CSWSH vulnerability with incomplete remediation of CVE-2026-2577 in bridge WebSocket
[6] Security audit findings. Community security audit documenting shell injection bypasses, path traversal, and sandbox limitations

Selected Research

[7] Tenable TRA-2026-09. Independent advisory on unauthenticated WhatsApp bridge vulnerability with disclosure timeline
[8] AgentSentry: Mitigating Indirect Prompt Injection. Class-level defense framework for tool-augmented LLM agents facing indirect prompt injection
[9] VPI-Bench: Visual Prompt Injection for Computer-Use Agents. Class-level benchmark of visual prompt injection attacks against computer-use agents

Vendor Documentation

[10] Nanobot Security Policy. Vendor security overview documenting disclosure process and known security limitations
[11] Nanobot Configuration Reference. Vendor documentation covering tool settings, MCP integration, sandbox opt-in, and channel access control
[12] Nanobot Memory and Dream. Vendor documentation for persistent memory, Dream consolidation, and automated skill discovery

Other Sources

[13] OS-level sandboxing proposal. Community proposal documenting bypassable regex filtering and proposing Landlock/seccomp-bpf enforcement
[14] Nanobot Security and Sandboxing Analysis. Third-party analysis of defense-in-depth architecture covering sandboxing and command filtering