1 Key Risks
This section summarizes the most critical security risks an operator inherits when deploying this agent in its documented default configuration. Leon's default configuration exposes an unfiltered multi-channel input surface, unrestricted host-level tool execution, and no operator-facing monitoring or approval controls.
2 AIRQ Scores
The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Leon presents a moderate aggregate risk driven by broad host-level tool authority, high blast radius, and zero documented defense controls on the vendor default.
Leon sits in the Exposed Giants quadrant because the agent's broad tool execution authority and high blast radius are not offset by any vendor-documented defense controls.
The scores below reflect architectural exposure on the vendor-documented default configuration, anchored on vendor documentation and the open-source repository.
| Metric | Score | Comments |
|---|---|---|
| AIRQ Score | 3 | Moderate aggregate risk: a high blast radius and absent defense controls combine with a moderate attack surface. |
| Blast Radius | 6.38 / 10 | Shell, file system, network, and credential access all reach operator-level privilege on the host; only deployment tooling is absent. |
| Attack Surface | 5.64 / 10 | Eight of ten surfaces carry high exposure on the documented default; orchestration and inter-agent score lower due to single-instance design. |
| Defense Controls | 0 / 15 | No input guardrails, execution isolation, action controls, output guardrails, or monitoring documented for the default deployment. |
3 Attack Surface
Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The dominant exposures are the unfiltered multi-channel input surface, the auto-loaded persistent memory, and a community skill marketplace whose installed code inherits full tool authority.
Most surfaces cluster at high exposure, reflecting broad default capabilities with no vendor-documented filtering, isolation, or approval controls.
The table below ties each surface to its strongest vendor-documented evidence anchor and identifies the architectural condition driving the base band.
| Surface | Score | Comments |
|---|---|---|
| User Input | 3 / 4 | Leon accepts natural-language input from a web UI, HTTP API query endpoint, WebSocket connections, and voice input transcribed through its speech-to-text (STT) bridges. No prompt shield, input sanitizer, or injection detection layer is documented for any channel, leaving the agent exposed to the prompt injection patterns cataloged in class-level taxonomies. The HTTP API relies on a single static API key for access control across every module action endpoint. [6][9][1] |
| External Data | 3 / 4 | Built-in web_search and read_url_content tools fetch content from arbitrary external URLs and inject it into the prompt context without sanitization or content-type restrictions. Fetched web content is processed by the same reasoning loop that handles trusted operator input, creating an indirect injection surface through any page the agent is directed to read. [4][7] |
| Memory | 3 / 4 | Leon implements a layered persistent memory system with short-term, long-term, and contextual tiers that persist across sessions and accept automated writes from the reasoning loop. No integrity verification, signing, or access control protects stored memory entries from tampering through prompt injection or direct file modification. [8][7] |
| Reasoning | 3 / 4 | The reasoning loop delegates to interchangeable external LLM providers with no model-level guardrails, system prompt hardening, or reasoning-chain validation documented by the vendor. Model selection is a runtime configuration choice with no security differentiation between providers, and the architecture is susceptible to the prompt-driven command-and-control techniques documented in adversarial AI threat landscapes. [8][7][2] |
| Planning | 3 / 4 | Leon's agent mode enables autonomous multi-step planning where the model generates and executes action sequences without per-step operator approval. The planning loop can chain tool calls including shell execution, file writes, and web requests in a single autonomous session with no built-in circuit breaker, matching the autonomous-agent risk patterns described in class-level threat frameworks. [8][7][3] |
| Tool Execution | 3 / 4 | The run_command tool provides direct OS-native shell access at the privilege level of the server process. File system tools operate on the home directory scope, and no tool allowlist, denylist, or permission boundary constrains which tools the reasoning loop may invoke during a session. [6][9] |
| Orchestration | 2 / 4 | Leon operates as a single-instance server with a linear skill-dispatch architecture. No multi-agent orchestration protocol, message bus, or centralized coordination layer is documented. The skill system provides modular dispatch but not cross-agent routing or federated execution, which limits the orchestration attack surface. [7][8] |
| Inter-Agent | 2 / 4 | No documented inter-agent communication protocol, delegation mechanism, or agent-to-agent trust boundary exists in the current release. MCP server support is documented but scoped to tool-provider integration, not peer-agent message exchange, which limits the inter-agent attack surface to external tool providers. [9][7] |
| Output Processing | 3 / 4 | Leon emits output through the web UI, HTTP API response bodies, WebSocket streams, and text-to-speech synthesis without any documented filtering or redaction layer. No data-loss prevention or URL sanitization exists for the default deployment, so model-generated content reaches the operator and downstream API consumers in raw form. [4][7] |
| Configuration | 3 / 4 | Leon auto-loads skills from the skills directory at startup, and the dynamic load_skill tool enables runtime skill loading without restart. The skills marketplace provides community-published code that inherits full tool authority. No skill signing, sandboxing, or capability restriction is documented for loaded skills. [10][12] |
The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session. Together, these three conditions turn an isolated prompt injection into full-chain exfiltration. On the documented default, a single injected instruction reaching the reasoning loop can read persistent memory and host credentials through shell access, then transmit them externally through web_search, read_url_content, or any shell-invoked network utility without crossing a vendor-documented control boundary.
Leon exhibits all three of these conditions in its documented default configuration:
- Untrusted input — Leon processes untrusted content from web UI text, HTTP API payloads, voice transcription, and web-fetched content from arbitrary URLs, all flowing into the same prompt context with no filtering layer. [6][9]
- Sensitive data — Leon accesses sensitive operator data through OS-native shell execution at operator privilege, file system tools scoped to the home directory, and layered persistent memory that accumulates conversation history across sessions. [9][8]
- External egress — Leon transmits data externally through web_search and read_url_content outbound HTTP requests, HTTP API response channels, and any shell command capable of network access without outbound data-loss prevention or URL allowlisting. [7][9]
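The way the three conditions accumulate within one session can be sketched as a small tracker. This is an illustrative sketch only: the tool names come from this report, but the mapping of tools to conditions and the `TrifectaTracker` class are assumptions, not part of Leon.

```python
from dataclasses import dataclass, field

# Assumed mapping of documented tool names to trifecta conditions.
# The grouping is illustrative; a real deployment would derive it from policy.
UNTRUSTED_SOURCES = {"web_search", "read_url_content"}
SENSITIVE_DATA_TOOLS = {"read_file", "run_command"}
EGRESS_TOOLS = {"web_search", "read_url_content", "run_command"}


@dataclass
class TrifectaTracker:
    """Tracks which trifecta conditions one session has met so far."""

    conditions: set = field(default_factory=set)

    def record(self, tool_name: str) -> bool:
        """Record one tool call; return True once all three conditions hold."""
        if tool_name in UNTRUSTED_SOURCES:
            self.conditions.add("untrusted_input")
        if tool_name in SENSITIVE_DATA_TOOLS:
            self.conditions.add("sensitive_data")
        if tool_name in EGRESS_TOOLS:
            self.conditions.add("external_egress")
        return len(self.conditions) == 3
```

Under this assumed mapping, two tool calls in one session are enough to complete the trifecta, which is why a per-session check is more useful than a per-tool check.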
4 Blast Radius
The blast radius is what an attacker who controls the agent can reach: which systems they touch, which credentials they read, and which actions they take without operator approval. Compromise of the reasoning loop grants the attacker the same shell, file system, network, and credential access the operator holds on the host; the only limiting factor is that no dedicated deployment tooling ships with the agent.
Four of six factors reach the high-exposure band, reflecting direct host-level access through the documented tool set on the default deployment.
The table below maps each blast factor to its enabling tool; most factors trace back to the same run_command shell and the file system I/O tools.
| Factor | Score | Comments |
|---|---|---|
| Code execution | 3 / 4 | The run_command tool executes arbitrary shell commands at the privilege level of the server process with no sandbox, chroot, or container boundary. Any instruction reaching the reasoning loop can invoke OS-native code execution on the host through documented tool invocation. [6][9] |
| File system access | 3 / 4 | The read_file, write_file, and edit_file tools provide read-write access to the home directory scope. No path allowlist, chroot jail, or mandatory access control constrains file operations, enabling modification of configuration files, scripts, and dotfiles within the operator home tree. [4][9] |
| Network access | 3 / 4 | Outbound HTTP connectivity is available through both dedicated browsing tools and raw shell access, with no domain allowlist, egress firewall, or rate limiting applied to any path. The run_command tool can invoke curl, wget, ssh, or any network utility the operator shell session can reach. [4][7] |
| Credential access | 3 / 4 | The .env file storing API keys for LLM providers and third-party services is accessible through shell execution. Any credential, token, or secret file readable by the operator user account is reachable through the run_command or file system tools. [6][9] |
| Autonomous action | 2 / 4 | Agent mode enables multi-step autonomous execution chains, and the runtime skill-loading capability means the effective tool set can expand during a session. No dedicated workflow-triggering or webhook-firing capability beyond raw shell and HTTP access is documented, but the dynamic load_skill tool could introduce additional mutation surfaces at runtime. [8][12] |
| Deployment access | 1 / 4 | Leon does not include dedicated CI/CD integration, infrastructure-as-code tooling, or container orchestration capabilities in the default skill set. Deployment-level access is theoretically reachable through raw shell commands but requires operator-specific tooling not shipped with the agent. [7] |
5 Defense Controls
Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor-documented default ships with no input guardrails, execution isolation, action approval gates, output filtering, or monitoring infrastructure, leaving every defense component at the lowest band.
The operator inherits every risk from the attack surface and blast radius assessments with no vendor-provided mitigation on the default deployment.
The table below identifies the documented posture for each defense component and the evidence confirming the absence of each control.
| Component | Score | Comments |
|---|---|---|
| Input Guardrails | 0 / 3 | No input validation, prompt shield, injection detection, or content-type filtering is documented for any input channel. Vendor documentation describes direct passthrough of user text to the reasoning loop without intermediate processing. No vendor-provided opt-in filtering option is documented for the current release. [5][9] |
| Execution Isolation | 0 / 3 | No sandbox, container, chroot, or privilege-separation boundary is documented for tool execution. The run_command tool executes at the same privilege level as the server process. Vendor documentation confirms direct host-level execution with no isolation tier on the default deployment. [6][9] |
| Action Controls | 0 / 3 | No per-action approval gate, confirmation prompt, tool allowlist, or capability restriction is documented for the default configuration. Agent mode executes autonomously without operator consent per action, and no bypass flag is needed because approval is absent by default. [8][9] |
| Output Guardrails | 0 / 3 | No output filtering, content moderation, data-loss prevention, PII redaction, or URL sanitization is documented for any output channel. Model-generated output reaches the web UI, HTTP API, and TTS pipeline without intermediate inspection or redaction controls. [4][9] |
| Monitoring | 0 / 3 | No logging infrastructure, audit trail, anomaly detection, or SIEM integration is documented for the default deployment. The vendor blog and repository do not describe observability features for tool invocations, memory writes, or outbound network requests. Confidence is approximate rather than verified because internal logging may exist undocumented; the open issue tracker contains no monitoring-related feature requests. [8][7][11] |
6 Hardening Tips
This section lists concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Every hardening recommendation below is an operator-added layer; the vendor-documented default ships none of these controls, so each tip directly counters a specific finding from the risk and blast assessments.
Input Guardrails
Input guardrails intercept adversarial content before it reaches the reasoning loop.
- Policy: Require prompt injection detection on all inbound channels before the reasoning loop. This counters the absence of input validation on the web UI, HTTP API, and voice transcription surfaces.
- Configuration: Restrict the HTTP API to localhost-only binding or add mutual TLS authentication. This counters reliance on a single static API key as the only access control for all module action endpoints.
- Engineering: Deploy content-type validation and payload-length middleware on the HTTP API query endpoint. This counters prompt injection through oversized or binary payloads reaching the reasoning loop via the API, whose only gate is the static key.
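The content-type and payload-length check from the engineering tip could look roughly like the sketch below. It is framework-agnostic and not Leon's API: the function name, the 4096-byte limit, and the JSON-only allowlist are all assumptions to adapt to the deployment. It rejects malformed requests only; it does not detect prompt injection inside well-formed text.

```python
MAX_QUERY_BYTES = 4096  # assumed limit; tune for the deployment
ALLOWED_CONTENT_TYPES = {"application/json"}  # assumed allowlist


def validate_request(content_type: str, body: bytes) -> tuple[bool, str]:
    """Return (accepted, reason) for one inbound HTTP API request."""
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ALLOWED_CONTENT_TYPES:
        return False, "unsupported content type"
    if len(body) > MAX_QUERY_BYTES:
        return False, "payload too large"
    try:
        # Binary payloads that are not valid UTF-8 never reach the model.
        body.decode("utf-8")
    except UnicodeDecodeError:
        return False, "body is not valid UTF-8"
    return True, "ok"
```

A reverse proxy or middleware layer would call this before forwarding the request and return an HTTP 4xx response on rejection.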
Execution Isolation
Execution isolation contains what a compromised agent can do on the host.
- Policy: Mandate that the Leon server process runs inside a container or dedicated VM with minimal host access. This counters the absence of any sandbox or privilege-separation boundary for shell execution.
- Configuration: Apply a seccomp profile or AppArmor policy to restrict system calls available to the server process. This counters operator-level shell execution privilege on the host.
- Engineering: Configure a non-root service account with minimal file system permissions for the server daemon. This counters the default operator-privilege execution model.
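One way to combine these three tips is to launch the server through a container runtime with restrictive flags. The sketch below only assembles a `docker run` command line; the image name, data directory, UID, and port 1337 are assumptions, and the report does not document an official Leon container image.

```python
import shlex
from typing import Optional


def hardened_docker_args(image: str, data_dir: str, port: int = 1337,
                         seccomp_profile: Optional[str] = None) -> list[str]:
    """Build a docker run command line with restrictive defaults."""
    args = [
        "docker", "run", "--rm",
        "--user", "10001:10001",                 # non-root service account (assumed UID)
        "--read-only",                           # read-only root file system
        "--tmpfs", "/tmp",                       # writable scratch space only
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block setuid escalation
        "--pids-limit", "256",                   # bound process spawning
        "--volume", f"{data_dir}:/data",         # single writable data mount
        "--publish", f"127.0.0.1:{port}:{port}", # loopback-only exposure
    ]
    if seccomp_profile:
        # Operator-supplied seccomp profile restricting system calls.
        args += ["--security-opt", f"seccomp={seccomp_profile}"]
    return args + [image]


if __name__ == "__main__":
    # Hypothetical image name and data path, for illustration only.
    print(shlex.join(hardened_docker_args("leon-server:local", "/srv/leon-data")))
```

Shell commands issued through run_command would then execute inside the container, so their reach is limited to the mounted data directory and whatever network egress the container is allowed.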
Action Controls
Action controls govern which tools and actions the agent can invoke autonomously.
- Policy: Implement a tool allowlist restricting which tools the reasoning loop may invoke without explicit operator sign-off. This counters unrestricted tool invocation in the default agent mode.
- Configuration: Disable agent mode or add a confirmation prompt for destructive actions including shell commands, file writes, and network requests. This counters autonomous execution without operator review.
- Engineering: Build a wrapper around the run_command and write_file tool handlers that enforces a per-invocation approval callback before execution. This counters the absence of per-action consent in the default tool dispatch.
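The allowlist and approval-callback ideas can be sketched as a generic wrapper around a tool handler. Everything here is hypothetical scaffolding: `guard_tool`, `ApprovalDenied`, and the allowlist contents are illustrative names, and the real handler signatures in Leon may differ.

```python
from typing import Any, Callable

# Assumed policy: tools that may run without operator sign-off.
ALLOWED_WITHOUT_APPROVAL = {"read_file", "web_search"}


class ApprovalDenied(Exception):
    """Raised when the operator callback rejects a tool call."""


def guard_tool(name: str, handler: Callable[..., Any],
               approve: Callable[[str, dict], bool]) -> Callable[..., Any]:
    """Wrap a tool handler so gated tools need per-invocation approval."""

    def guarded(**kwargs: Any) -> Any:
        # Tools outside the allowlist run only if the callback approves
        # this specific invocation and its arguments.
        if name not in ALLOWED_WITHOUT_APPROVAL and not approve(name, kwargs):
            raise ApprovalDenied(f"{name} rejected by operator")
        return handler(**kwargs)

    return guarded
```

The `approve` callback is where the deployment plugs in its consent mechanism, for example a confirmation prompt in the web UI. Failing closed (raising on rejection) keeps a denied call from silently proceeding.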
Output Guardrails
Output guardrails inspect what the agent sends to other systems and users.
- Policy: Deploy a data-loss prevention filter on the HTTP API response path to redact sensitive tokens, credentials, and PII before output reaches consumers. This counters the absence of output filtering.
- Configuration: Configure an output content-type allowlist on the HTTP API response path to restrict emitted formats to plain text. This counters arbitrary content injection through unfiltered API responses.
- Engineering: Add content moderation on text-to-speech output to prevent the agent from speaking sensitive information aloud. This counters the unfiltered TTS pipeline.
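A minimal redaction pass for the response and TTS paths might look like the sketch below. The patterns are illustrative assumptions and far from exhaustive; a production data-loss prevention filter would need a broader, tested rule set.

```python
import re

# Illustrative patterns only; each pair is (compiled regex, replacement).
REDACTION_PATTERNS = [
    # key=value secrets, e.g. lines echoed from a .env file
    (re.compile(r"(?i)\b([A-Z0-9_]*(?:API_KEY|TOKEN|SECRET|PASSWORD))\s*=\s*\S+"),
     r"\1=[REDACTED]"),
    # Bearer tokens in echoed HTTP headers
    (re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [REDACTED]"),
    # Email addresses as a simple PII example
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Apply every redaction pattern to model output before it is emitted."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

The same function can sit in front of both the HTTP response body and the text handed to speech synthesis, so one rule set covers both output channels.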
Monitoring
Monitoring captures what the agent did and surfaces anomalies for review.
- Policy: Forward server logs to a centralized SIEM with alerts on shell command invocations, file writes, and outbound network requests. This counters the absence of any audit trail.
- Configuration: Implement tool-invocation logging with timestamps, arguments, and return values for forensic analysis. This counters the lack of observability into agent behavior between interactions.
- Engineering: Add memory-write audit logging to track what information the reasoning loop persists across sessions. This counters the risk that an attacker who poisons memory entries can persist access across sessions undetected.
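Tool-invocation and memory-write logging can both be sketched as one decorator that emits a JSON record per call. The logger name `leon.audit` and the `audited` decorator are hypothetical; Leon's documented release does not describe such a hook, so this would wrap handlers in an operator-maintained layer.

```python
import functools
import json
import logging
import time

audit_log = logging.getLogger("leon.audit")  # hypothetical logger name


def audited(tool_name):
    """Decorator that logs one JSON record per tool invocation."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(**kwargs):
            record = {"ts": time.time(), "tool": tool_name, "args": kwargs}
            try:
                result = handler(**kwargs)
                record["status"] = "ok"
                # Truncate the return value so logs stay bounded in size.
                record["result"] = repr(result)[:200]
                return result
            except Exception as exc:
                record["status"] = "error"
                record["error"] = repr(exc)
                raise
            finally:
                # Emitted on success and failure alike.
                audit_log.info(json.dumps(record, default=str))

        return wrapper

    return decorator
```

The logger needs a handler and an INFO level configured before records appear. Attaching a handler that ships JSON lines to the SIEM would let the alerts from the policy tip key on the `tool` field. Arguments are logged verbatim here, so a redaction step may be needed before forwarding.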
7 References
This section lists the evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7][9]) refer to entries below, numbered in list order.
Selected Vulnerabilities
- [1] OWASP Top 10 for LLM Applications. Class-level risk taxonomy for prompt injection and tool abuse in LLM-powered agents.
Selected Research
- [2] MITRE ATLAS. Adversarial threat landscape for AI systems covering prompt-driven command-and-control techniques.
- [3] OWASP Top 10 for Agentic Applications. Class-level coverage of unauthorized agent access and tool misuse in autonomous AI agents.
Vendor Documentation
- [4] Leon Official Documentation. Vendor docs describing agent architecture and deployment model.
- [5] Leon Architecture. Architecture overview of client-to-NLU-to-brain execution flow and TTS/STT pipeline.
- [6] Leon HTTP API. HTTP API docs showing query and action endpoints with API key auth.
- [7] Leon GitHub Repository. Source repo documenting tools, memory, agentic execution, and skills architecture.
- [8] Leon Road to 2.0. Blog post on memory, context, agentic loop, and multi-provider LLM support.
- [9] leonai PyPI Package. PyPI package listing built-in tools including run_command and file I/O.
- [10] Leon Vendor Homepage. Vendor homepage confirming the open skill marketplace and auto-loading architecture that drives the Configuration surface score.
Other Sources
- [11] Leon GitHub Issues. Issue tracker with no security-labeled issues at time of assessment.
- [12] Leon Skills Directory. Repository listing 25+ built-in skills for search, productivity, and system utilities.