Perplexity Computer Agent Security Risks

Computer Agents perplexity.ai Fortified Leaders
AI RISK QUADRANT POSITION DEFENSE CONTROLS (9) ATTACK SURFACE (5.75) EXPOSED GIANTS FORTIFIED LEADERS HUMBLE PROVIDERS TIGHT OPERATORS
AIRQ Score
6.2
Medium
Attack Surface
5.75
High
Blast Radius
5.38
High
Defense Controls
9
Medium
About The Agent

Perplexity Computer is a cloud-hosted multi-model orchestration platform that routes user workflows across nineteen specialized AI models while executing code, browsing the web, and connecting to external services through four hundred plus OAuth connectors within ephemeral Firecracker microVM sandboxes. The Personal Computer extension on Mac brings the same agentic capabilities to the local desktop with always-on daemon operation and persistent cross-session memory.

About the AI Risk Quadrant

Fortified Leaders agents combine a moderate-to-high attack surface with a moderate blast radius constrained by documented isolation controls. The risk shape emerges when strong execution sandboxing limits damage propagation but absent output-layer controls leave the exfiltration channel open once the reasoning loop is compromised through any of the wide input surfaces.

1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Perplexity Computer presents broad input ingestion risk offset by strong execution isolation, with absent output guardrails creating the primary exfiltration channel on default configuration.

Key Input Risks
Perplexity Computer ingests web-browsed pages, email bodies, calendar events, and documents from four hundred plus OAuth connectors on the default configuration. BrowseSafe ML detection scans content in parallel but was bypassed via indirect prompt injection in calendar invite ICS content [3][4].
Key Execution Risks
The agent executes arbitrary Python, JavaScript, and SQL inside Firecracker microVMs with dedicated kernels and seccomp filters; Trail of Bits audited the boundary. Independent reverse engineering confirmed no credentials are visible inside the sandbox environment [9][12].
Key Action Risks
Sub-agent spawning, code execution, web browsing, and long-running background tasks proceed autonomously without per-step approval on the default configuration. The confirm_action gate covers send, delete, and purchase operations but does not restrict connector-initiated workflows [10].
Key Output Risks
The agent emits text responses, generated files, and integration writes through connected services with no documented content-level DLP or credential redaction. CVE-2025-50708 and CVE-2025-50709 confirmed that session tokens and sensitive parameters leak through shared URLs without filtering [1][2].
Key Monitoring Risks
Enterprise audit logs with SIEM integration are available but gated behind the Enterprise tier; consumer and Pro subscribers have basic activity history only without structured audit trails or anomaly alerting [8].

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Perplexity Computer scores a moderate composite AIRQ of 4.00, reflecting strong isolation offset by absent output controls.

AIRQ Metrics

The agent lands in Fortified Leaders due to elevated attack surface above the midpoint threshold combined with blast radius held below the high-risk ceiling by sandbox isolation.

The four headline metrics capture the aggregate risk posture across all assessed dimensions.

Metric Score Comments
AIRQ Score 6.2 Moderate risk composite driven by isolation strengths offset against output-layer gaps.
Blast Radius 5.38 / 10 Firecracker sandbox caps code, filesystem, and network factors; autonomous action dominates.
Attack Surface 5.75 / 10 Confirmed token exposure CVE anchors output processing above base; remaining surfaces carry architectural bands.
Defense Controls 9 / 15 Execution isolation scores maximum; absent output guardrails create the primary defense gap.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. The attack surface spans ten assessed dimensions with output processing elevated by CVE-2025-50708 evidence and all connector-facing surfaces carrying base-3 architectural exposure.

Attack Surface Metrics

Scores range from 2 (scoped execution with documented isolation) to 4.5 (CVE-confirmed token exposure on output path).

Each row scores the surface exposure, cites supporting evidence, and describes the specific architectural risk.

Surface Score Comments
User Input 3 / 4 Multiple input channels including web UI, Mac desktop daemon, voice, and four hundred plus connectors with BrowseSafe ML scanning external content before action [7][8].
External Data 3 / 4 Ingests web pages, emails, calendar events, and documents from connected services; PleaseFix demonstrated ML bypass via crafted calendar content [3][4][8].
Memory 3 / 4 Persistent three-level cross-session memory stores user preferences and project context without documented integrity verification on writes [8].
Reasoning 3 / 4 Multi-model architecture delegates reasoning to nineteen interchangeable external LLMs with routing decided by the orchestration layer [8].
Planning 3 / 4 Autonomous task decomposition spawns sub-agents without human review of the generated plan; sub-agents spawn additional sub-agents [10].
Tool Execution 2 / 4 Full Python, JavaScript, and SQL execution inside Firecracker microVM with scoped filesystem and egress proxy; no raw credentials reach sandbox [9].
Orchestration 3 / 4 Spawns sub-agents for parallel execution, runs background tasks for hours or months, and operates as a daemon on Personal Computer [10].
Inter-Agent 3 / 4 Connects to external services via four hundred plus OAuth connectors and MCP protocol; hidden MCP API in Comet extensions enabled system-level commands without user consent [8][13].
Output Processing 4.5 / 4 CVE-2025-50708 (CVSS 7.5) demonstrated token exposure in shared chat URLs; no content-level DLP documented on output path [1].
Configuration 2 / 4 Admin-configurable connectors and model restrictions via enterprise controls; UXSS via externally_connectable wildcard demonstrated cross-origin config exposure [5][8][11].

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Perplexity Computer ingests untrusted web and connector content, accesses OAuth-scoped sensitive records, and emits data through email, messaging, and browser egress channels on the default configuration.

Lethal Trifecta · Complete (3 of 3)

Perplexity Computer exhibits all three of these conditions in its documented default configuration:

  • Untrusted input — Web-browsed pages, calendar events via Gmail and Outlook connectors, and documents from four hundred plus OAuth-connected services reach the reasoning loop [8].
  • Sensitive data — OAuth-scoped connectors grant read access to email, files on Google Drive and Notion, customer records in Salesforce, and databases via Snowflake [8].
  • External egress — Default egress includes sending email via Gmail and Outlook connectors, posting messages via Slack and Discord, and browser navigation to arbitrary URLs [7][10].

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. The blast radius is constrained by Firecracker microVM isolation capping code, filesystem, and network factors while autonomous action capability drives the dominant risk.

Blast Radius Metrics

Scores range from 2 (scoped sandbox with documented boundaries) to 3 (autonomous multi-hour operation without mandatory approval gates).

Each row scores the damage potential for one impact factor given confirmed isolation controls and operational defaults.

Factor Score Comments
Code execution 2 / 4 Scoped shell inside Firecracker microVM with dedicated kernel, seccomp filters, and cgroup limits; no documented path to host [9].
File system access 2 / 4 Read-write scoped to ephemeral sandbox filesystem via FUSE mount; session data resets on completion [9].
Network access 2 / 4 Domain-restricted outbound via egress proxy; proxy injects credentials externally without exposing tokens to sandbox [9].
Credential access 2 / 4 OAuth tokens stored server-side; no secrets observed during sandbox enumeration; backend handles all token exchange externally [12].
Autonomous action 3 / 4 Tasks run for hours or months with optional check-ins; confirm_action gates sensitive operations and enterprise admins can restrict sub-agent spawning [10].
Deployment access 2 / 4 Can trigger deploys via Vercel and AWS connectors with confirm_action approval gate; no direct infrastructure modification [8].

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Defense controls show strong execution isolation at maximum score with confident evidence, input guardrails and action controls at moderate band, and a complete absence of output-layer protection.

Defense Controls Metrics

Scores range from 0 (no documented output guardrails) to 3 (independently verified Firecracker microVM isolation).

Each row scores the defense maturity, cites the confidence tier, and describes the specific control mechanism documented.

Component Score Comments
Input Guardrails 2 / 3 BrowseSafe MoE classifier scans external content with four-layer defense architecture; Trail of Bits audited; continuously updated from bug bounty findings [6][7].
Execution Isolation 3 / 3 Firecracker microVM with dedicated kernel, seccomp, jailer process, cgroup limits, and network isolation; independently verified by reverse engineering [9][12].
Action Controls 2 / 3 confirm_action tool gates email send, message post, purchase, and delete; admin connector enable and disable per-org; no single-step bypass documented [7].
Output Guardrails 0 / 3 No documented content-level DLP, credential redaction, or URL sanitization on the output path; egress proxy restricts destinations but does not inspect content [7].
Monitoring 2 / 3 Enterprise audit logs with Splunk, Azure Sentinel, and Datadog integration; Panther SIEM internally; consumer tier has basic activity history only [8].

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Hardening recommendations target the absent output guardrails as the primary gap, strengthen input controls against demonstrated bypasses, and extend monitoring to consumer tiers.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Input Guardrails
  • Policy Restrict connector-ingested content types to plain text formats where workflow allows to reduce untrusted content reaching the reasoning loop.
  • Configuration Enable BrowseSafe strict mode and configure detection sensitivity thresholds upward to counter prompt injection bypass demonstrated in PleaseFix research.
  • Engineering Deploy an upstream content inspection proxy between connectors and the agent to scan calendar events and email attachments before ingestion.

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Execution Isolation
  • Policy Run Computer tasks in a dedicated VPC with additional network segmentation beyond the default Firecracker microVM isolation boundary.
  • Configuration Restrict sandbox filesystem writes to operator-designated paths rather than allowing full ephemeral write access to counter file-based persistence.
  • Engineering Limit per-session resource allocation and enforce timeout ceilings below the platform maximum to counter runaway autonomous execution.

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Action Controls
  • Policy Require explicit approval for all connector write operations beyond the default confirm_action set to gate autonomous sub-agent actions.
  • Configuration Disable unused connectors at the organization level through admin controls to reduce blast radius from dormant OAuth grants.
  • Engineering Require explicit approval gates for sub-agent creation and spawning operations to prevent unbounded autonomous delegation chains.

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Output Guardrails
  • Policy Deploy a content inspection layer on the egress proxy that scans outbound payloads for credentials, PII, and sensitive tokens.
  • Configuration Configure URL allow-listing on browser navigation targets and restrict link rendering in agent responses to counter exfiltration.
  • Engineering Implement content inspection on all agent-generated shared URLs scanning for session tokens, API keys, and PII before distribution.

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Monitoring
  • Policy Forward all agent audit logs to a centralized SIEM with automated alerting rules for anomalous connector access patterns.
  • Configuration Enable continuous session recording for Computer tasks that access sensitive connectors to counter blind spots in consumer-tier history.
  • Engineering Monitor for connector-specific abuse patterns and memory write anomalies to detect agent hijacking during long-running autonomous tasks.

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

  1. CVE-2025-50708 Token exposure in shared chat URL allows remote attacker to obtain sensitive session information without authentication (CVSS 7.5 HIGH). Published July 2025.
  2. CVE-2025-50709 Sensitive information leak via GET parameter allows authenticated attacker to obtain data through request parameters (CVSS 4.3 MEDIUM). Published September 2025.

Selected Research

  1. PleaseFix vulnerability family disclosure Zenity Labs disclosed three distinct exploit paths against Perplexity Comet including file exfiltration, credential theft via 1Password workflow manipulation, and full account takeover. Demonstrated March 2026.
  2. PerplexedBrowser zero-click file theft Zenity Labs demonstrated zero-click local file exfiltration via calendar invite indirect prompt injection on Perplexity Comet agentic browser. Rated P1 Critical on Bugcrowd. Patched February 2026.
  3. Perplexity Comet UXSS via extension Hacktron demonstrated one-click universal cross-site scripting via externally_connectable wildcard and DOMSnapshot access to file and cross-origin data. Patched August 2025 within 24 hours. Bounty of six thousand dollars awarded.
  4. BrowseSafe prompt injection detection Perplexity research publication documenting their open-source MoE-based prompt injection detection model for browser agents with defense-in-depth architecture.

Vendor Documentation

  1. How We Built Security Into Computer Vendor security architecture documentation covering Firecracker microVM sandboxing, OAuth connector data handling, BrowseSafe prompt injection defense, and enterprise governance controls. Published May 2026.
  2. Perplexity Trust Center Vendor trust center documenting SOC 2 Type II certification, GDPR compliance, AWS IAM access controls, SSO with MFA, JIT access policies, and Panther SIEM monitoring.
  3. Sandbox API documentation Vendor documentation of isolated code execution service describing Kubernetes pod isolation, FUSE filesystem, egress proxy with credential injection, and resource limits.
  4. Introducing Perplexity Computer Product launch announcement describing multi-model orchestration across nineteen models, sub-agent architecture, persistent memory, and four hundred plus connectors. Published February 2026.

Other Sources

  1. CVE-2025-0599 architectural analysis Deep analysis of CORS misconfiguration in Perplexity macOS Comet feature covering architectural risks of local web server binding and broader desktop AI agent security implications.
  2. Reverse engineering Perplexity Computer sandbox Independent reverse engineering confirming three-layer architecture (cloud backend, E2B Linux sandbox, separate cloud browser), no visible credentials in sandbox, and forty-plus built-in tools.
  3. Hidden MCP API in Comet extensions SquareX research exposed hidden MCP API in undisclosed Comet browser extensions allowing system-level commands without user consent. Perplexity silently patched after publication November 2025.