1 Key Risks
The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Decagon presents a trifecta-complete attack surface where untrusted customer input, sensitive CRM data, and outbound communication channels converge without operator-visible filtering on the default configuration.
2 AIRQ Scores
The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Decagon's composite risk score reflects the tension between a trifecta-elevated attack surface and constrained blast radius moderated by vendor-documented defenses.
Decagon's attack surface of 4.80, blast radius of 2.50, and defense controls at 8 place it in the Tight Operators quadrant, meaning operators can deploy with targeted input-layer hardening rather than architectural redesign.
Each axis contributes independently: Attack Surface scales to 10, Blast Radius to 10, Defense Controls to 15, and the AIRQ composite normalizes across all three to 15.
| Metric | Score | Comments |
|---|---|---|
| AIRQ Score | 2.67 | Low composite indicates that constrained blast radius and vendor defenses substantially offset the trifecta-elevated attack surface for hardening-ready operators. |
| Blast Radius | 2.5 / 10 | Blast concentrates in network egress to integrations, OAuth credential scope, and autonomous CRM writes with no code execution or deployment reach. |
| Attack Surface | 4.8 / 10 | All three conditions for the X-axis floor are met: untrusted customer input, sensitive CRM data access, and outbound email egress confirmed on default configuration. |
| Defense Controls | 8 / 15 | Vendor documents execution isolation and action controls at the managed-SaaS tier but publishes no input-guardrail schema or output-filtering specification. |
3 Attack Surface
Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Decagon's reasoning loop ingests customer messages, CRM records, and persistent memory context as first-class input across chat, email, and voice channels.
Higher scores indicate surfaces where attacker-authored bytes reach the reasoning loop with less filtering; memory scores highest due to cross-session persistence without integrity controls.
Each row maps one input surface to its adjusted score and a one-line comment citing the documented behavior that determines exposure level.
| Surface | Score | Comments |
|---|---|---|
| User Input | 2 / 4 | Customer messages from chat, email, and voice channels enter the reasoning loop as first-class prompt context with vendor-claimed bad actor detection as the sole documented filter. [2][4] |
| External Data | 2 / 4 | Knowledge base articles and CRM records are ingested via operator-configured connectors scoped to the tenant's integration set. [7] |
| Memory | 3 / 4 | Persistent User Memory writes cross-session context automatically from customer interactions without integrity verification or operator-visible poisoning controls. [6] |
| Reasoning | 2 / 4 | Multi-LLM inference with provider failover operates within the vendor's managed boundary with no published reasoning-loop constraint or chain-of-thought filtering. [8] |
| Planning | 2 / 4 | The platform selects among configured AI Actions and knowledge sources per conversation turn bounded by the operator's integration configuration. [7] |
| Tool Execution | 2 / 4 | AI Actions execute writes to connected systems within operator-configured tool boundaries with no shell, code sandbox, or browser execution surface. [7] |
| Orchestration | 2 / 4 | Multi-agent routing across specialized sub-agents uses vendor-managed orchestration with no published inter-agent authentication or message-signing. [3][10] |
| Inter-Agent | 1 / 4 | Agent-to-agent communication is internal to the platform's orchestration layer with no external MCP peer or third-party agent federation documented by default. [7] |
| Output Processing | 1 / 4 | Responses are rendered as plaintext chat, email body, or voice synthesis with PII redaction applied before delivery to external customers. [4] |
| Configuration | 1 / 4 | Platform configuration is managed through the vendor console with RBAC and SSO with no operator-facing config file, manifest, or IaC surface exposed. [4] |
The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Decagon ingests untrusted customer messages across three channels, reads sensitive CRM records and account data via OAuth integrations, and delivers bytes externally through email, chat, and voice responses.
Decagon exhibits all three of these conditions in its documented default configuration:
- Untrusted input — Untrusted end-user messages arrive via chat, email, and voice transcription, entering the reasoning loop on every conversation turn without operator-configurable filtering. [4]
- Sensitive data — OAuth-scoped integrations grant access to customer records, order histories, and support tickets stored in Salesforce and Zendesk. [7]
- External egress — Outbound email, chat responses to external customers, and CRM write-backs deliver bytes outside the operator's trust boundary without per-message approval. [6]
4 Blast Radius
The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Decagon agent reaches connected CRM integrations and outbound communication channels but cannot execute code, access filesystems, or touch operator infrastructure.
Higher blast scores indicate broader downstream damage from a single compromised conversation session reaching more connected systems or autonomous actions.
Each row maps one blast factor to its score reflecting the maximum downstream reach a compromised agent session achieves on the default configuration.
| Factor | Score | Comments |
|---|---|---|
| Code execution | 0 / 4 | No code sandbox, shell, or browser execution surface is documented; the platform operates as a managed conversation engine without operator-accessible runtime. [8] |
| File system access | 0 / 4 | No file read, write, or artifact generation capability is documented; knowledge ingestion occurs through API connectors rather than filesystem access. [7] |
| Network access | 2 / 4 | Outbound HTTP to connected integrations is scoped to operator-configured AI Actions with no arbitrary URL fetch or unrestricted web browsing documented. [7] |
| Credential access | 2 / 4 | OAuth tokens for CRM integrations and API keys for connected tools are stored within the platform's managed credential vault with short-lived JWT sessions. [4] |
| Autonomous action | 2 / 4 | AI Actions fire CRM writes, email sends, and ticket updates on conversation triggers without per-action operator approval; proactive outreach operates on schedules. [6] |
| Deployment access | 0 / 4 | No documented access to operator cloud infrastructure, IaC pipelines, or production deployment targets; the platform is a self-contained SaaS. [8] |
5 Defense Controls
Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor publishes tenant isolation and per-tool permission scopes at the managed-SaaS tier but provides no public specification for input filtering or output content classification.
Higher defense scores indicate stronger vendor-implemented safeguards that reduce operator hardening burden; lower scores indicate operator-managed or undocumented controls.
Each component is scored based on what the vendor publishes as implemented by default versus what remains operator-managed or undocumented.
| Component | Score | Comments |
|---|---|---|
| Input Guardrails | 1 / 3 | Vendor claims bad actor detection and a supervisor model but publishes no input-filtering schema, no prompt-injection classifier specification, and no operator-configurable rules. [1][4] |
| Execution Isolation | 2 / 3 | Per-tenant datastore isolation, rate limiting, and multi-region fault containment are documented with no kernel-level sandbox or network namespace specification published. [5][8] |
| Action Controls | 2 / 3 | AI Actions are operator-configured with per-tool toggles and integration-scoped permissions but no per-action human-in-the-loop approval gate on the default configuration. [7] |
| Output Guardrails | 1 / 3 | PII redaction via Google DLP integration is documented but no output-filtering schema for injected URLs, payload sanitization, or content classification is published. [4] |
| Monitoring | 2 / 3 | Tamper-protected audit logs, Watchtower QA scoring, and conversation analytics are documented but real-time alerting and SIEM forwarding are not published as defaults. [4][9] |
6 Hardening Tips
Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize breaking the trifecta by gating outbound actions and deploying input classifiers, then layering monitoring to close the detection gap.
Input Guardrails
Input guardrails intercept adversarial content before it reaches the reasoning loop.
- Policy Require manual review of all knowledge base updates before they enter the retrieval pipeline to prevent indirect prompt injection via poisoned articles.
- Configuration Enable the vendor's bad actor detection at its strictest sensitivity tier and configure custom blocked-phrase lists for known injection patterns.
- Engineering Deploy a pre-processing classifier on the ingestion path that scores inbound messages for prompt-injection probability before they reach the LLM.
Execution Isolation
Execution isolation contains what a compromised agent can do on the host.
- Policy Establish a policy requiring the vendor to disclose tenant isolation architecture changes and sandbox boundary specifications in shared-responsibility documentation.
- Configuration Configure per-tenant rate limits to their minimum viable thresholds to constrain the blast radius of any single compromised conversation session.
- Engineering Instrument the API gateway with request-level tracing to detect anomalous inference patterns that bypass the vendor's managed isolation boundary.
Action Controls
Action controls govern which tools and actions the agent can invoke autonomously.
- Policy Require human-in-the-loop approval for all AI Actions that write to production CRM records or send outbound communications to external recipients.
- Configuration Restrict AI Action scopes to read-only for all integrations during initial deployment, enabling write permissions only after operational validation.
- Engineering Build a middleware proxy between AI Action output and downstream CRM APIs that enforces field-level write permissions and rate limits.
Output Guardrails
Output guardrails inspect what the agent sends to other systems and users.
- Policy Establish a policy requiring all agent responses containing URLs or external links to pass through a URL reputation check before delivery.
- Configuration Configure Google DLP redaction rules to cover financial identifiers, health records, and authentication tokens beyond the default PII entity set.
- Engineering Deploy a post-processing content classifier that scans agent responses for injected payloads and suspicious URL patterns before customer delivery.
Monitoring
Monitoring captures what the agent did and surfaces anomalies for review.
- Policy Require weekly review of Watchtower QA scores with escalation triggers for conversations scoring below the established quality threshold.
- Configuration Configure webhook-based alerting on audit log events matching known attack signatures including repeated system prompt extraction attempts.
- Engineering Forward all agent conversation logs and audit events to a centralized SIEM with correlation rules detecting prompt-injection campaigns across sessions.
7 References
The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.
Selected Vulnerabilities
- Decagon Responsible Disclosure Program The vendor operates a responsible disclosure program at security@decagon.ai but has published no advisories to date, and no public CVEs exist for the platform.
Selected Research
- OWASP Top 10 for LLM Applications Industry framework cataloging prompt injection, excessive agency, system prompt leakage, and sensitive information disclosure risks applicable to conversational AI agents processing untrusted customer input.
- OWASP Top 10 for Agentic Applications Framework covering agent goal hijack, tool misuse, identity and privilege abuse, memory poisoning, and insecure inter-agent communication for tool-using autonomous agents.
Vendor Documentation
- Decagon Security and Trust Vendor security page documenting RBAC, SSO integration with Okta and Microsoft Entra, short-lived JWT tokens, AES-256 encryption, PII redaction via Google DLP, bad actor detection, supervisor model, Watchtower QA, and tamper-protected audit logs.
- Decagon DPA Security Annex Data processing agreement annex confirming SOC 2 Type II certification with Security Trust Service Criteria, annual audit cadence, and data minimization under shared responsibility model.
- Proactive Agents Launch Product announcement documenting persistent User Memory with cross-session and cross-channel context, outbound voice capabilities, and Agent Workbench debugging assistant.
- Decagon Integrations Platform integrations page documenting CRM sync with Salesforce and Zendesk Sunshine, AI Actions for customer-facing writes, email routing, MCP support, and self-serve API tool connections.
- Decagon Infrastructure Resilience Engineering blog documenting multi-region cloud infrastructure, per-tenant throttling and datastore rate limits, multi-LLM provider failover, and fault-isolated subsystem architecture on Google Cloud.
Other Sources
- Decagon AI Complete Guide Independent third-party review noting basic user roles per G2 reviewers, shallow audit log visibility, opaque AI decision layer, SOC 2 Type II confirmation, and HIPAA BAA availability.
- The Agent Control Plane Venture research note identifying Decagon as an enterprise customer-experience agent platform trusted by Hertz, Chime, and Duolingo across chat, email, and voice channels.