1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Decagon presents a trifecta-complete attack surface where untrusted customer input, sensitive CRM data, and outbound communication channels converge without operator-visible filtering on the default configuration.

Key Input Risks

Customer messages from chat widgets, inbound email, and voice transcription deliver attacker-authored bytes directly into the reasoning loop without operator-visible input filtering. Operators should evaluate enabling stricter bad actor detection tiers or deploying a pre-processing classifier since no operator-configurable allow/deny rule exists by default.

Key Execution Risks

The platform executes vendor-managed LLM inference with multi-model failover but publishes no sandbox boundary specification or kernel-level tenant isolation documentation. No independent red-team audit of the execution boundary has been publicly disclosed.

Key Action Risks

AI Actions write to CRM records, send outbound emails, and sync with Salesforce and Zendesk integrations without per-action operator approval on the default configuration. Credential access spans all OAuth-scoped integrations the operator configured during onboarding.

Key Output Risks

Responses route to external customers across chat, email, and voice with PII redaction via Google DLP but no published output-filtering schema for injected URLs or malicious payloads. The downstream consumer is always an external end-user outside the operator's trust boundary.

Key Monitoring Risks

Tamper-protected audit logs and Watchtower QA scoring are documented, but SIEM forwarding and real-time anomaly alerting are not published as default capabilities. The operator's blind spot is the absence of documented alert thresholds for prompt-injection detection.

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Decagon's composite risk score reflects the tension between a trifecta-elevated attack surface and constrained blast radius moderated by vendor-documented defenses.

AIRQ Metrics

AIRQ Score2.67

Blast Radius2.5

Attack Surface4.8

Defense Controls8

Decagon's attack surface of 4.80, blast radius of 2.50, and defense controls at 8 place it in the Tight Operators quadrant, meaning operators can deploy with targeted input-layer hardening rather than architectural redesign.

Each axis contributes independently: Attack Surface scales to 10, Blast Radius to 10, Defense Controls to 15, and the AIRQ composite normalizes across all three to 15.

Metric	Score	Comments
AIRQ Score	2.67	Low composite indicates that constrained blast radius and vendor defenses substantially offset the trifecta-elevated attack surface for hardening-ready operators.
Blast Radius	2.5 / 10	Blast concentrates in network egress to integrations, OAuth credential scope, and autonomous CRM writes with no code execution or deployment reach.
Attack Surface	4.8 / 10	All three conditions for the X-axis floor are met: untrusted customer input, sensitive CRM data access, and outbound email egress confirmed on default configuration.
Defense Controls	8 / 15	Vendor documents execution isolation and action controls at the managed-SaaS tier but publishes no input-guardrail schema or output-filtering specification.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Decagon's reasoning loop ingests customer messages, CRM records, and persistent memory context as first-class input across chat, email, and voice channels.

Attack Surface Metrics

User Input2

Tool Execution2

External Data2

Orchestration2

Memory3

Inter-Agent1

Reasoning2

Output Processing1

Planning2

Configuration1

Higher scores indicate surfaces where attacker-authored bytes reach the reasoning loop with less filtering; memory scores highest due to cross-session persistence without integrity controls.

Each row maps one input surface to its adjusted score and a one-line comment citing the documented behavior that determines exposure level.

Surface	Score	Comments
User Input	2 / 4	Customer messages from chat, email, and voice channels enter the reasoning loop as first-class prompt context with vendor-claimed bad actor detection as the sole documented filter. [2][4]
External Data	2 / 4	Knowledge base articles and CRM records are ingested via operator-configured connectors scoped to the tenant's integration set. [7]
Memory	3 / 4	Persistent User Memory writes cross-session context automatically from customer interactions without integrity verification or operator-visible poisoning controls. [6]
Reasoning	2 / 4	Multi-LLM inference with provider failover operates within the vendor's managed boundary with no published reasoning-loop constraint or chain-of-thought filtering. [8]
Planning	2 / 4	The platform selects among configured AI Actions and knowledge sources per conversation turn bounded by the operator's integration configuration. [7]
Tool Execution	2 / 4	AI Actions execute writes to connected systems within operator-configured tool boundaries with no shell, code sandbox, or browser execution surface. [7]
Orchestration	2 / 4	Multi-agent routing across specialized sub-agents uses vendor-managed orchestration with no published inter-agent authentication or message-signing. [3][10]
Inter-Agent	1 / 4	Agent-to-agent communication is internal to the platform's orchestration layer with no external MCP peer or third-party agent federation documented by default. [7]
Output Processing	1 / 4	Responses are rendered as plaintext chat, email body, or voice synthesis with PII redaction applied before delivery to external customers. [4]
Configuration	1 / 4	Platform configuration is managed through the vendor console with RBAC and SSO with no operator-facing config file, manifest, or IaC surface exposed. [4]

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Decagon ingests untrusted customer messages across three channels, reads sensitive CRM records and account data via OAuth integrations, and delivers bytes externally through email, chat, and voice responses.

Lethal Trifecta · Complete (3 of 3)

Decagon exhibits all three of these conditions in its documented default configuration:

Untrusted input — Untrusted end-user messages arrive via chat, email, and voice transcription, entering the reasoning loop on every conversation turn without operator-configurable filtering. [4]
Sensitive data — OAuth-scoped integrations grant access to customer records, order histories, and support tickets stored in Salesforce and Zendesk. [7]
External egress — Outbound email, chat responses to external customers, and CRM write-backs deliver bytes outside the operator's trust boundary without per-message approval. [6]

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Decagon agent reaches connected CRM integrations and outbound communication channels but cannot execute code, access filesystems, or touch operator infrastructure.

Blast Radius Metrics

Code execution0

Credential access2

File system access0

Autonomous action2

Network access2

Deployment access0

Higher blast scores indicate broader downstream damage from a single compromised conversation session reaching more connected systems or autonomous actions.

Each row maps one blast factor to its score reflecting the maximum downstream reach a compromised agent session achieves on the default configuration.

Factor	Score	Comments
Code execution	0 / 4	No code sandbox, shell, or browser execution surface is documented; the platform operates as a managed conversation engine without operator-accessible runtime. [8]
File system access	0 / 4	No file read, write, or artifact generation capability is documented; knowledge ingestion occurs through API connectors rather than filesystem access. [7]
Network access	2 / 4	Outbound HTTP to connected integrations is scoped to operator-configured AI Actions with no arbitrary URL fetch or unrestricted web browsing documented. [7]
Credential access	2 / 4	OAuth tokens for CRM integrations and API keys for connected tools are stored within the platform's managed credential vault with short-lived JWT sessions. [4]
Autonomous action	2 / 4	AI Actions fire CRM writes, email sends, and ticket updates on conversation triggers without per-action operator approval; proactive outreach operates on schedules. [6]
Deployment access	0 / 4	No documented access to operator cloud infrastructure, IaC pipelines, or production deployment targets; the platform is a self-contained SaaS. [8]

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. The vendor publishes tenant isolation and per-tool permission scopes at the managed-SaaS tier but provides no public specification for input filtering or output content classification.

Defense Controls Metrics

Input Guardrails1

Execution Isolation2

Action Controls2

Output Guardrails1

Monitoring2

Higher defense scores indicate stronger vendor-implemented safeguards that reduce operator hardening burden; lower scores indicate operator-managed or undocumented controls.

Each component is scored based on what the vendor publishes as implemented by default versus what remains operator-managed or undocumented.

Component	Score	Comments
Input Guardrails	1 / 3	Vendor claims bad actor detection and a supervisor model but publishes no input-filtering schema, no prompt-injection classifier specification, and no operator-configurable rules. [1][4]
Execution Isolation	2 / 3	Per-tenant datastore isolation, rate limiting, and multi-region fault containment are documented with no kernel-level sandbox or network namespace specification published. [5][8]
Action Controls	2 / 3	AI Actions are operator-configured with per-tool toggles and integration-scoped permissions but no per-action human-in-the-loop approval gate on the default configuration. [7]
Output Guardrails	1 / 3	PII redaction via Google DLP integration is documented but no output-filtering schema for injected URLs, payload sanitization, or content classification is published. [4]
Monitoring	2 / 3	Tamper-protected audit logs, Watchtower QA scoring, and conversation analytics are documented but real-time alerting and SIEM forwarding are not published as defaults. [4][9]

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize breaking the trifecta by gating outbound actions and deploying input classifiers, then layering monitoring to close the detection gap.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Input Guardrails

Policy Require manual review of all knowledge base updates before they enter the retrieval pipeline to prevent indirect prompt injection via poisoned articles.
Configuration Enable the vendor's bad actor detection at its strictest sensitivity tier and configure custom blocked-phrase lists for known injection patterns.
Engineering Deploy a pre-processing classifier on the ingestion path that scores inbound messages for prompt-injection probability before they reach the LLM.

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Execution Isolation

Policy Establish a policy requiring the vendor to disclose tenant isolation architecture changes and sandbox boundary specifications in shared-responsibility documentation.
Configuration Configure per-tenant rate limits to their minimum viable thresholds to constrain the blast radius of any single compromised conversation session.
Engineering Instrument the API gateway with request-level tracing to detect anomalous inference patterns that bypass the vendor's managed isolation boundary.

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Action Controls

Policy Require human-in-the-loop approval for all AI Actions that write to production CRM records or send outbound communications to external recipients.
Configuration Restrict AI Action scopes to read-only for all integrations during initial deployment, enabling write permissions only after operational validation.
Engineering Build a middleware proxy between AI Action output and downstream CRM APIs that enforces field-level write permissions and rate limits.

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Output Guardrails

Policy Establish a policy requiring all agent responses containing URLs or external links to pass through a URL reputation check before delivery.
Configuration Configure Google DLP redaction rules to cover financial identifiers, health records, and authentication tokens beyond the default PII entity set.
Engineering Deploy a post-processing content classifier that scans agent responses for injected payloads and suspicious URL patterns before customer delivery.

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Monitoring

Policy Require weekly review of Watchtower QA scores with escalation triggers for conversations scoring below the established quality threshold.
Configuration Configure webhook-based alerting on audit log events matching known attack signatures including repeated system prompt extraction attempts.
Engineering Forward all agent conversation logs and audit events to a centralized SIEM with correlation rules detecting prompt-injection campaigns across sessions.

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

Decagon Responsible Disclosure Program The vendor operates a responsible disclosure program at security@decagon.ai but has published no advisories to date, and no public CVEs exist for the platform.

Selected Research

OWASP Top 10 for LLM Applications Industry framework cataloging prompt injection, excessive agency, system prompt leakage, and sensitive information disclosure risks applicable to conversational AI agents processing untrusted customer input.
OWASP Top 10 for Agentic Applications Framework covering agent goal hijack, tool misuse, identity and privilege abuse, memory poisoning, and insecure inter-agent communication for tool-using autonomous agents.

Vendor Documentation

Decagon Security and Trust Vendor security page documenting RBAC, SSO integration with Okta and Microsoft Entra, short-lived JWT tokens, AES-256 encryption, PII redaction via Google DLP, bad actor detection, supervisor model, Watchtower QA, and tamper-protected audit logs.
Decagon DPA Security Annex Data processing agreement annex confirming SOC 2 Type II certification with Security Trust Service Criteria, annual audit cadence, and data minimization under shared responsibility model.
Proactive Agents Launch Product announcement documenting persistent User Memory with cross-session and cross-channel context, outbound voice capabilities, and Agent Workbench debugging assistant.
Decagon Integrations Platform integrations page documenting CRM sync with Salesforce and Zendesk Sunshine, AI Actions for customer-facing writes, email routing, MCP support, and self-serve API tool connections.
Decagon Infrastructure Resilience Engineering blog documenting multi-region cloud infrastructure, per-tenant throttling and datastore rate limits, multi-LLM provider failover, and fault-isolated subsystem architecture on Google Cloud.

Other Sources

Decagon AI Complete Guide Independent third-party review noting basic user roles per G2 reviewers, shallow audit log visibility, opaque AI decision layer, SOC 2 Type II confirmation, and HIPAA BAA availability.
The Agent Control Plane Venture research note identifying Decagon as an enterprise customer-experience agent platform trusted by Hertz, Chime, and Duolingo across chat, email, and voice channels.