1 Key Risks

This section summarizes the most critical security risks an operator inherits when deploying this agent in its documented default configuration. Cleric presents moderate input risk from persistent, unverified memory; elevated credential exposure from passthrough tokens; and monitoring that requires operators to build their own alerting pipeline.

Key Input Risks

Third-party alert payloads from integrated monitoring tools enter the Cleric reasoning loop as untrusted input in the default configuration. Persistent operational memory accumulates this context across sessions without documented integrity verification or expiration [1].

Key Execution Risks

The agent has no shell or code-execution sandbox; it runs read-only diagnostic queries against operator infrastructure via passthrough credentials. No public red-team results or independent penetration findings exist for the execution boundary [6].

Key Action Risks

Background scanning triggers autonomous investigations without per-invocation operator approval in the default configuration. Credential passthrough grants access to operator API keys, IAM roles, and service account tokens across integrated platforms [6].

Key Output Risks

Investigation results are delivered as structured reports within the platform with no documented DLP or output redaction. The investigation output channel can surface sensitive infrastructure state to any operator with platform access [7].

Key Monitoring Risks

The vendor documents investigation audit trails and SOC 2 Type II operational monitoring for platform-level events. Operators must configure their own SIEM forwarding and anomaly detection for agent-level behavior; neither ships enabled by default [5].

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Cleric scores as a moderate-risk agent whose credential passthrough and persistent memory elevate attack surface above its otherwise constrained blast footprint.

AIRQ Metrics

AIRQ Score: 4.42
Blast Radius: 4.13
Attack Surface: 4.8
Defense Controls: 8

Cleric lands in Tight Operators with Attack Surface 4.80, Blast Radius 4.13, and Defense Controls 8, placing it below the attack median with above-average defensive posture.

Attack Surface is scored out of 10, Blast Radius out of 10, Defense Controls out of 15, and AIRQ composite out of 15.

Metric	Score	Comments
AIRQ Score	4.42	Moderate composite driven by credential exposure offset by read-only defaults and documented isolation.
Blast Radius	4.13 / 10	Credential passthrough dominates; code execution and deployment remain low from read-only posture.
Attack Surface	4.8 / 10	Memory at 3.0 is the sole high-scorer; trifecta-complete posture triggers the 4.8 floor above the raw weighted average.
Defense Controls	8 / 15	Vendor documents isolation and action controls; input filtering and output guardrails lack published detail.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Cleric's reasoning loop processes operator queries, third-party alert payloads, and persistent memory state as first-class input on every investigation cycle.

Attack Surface Metrics

User Input: 2
Tool Execution: 2
External Data: 2
Orchestration: 2
Memory: 3
Inter-Agent: 1
Reasoning: 2
Output Processing: 2
Planning: 2
Configuration: 2

Higher scores reflect surfaces where untrusted or unverified content enters the reasoning loop with broader scope or less operator visibility.

Each row scores one of ten canonical attack surfaces from 0 to 4 based on documented default exposure and evidence quality.

Surface	Score	Comments
User Input	2 / 4	Operator-authored natural-language queries enter the reasoning loop with no third-party marketplace or public-facing input channel documented [8].
External Data	2 / 4	Alert payloads and telemetry from integrated monitoring platforms enter via operator-configured connections [6].
Memory	3 / 4	Persistent operational memory retains learned investigation patterns indefinitely with no published integrity checks or time-based eviction controls [1].
Reasoning	2 / 4	Hypothesis-tree reasoning with confidence scoring constrains investigation paths; no published testing of resistance to reasoning-loop manipulation is available [8].
Planning	2 / 4	Multi-step investigation plans generated from the hypothesis tree are operator-visible but not operator-gated before execution [8].
Tool Execution	2 / 4	Read-only diagnostic queries execute against operator infrastructure through passthrough credentials with no shell or code sandbox [6].
Orchestration	2 / 4	Single-agent orchestration within the platform; background scanning runs on schedule without per-invocation approval [4][10].
Inter-Agent	1 / 4	No documented multi-agent communication or delegation to external agent frameworks in the default configuration [7].
Output Processing	2 / 4	Structured investigation reports rendered within the platform with no documented deserialization of untrusted external content [7].
Configuration	2 / 4	Integration credentials and write-access toggles managed through the console with no documented least-privilege templates [6].

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session. Together, these three conditions turn an isolated prompt injection into full-chain exfiltration. In the default configuration, Cleric ingests third-party alert payloads, reads private infrastructure telemetry via passthrough credentials, and connects outbound to operator cloud APIs.

Lethal Trifecta · Complete (3 of 3)

Cleric exhibits all three of these conditions in its documented default configuration:

Untrusted input — Third-party monitoring integrations deliver alert payloads authored by parties outside the operator's direct control into the reasoning loop [6].
Sensitive data — Passthrough credentials grant read access to private infrastructure telemetry, application logs, and cloud metadata scoped to operator IAM roles [6].
External egress — Outbound connections to operator cloud APIs and integrated monitoring platforms terminate at third-party endpoints outside the operator's network perimeter [6].
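
The three-condition test above can be expressed as a small predicate. This is a minimal sketch, not the report's scoring code: the `TrifectaPosture` class and its field names are hypothetical, and the boolean values simply restate the three findings listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrifectaPosture:
    """Hypothetical record of the three trifecta conditions for one configuration."""

    untrusted_input: bool   # third-party alert payloads reach the reasoning loop
    sensitive_data: bool    # passthrough credentials expose private telemetry
    external_egress: bool   # outbound connections terminate at third-party endpoints

    def conditions_met(self) -> int:
        # Booleans sum as 0/1, so this counts how many conditions hold.
        return sum((self.untrusted_input, self.sensitive_data, self.external_egress))

    def is_complete(self) -> bool:
        return self.conditions_met() == 3


# Default configuration as described in this section: all three conditions hold.
cleric_default = TrifectaPosture(True, True, True)

# Removing any single condition (here, egress) breaks the trifecta.
egress_restricted = TrifectaPosture(
    untrusted_input=True, sensitive_data=True, external_egress=False
)

print(cleric_default.conditions_met(), cleric_default.is_complete())
print(egress_restricted.conditions_met(), egress_restricted.is_complete())
```

The second instance illustrates why removing one condition is enough: the predicate requires all three to hold at once.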

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Cleric agent reaches operator cloud credentials and monitoring platform tokens but cannot execute arbitrary code or modify deployed infrastructure.

Blast Radius Metrics

Code execution: 1
Credential access: 3
File system access: 1
Autonomous action: 2
Network access: 2
Deployment access: 1

Higher blast scores indicate broader downstream impact when the agent's access is misused or its reasoning is compromised.

Each row maps one blast factor to the agent's documented default access scope and the workflow nodes that consume it.

Factor	Score	Comments
Code execution	1 / 4	No shell, browser, or sandbox execution documented; the agent issues read-only API calls rather than arbitrary code [6].
File system access	1 / 4	No local or remote file-system write capability documented; investigation artifacts stay within platform storage [7].
Network access	2 / 4	Outbound connections to operator cloud APIs and monitoring platforms for diagnostic queries with no arbitrary URL fetch [6].
Credential access	3 / 4	Credential passthrough exposes the full scope of configured integration tokens including cloud IAM bindings and platform service accounts [6].
Autonomous action	2 / 4	Background scanning triggers investigations on schedule; write operations require explicit operator opt-in before execution [7].
Deployment access	1 / 4	No documented capability to modify infrastructure, deploy code, or alter cloud resource configurations in the default posture [8].

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Cleric publishes single-tenant isolation and read-only action defaults but leaves input filtering and output redaction undocumented in the default configuration.

Defense Controls Metrics

Input Guardrails: 1
Execution Isolation: 2
Action Controls: 2
Output Guardrails: 1
Monitoring: 2

Higher defense scores indicate stronger vendor-implemented safeguards that reduce operator hardening burden on the default deployment.

Each component is scored 0 to 3 based on whether the control is vendor-implemented, operator-configurable, or absent from documentation.

Component	Score	Comments
Input Guardrails	1 / 3	No documented prompt-injection filtering or input validation; operators must deploy their own classifier or WAF-layer filtering before alert payloads reach the agent [1][2].
Execution Isolation	2 / 3	Single-tenant dedicated infrastructure with SOC 2 Type II certified isolation boundaries between customer environments [1].
Action Controls	2 / 3	Read-only default posture with explicit opt-in for write operations; Generic MCP tools carry a vendor warning about unverifiable sources [3][6].
Output Guardrails	1 / 3	No documented DLP, output redaction, or URL sanitization for investigation reports delivered to operators [7].
Monitoring	2 / 3	Vendor provides per-investigation audit logging and compliance-certified operational oversight; real-time alerting and SIEM integration remain operator-managed [5].
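
As a quick arithmetic check, the five component scores in the table add up to the headline Defense Controls figure of 8 out of 15. The report does not state its aggregation rule, so treating the headline as an unweighted sum is an assumption that merely happens to match the published numbers.

```python
# Component scores copied from the Defense Controls table (each scored 0 to 3).
defense_components = {
    "Input Guardrails": 1,
    "Execution Isolation": 2,
    "Action Controls": 2,
    "Output Guardrails": 1,
    "Monitoring": 2,
}
MAX_PER_COMPONENT = 3

# Assumption: the headline score is the plain sum of the components.
defense_total = sum(defense_components.values())
defense_max = MAX_PER_COMPONENT * len(defense_components)

# The lowest-scoring components are the natural first targets for hardening.
lowest = min(defense_components.values())
weakest = sorted(name for name, score in defense_components.items() if score == lowest)

print(f"Defense Controls: {defense_total} / {defense_max}")
print("Weakest components:", ", ".join(weakest))
```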

6 Hardening Tips

This section lists concrete actions an operator can take to reduce the risks reported above, grouped by the defense control each tip strengthens. Operators should prioritize breaking the trifecta by restricting credential passthrough scope and gating autonomous investigation triggers.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Policy: Require security review of all third-party monitoring integrations before connecting alert channels to the agent.
Configuration: Configure integration-level input validation rules to reject malformed or oversized alert payloads at ingestion.
Engineering: Deploy a prompt-injection classifier on inbound alert payloads before they reach the agent reasoning loop.
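
The ingestion-time validation tip could look roughly like the sketch below, which an operator would run in front of the agent. It assumes alert payloads arrive as JSON; the size limit and the required field names are hypothetical placeholders, not part of any documented Cleric schema. It covers only malformed and oversized payloads and is not a prompt-injection classifier.

```python
import json

MAX_PAYLOAD_BYTES = 64 * 1024                        # assumed limit; tune per integration
REQUIRED_FIELDS = {"source", "severity", "message"}  # hypothetical alert schema


def validate_alert_payload(raw: bytes) -> dict:
    """Return the parsed payload, or raise ValueError if it should be rejected."""
    # Check size before parsing so oversized input is never decoded.
    if len(raw) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds size limit")
    try:
        payload = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("payload is not valid UTF-8 JSON") from exc
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    missing = REQUIRED_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"payload missing fields: {sorted(missing)}")
    return payload


accepted = validate_alert_payload(
    b'{"source": "pager", "severity": "high", "message": "disk full"}'
)
print(accepted["severity"])
```

Rejecting by raising `ValueError` keeps the decision at the ingestion boundary, so a rejected payload never reaches downstream code.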

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Policy: Restrict the agent's network egress to an explicit allowlist of operator-approved API endpoints per CSA control-plane guidance [9].
Configuration: Enable VPC peering or private endpoints for all cloud API connections to eliminate public-internet traversal.
Engineering: Instrument the execution boundary with request-level logging to detect anomalous diagnostic query patterns.
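
An egress allowlist of the kind described in the policy tip can be sketched as an exact-hostname check. This assumes the operator enforces it in a proxy or gateway they control; the hostnames are placeholders, and nothing here reflects a documented Cleric setting.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the operator's approved API hosts.
ALLOWED_HOSTS = frozenset({"monitoring.example.com", "cloud-api.example.com"})


def is_egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname or ""
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are denied.
        return False
    # Exact match only: suffix or substring matching would admit look-alike hosts.
    return parts.scheme == "https" and host in ALLOWED_HOSTS


print(is_egress_allowed("https://monitoring.example.com/api/v1/query"))
print(is_egress_allowed("https://monitoring.example.com.evil.test/api"))
```

Matching on the parsed hostname rather than the raw string means userinfo tricks such as `https://monitoring.example.com@evil.test/` resolve to the real target host and are denied.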

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Policy: Mandate dual-approval gates for any transition from read-only to write-enabled integration access.
Configuration: Disable the Generic MCP integration entirely unless the operator has independently verified the third-party tool.
Engineering: Implement per-integration scope restrictions limiting credential passthrough to minimum required permissions.
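
One way to audit credential scope per integration is to compare what a token is granted against the minimum the integration needs. The sketch below is illustrative only: the integration names and permission strings are invented, since real scope names depend on each integrated platform.

```python
# Hypothetical minimum scopes per integration; real names vary by platform.
REQUIRED_SCOPES = {
    "metrics": {"metrics:read"},
    "logs": {"logs:read"},
}


def excess_scopes(integration: str, granted: set[str]) -> set[str]:
    """Return granted scopes that exceed the integration's required minimum."""
    # Unknown integrations have no approved scopes, so everything is excess.
    required = REQUIRED_SCOPES.get(integration, set())
    return set(granted) - required


print(sorted(excess_scopes("logs", {"logs:read", "logs:write"})))
print(sorted(excess_scopes("metrics", {"metrics:read"})))
```

A non-empty result is a signal to narrow the token before it is handed to the agent.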

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Policy: Require DLP scanning of all investigation reports before delivery to downstream consumers.
Configuration: Configure output redaction rules to mask sensitive infrastructure identifiers in exported investigation results.
Engineering: Wire a content-classification layer that flags investigation outputs containing credentials or secrets before delivery.
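
Output redaction can be prototyped with a few regular expressions applied to exported report text. The two patterns below (AWS-style access key IDs and addresses in the 10.0.0.0/8 private range) are illustrative assumptions about what an operator might mask; a production DLP layer would need a far broader rule set.

```python
import re

# Illustrative patterns only; extend to match the operator's own identifiers.
REDACTION_PATTERNS = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY_ID]"),
    # IPv4-shaped strings starting with 10. (octet ranges are not validated).
    (re.compile(r"\b10\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"), "[REDACTED_PRIVATE_IP]"),
]


def redact_report(text: str) -> str:
    """Apply every redaction pattern to the report text in order."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


sample = "Key AKIAIOSFODNN7EXAMPLE was used from 10.0.12.7 to reach 8.8.8.8"
print(redact_report(sample))
```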

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Policy: Require forwarding of all Cleric investigation audit trails to the operator's centralized SIEM platform.
Configuration: Configure alerting thresholds for anomalous investigation frequency or unusual credential-access patterns.
Engineering: Instrument the agent's API connections with request-level telemetry feeding into the operator's observability stack.
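
A frequency threshold for investigations can be sketched as a sliding-window counter over audit-trail timestamps. The class name, the threshold, and the window length are all hypothetical; the sketch assumes the operator already forwards investigation start events with timestamps, in order, to their own tooling.

```python
from collections import deque


class InvestigationRateMonitor:
    """Flags when more than `max_events` investigations start within the window."""

    def __init__(self, max_events: int, window_seconds: float) -> None:
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one investigation start; return True if the threshold is exceeded."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events


# Assumed threshold: more than 3 investigations in 60 seconds is anomalous.
monitor = InvestigationRateMonitor(max_events=3, window_seconds=60)
burst = [monitor.record(t) for t in (0, 10, 20, 30)]
later = monitor.record(200)
print(burst, later)
```

The same structure could be keyed per credential or per integration to cover unusual credential-access patterns.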

7 References

This section lists the evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [6][7]) refer to entries below, listed in citation order.

Selected Vulnerabilities

[1] Cleric Security and Data Privacy. Vendor-published security architecture documenting SOC 2 Type II certification, dedicated infrastructure isolation, encryption posture, and vulnerability remediation SLAs.

Selected Research

[2] OWASP LLM01 Prompt Injection. OWASP Top 10 for LLM Applications entry defining prompt injection as the top risk for LLM-integrated systems processing untrusted input with tool-calling capability.
[3] OWASP LLM06 Sensitive Information Disclosure. OWASP entry documenting how LLM-integrated systems can inadvertently expose sensitive data through output generation and tool-mediated access.
[4] MITRE ATLAS Agentic AI Techniques. Guide mapping fourteen MITRE ATLAS agentic attack techniques, including Agent Context Poisoning and Exfiltration via Tool Invocation, to SOC detection patterns.

Vendor Documentation

[5] Cleric Trust Center. Vendor trust center providing access to the SOC 2 Type II audit report, penetration test findings, and organizational security policies.
[6] Cleric Integrations Overview. Vendor documentation describing the credential model, read-only default posture, supported integrations, and the Generic MCP warning about unverifiable tools.
[7] Cleric FAQ. Vendor FAQ documenting single-tenant deployment, investigation audit trails, data non-persistence, and the read-only access guarantee with opt-in write exceptions.
[8] Cleric Product Overview. Vendor product page documenting the hypothesis-tree investigation workflow, read-only deployment model, and the operational memory architecture.

Other Sources

[9] CSA Securing the Agentic Control Plane. Cloud Security Alliance guidance on identity-first controls, runtime authorization, and continuous assurance requirements for autonomous AI agent ecosystems.
[10] ZenML LLMOps Database Entry for Cleric. Independent database entry documenting Cleric architecture, including multi-layer knowledge graph, background scanning, confidence scoring, and security constraints.