Rezolve.ai Agent Security Risks

Business Process Agents rezolve.ai Exposed Giants
AI RISK QUADRANT POSITION DEFENSE CONTROLS (6) ATTACK SURFACE (5.7) EXPOSED GIANTS FORTIFIED LEADERS HUMBLE PROVIDERS TIGHT OPERATORS
AIRQ Score
5.43
High
Attack Surface
5.7
High
Blast Radius
5.88
High
Defense Controls
6
High
About The Agent

Rezolve.ai is a cloud-hosted, multi-tenant agentic AI service desk that handles IT, HR, and shared-services workflows through Microsoft Teams, Slack, voice, web chat, and email channels. The platform orchestrates seven specialized AI agents via the A2A protocol and connects to over 250 enterprise integrations, with MCP tool governance defaulting to off and an admin-controlled Human Collaboration Board for approval escalation. Its key risk surface centers on multi-channel employee input ingestion, remote PowerShell endpoint execution, and Active Directory credential access, all without independently verified injection defenses.

About the AI Risk Quadrant

Exposed Giants occupy a region where moderate-to-high attack surface exposure exceeds the blast radius the agent can reach. Rezolve.ai lands here because its broad input channels, persistent cross-session memory, and 250+ integration tool surface produce an Attack Surface of 5.70 while its Blast Radius of 5.88 stays below the threshold for the highest-risk quadrant. Defense Controls at 6 out of 15 partially offset the exposure but remain vendor-documented without independent verification, leaving the operator responsible for hardening the gaps the vendor has not publicly tested.

1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Rezolve.ai presents a broad input surface across six employee-facing channels, executes remote scripts on endpoints, and lacks independently verified defenses on its default configuration.

Key Input Risks
Deploy a prompt-injection detection layer before production: Rezolve.ai accepts employee queries across six channels including Microsoft Teams, Slack, voice, web, email, and API without a documented prompt shield at default. The vendor claims pattern-based input scanning, but no independent adversarial testing confirms effectiveness [1].
Key Execution Risks
The platform executes remote PowerShell scripts on employee endpoints via Automation Studio and runs agentic workflows across seven specialized agents using a multi-LLM architecture. Vendor documentation claims sandboxed execution environments, but no public red-team results or independent sandbox testing have been published [2].
Key Action Risks
Require Human Collaboration Board approval for all Active Directory and endpoint operations: proactive device health monitoring, endpoint automation, and background workflows execute autonomously without per-action approval at default. The highest-privilege scope includes AD password-reset authority and stored API tokens for over 250 integrations.
Key Output Risks
The platform emits rich messages through Teams, Slack, email, and web chat channels, with an on-device DLP layer that redacts PII entities before data reaches external LLMs. No documented exfiltration channel blocking or URL sanitization exists for the output path, leaving downstream consumers exposed through rendered output.
Key Monitoring Risks
Rezolve.ai provides structured observability at the agent-interaction, workflow-execution, and MCP-tool-call levels with per-endpoint analytics [8]. No SIEM integration, behavioral anomaly detection, or automated incident-response capability is documented, leaving security operations without alerting on adversarial workflow manipulation.

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Rezolve.ai scores moderately across all four axes, reflecting broad agentic capability with vendor-documented but unverified defense controls.

AIRQ Metrics

Rezolve.ai lands in the Exposed Giants quadrant with an Attack Surface of 5.70, Blast Radius of 5.88, and Defense Controls of 6.

Attack Surface is scored out of 10, Blast Radius out of 10, Defense Controls out of 15, and AIRQ is the composite score out of 15.

Metric Score Comments
AIRQ Score 5.43 Moderate AIRQ reflects a balance between broad agentic capability and partial vendor-documented defenses lacking independent validation.
Blast Radius 5.88 / 10 Remote PowerShell endpoint execution, Active Directory credential access, and autonomous background workflows anchor the blast radius.
Attack Surface 5.7 / 10 Multi-channel input, persistent memory, and 250+ integration tools drive the attack surface, with all three trifecta conditions triggered.
Defense Controls 6 / 15 Vendor-documented controls including MCP default-off governance and Human Collaboration Board approvals lack independent verification [8].

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Rezolve.ai exposes its reasoning loop to employee prompts, knowledge-base content, MCP tool outputs, and uploaded documents across six communication channels.

Attack Surface Metrics

Higher scores indicate more attacker-controlled input channels reaching the agentic reasoning loop without validated filtering or isolation gates.

Each row pairs an attack surface with its band score and analyst commentary citing the vendor-documented evidence that anchors it.

Surface Score Comments
User Input 3 / 4 Employees submit prompts through Teams, Slack, voice, web chat, email, and API channels; vendor claims pattern-based scanning but no independent testing validates injection resistance [6][7].
External Data 3 / 4 The platform ingests data from SharePoint, Confluence, and ServiceNow knowledge bases, uploaded documents, and MCP server outputs across 250+ integrations with no documented content validation gate [10][8].
Memory 3 / 4 Cross-session persistent memory dynamically shares and retains context across channels and tickets with automated learning loops; no integrity verification or poisoning detection is documented [6].
Reasoning 3 / 4 Multi-LLM model-agnostic architecture delegates reasoning to interchangeable external LLMs with partial transparency; no chain-of-thought visibility or alignment verification is published [6].
Planning 3 / 4 Seven specialized agents decompose tasks autonomously via the A2A protocol with Human Collaboration Board approval configurable but not mandatory for all actions [14].
Tool Execution 3 / 4 Automation Studio executes remote PowerShell scripts on endpoints and accesses 250+ enterprise integrations; MCP tools default to off with admin toggle-on governance [12][10].
Orchestration 3 / 4 Agentic workflows support multi-step execution with the A2A protocol, webhook integrations, and background automation including proactive device health monitoring [8].
Inter-Agent 3 / 4 The A2A protocol enables agent-to-agent collaboration and MCP server connections to external ecosystems without documented inter-agent authentication or message integrity verification [14].
Output Processing 2 / 4 An on-device DLP layer strips PII entities from prompts heading to external LLMs; no controls block exfiltration via markdown images, auto-fetch URLs, or redirect chains on the output side [9].
Configuration 2 / 4 Tool governance follows a default-deny model where MCP tools require admin activation; the admin console manages integrations centrally with no auto-loaded project config files [8].

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Rezolve.ai ingests employee prompts and knowledge-base content from untrusted sources, accesses Active Directory credentials and HR records, and communicates externally through six messaging channels and 250+ integration APIs [6].

Lethal Trifecta · Complete (3 of 3)

Rezolve.ai exhibits all three of these conditions in its documented default configuration:

  • Untrusted input — Employee prompts arrive through six channels and MCP tool outputs carry external content into the reasoning loop [6].
  • Sensitive data — The agent accesses Active Directory credentials, employee HR records via Workday, and organizational knowledge bases [10].
  • External egress — Agents send responses through Teams, Slack, email, web chat, voice, and outbound API calls to 250+ integrated services [10].

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. A compromised Rezolve.ai agent can execute PowerShell on remote endpoints, modify Active Directory access roles, and run autonomous background workflows.

Blast Radius Metrics

Higher blast scores indicate the agent can reach more sensitive organizational resources or execute more impactful actions after compromise.

Each row links a blast factor to its band and an explanation describing the organizational resource or capability the agent reaches.

Factor Score Comments
Code execution 3 / 4 Remote PowerShell script execution runs on employee devices through RMM integration or direct connection, granting user-level privileges on each managed endpoint [12][11].
File system access 2 / 4 File access is scoped through integration-level permissions across SharePoint, Confluence, and endpoint automation; no direct unrestricted file system access is documented [10].
Network access 2 / 4 Outbound API calls reach 250+ configured enterprise services; web search is restricted via admin-managed whitelist and blacklist domain controls [10].
Credential access 3 / 4 Stored credentials span Active Directory write authority for password resets and group membership changes, along with API tokens for over 250 integrated enterprise services [10].
Autonomous action 3 / 4 Proactive device health monitoring and endpoint automation execute autonomously; Human Collaboration Board provides optional approval escalation [11].
Deployment access 1 / 4 The agent can install software on endpoints through RMM integration but has no documented access to production infrastructure or cloud deployment services [10].

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Rezolve.ai publishes MCP default-off governance, a Human Collaboration Board, and on-device PII DLP, but none carry independent verification on the default configuration.

Defense Controls Metrics

Higher defense scores indicate stronger vendor-implemented safeguards that reduce the operator's residual hardening burden.

Each component is scored based on what the vendor implements by default versus what the operator must configure or build independently.

Component Score Comments
Input Guardrails 1 / 3 Vendor blog claims input validation and sanitization filters with action-level guardrails for prompt injection defense; no independent adversarial testing or published benchmark results validate these claims [7][3].
Execution Isolation 1 / 3 SOC 2 Type II and ISO 27001 certify organizational controls with downloadable audit reports [4][5]; vendor claims sandboxed execution environments but publishes no specifics on container technology or network restriction [13].
Action Controls 2 / 3 A default-deny MCP governance model requires explicit admin activation per tool; the Human Collaboration Board gates sensitive operations with no documented bypass, though proactive endpoint automations skip per-action approval [8][14].
Output Guardrails 1 / 3 A configurable PII-redaction layer runs on-device with adjustable sensitivity and entity types; no documented controls address exfiltration via rendered output, embedded URLs, or redirect chains in agent responses [9].
Monitoring 1 / 3 Structured observability covers agent interactions, workflow execution, and MCP tool calls with per-endpoint analytics; no SIEM forwarding, behavioral anomaly detection, or automated incident response is documented [8].

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize breaking the trifecta by adding injection detection, restricting autonomous endpoint actions, and forwarding agent logs to a SIEM.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Input Guardrails
  • Policy Require all employee-facing channels to pass through a centralized prompt-injection detection layer before reaching the agentic reasoning loop — counters unvalidated multi-channel input surface.
  • Configuration Configure the DLP layer to operate on both input and output paths with maximum sensitivity and no whitelisting exceptions — counters the input-only PII redaction gap.
  • Engineering Wire an ML-based prompt-injection classifier into the input pipeline and test it against published injection benchmarks before production deployment — counters the absence of independently tested injection resistance.

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Execution Isolation
  • Policy Mandate that all agentic workloads run in isolated containers with network restrictions and file system scoping documented in the internal security policy — counters the unspecified sandbox architecture.
  • Configuration Restrict endpoint automation PowerShell execution to a predefined allowlist of approved scripts and block ad-hoc command execution — counters the broad remote execution surface.
  • Engineering Deploy host-based intrusion detection on all endpoints where Rezolve.ai runs automation scripts to detect anomalous command patterns — counters the absence of runtime execution monitoring.

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Action Controls
  • Policy Mandate approval gates for every Active Directory write operation and credential-bearing integration call — counters the optional approval gap for high-privilege autonomous actions.
  • Configuration Disable proactive device health automation for sensitive endpoint groups until each automation template has been reviewed and approved — counters autonomous endpoint remediation without per-action approval.
  • Engineering Integrate approval workflow telemetry with the change management system to create an auditable record of every autonomous action — counters the absence of action-level audit trails.

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Output Guardrails
  • Policy Establish a policy requiring all agent output channels to block markdown image rendering, auto-fetch URLs, and redirect chains — counters the absence of exfiltration channel controls.
  • Configuration Enable output-side DLP scanning for all channels including Teams, Slack, and email with sensitivity set to maximum — counters the input-only PII redaction posture.
  • Engineering Build an output sanitization proxy that strips or rewrites URLs, embedded images, and external resource references before delivering agent responses — counters the unmitigated rendering injection surface.

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Monitoring
  • Policy Require forwarding of all agent interaction, workflow execution, and MCP tool call logs to the SIEM for centralized alerting — counters the absence of security event correlation.
  • Configuration Configure alerting thresholds for anomalous patterns such as bulk Active Directory modifications or unusual endpoint automation frequency — counters the absence of behavioral anomaly detection.
  • Engineering Instrument the agentic workflow pipeline with honeytokens and canary queries to detect prompt injection attempts and data exfiltration — counters the absence of adversarial activity detection.

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.

Selected Vulnerabilities

  1. OWASP Top 10 for Agentic Applications Class-level framework covering agent goal hijack, tool misuse, identity abuse, memory poisoning
  2. OWASP Top 10 for LLM Applications Class-level framework covering prompt injection, insecure output handling, excessive agency

Selected Research

  1. Design Patterns for Securing LLM Agents against Prompt Injections Academic defense patterns for agentic AI prompt injection resistance (Debenedetti et al.)

Vendor Documentation

  1. Rezolve.ai Trust Center SOC 2 Type II, ISO 27001, GDPR, HIPAA certifications and security documentation
  2. Rezolve.ai Security and Compliance Downloadable ISO 27001 certificate, SOC 2 report, GDPR and HIPAA compliance letters
  3. Agentic AI Platform Multi-agent architecture, cross-session memory, MCP/A2A protocols, multi-LLM design
  4. Is Agentic AI Safe for Enterprise IT Vendor blog on prompt injection mitigations, DLP, sandboxed execution, red-teaming claims
  5. Agentic Studio MCP Governance MCP tool default-off governance, Human Collaboration Board, A2A orchestration
  6. AI Data Loss Prevention On-device PII detection and redaction, customizable entities, sensitivity controls
  7. Rezolve.ai Integrations 250+ out-of-the-box integrations with ITSM, HR, identity, enterprise SaaS platforms
  8. Endpoint Automations Agentless on-device support, RMM integration, proactive health monitoring
  9. Automation Studio Desktop automation with remote PowerShell execution, process automation workflows

Other Sources

  1. Rezolve.ai ISO 27001 Certification Press release confirming ISO 27001 certification by Shamkris Global
  2. Rezolve.ai Multi-Agent Human Collaboration A2A protocol details, Human Collaboration Board approval workflows, MCP governance