1 Key Risks

The most critical security risks an operator inherits when deploying this agent in its documented default configuration. ServiceNow AI Agents carry critical confirmed vulnerabilities on the tool execution and configuration surfaces alongside demonstrated prompt injection bypasses of default protections.

Key Input Risks

Untrusted content from record fields, external agent invocations, and user prompts reaches the reasoning loop through multiple channels. Independent research demonstrated second-order prompt injection succeeding with prompt injection protection enabled, granting CRUD access to enterprise records and enabling exfiltration via external email.

Key Execution Risks

Agents execute within a sandbox environment where a confirmed vulnerability enabled unauthenticated remote code execution via improper isolation. Role masking and least-privilege scoping require deliberate configuration per agent workflow.

Key Action Risks

Agents perform record modifications, send external email, and trigger workflow automations scoped by the invoking identity. A confirmed privilege escalation vulnerability allowed unauthenticated attackers to command agents to create admin accounts.

Key Output Risks

Agent outputs include external email and API responses. Regex-based sensitive data masking is available, but no documented channel-level control blocks outbound exfiltration. Independent research demonstrated exfiltration of record data via email.
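To illustrate what regex-based masking does and does not cover, here is a minimal sketch. The patterns and the `mask_sensitive` helper are hypothetical and are not the platform's Data Privacy handler; they only show the general technique of pattern-based redaction.

```python
import re

# Hypothetical patterns for illustration; a real deployment would tune these
# to its own record fields and identifiers.
MASK_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def mask_sensitive(text: str) -> str:
    """Replace every pattern match with a labelled placeholder."""
    for label, pattern in MASK_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} MASKED]", text)
    return text
```

Values that match no configured pattern, such as a custom employee identifier, pass through unchanged. Masking of this kind is therefore not a substitute for blocking the outbound channel itself.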

Key Monitoring Risks

Governance dashboards and audit log export provide observability over agent actions, but real-time behavioral anomaly detection is not a default capability. As a minimum monitoring baseline, operators should configure SIEM correlation rules for abnormal CRUD velocity and cross-team agent invocations.
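A CRUD velocity rule can be approximated with a sliding window over exported audit events. The sketch below is an assumption-laden illustration: the `CrudVelocityMonitor` class, the agent identifier, and the threshold values are hypothetical, and a production rule would be written in the SIEM's own query language.

```python
from collections import defaultdict, deque


class CrudVelocityMonitor:
    """Flag agents that exceed max_ops CRUD operations within window_seconds."""

    def __init__(self, max_ops: int, window_seconds: float):
        self.max_ops = max_ops
        self.window = window_seconds
        self._events = defaultdict(deque)  # agent id -> recent timestamps

    def record(self, agent_id: str, timestamp: float) -> bool:
        """Record one operation; return True if the agent is over threshold."""
        events = self._events[agent_id]
        events.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window:
            events.popleft()
        return len(events) > self.max_ops
```

For example, with `CrudVelocityMonitor(max_ops=3, window_seconds=60)`, the fourth operation by the same agent inside one minute returns `True`, while an operation several minutes later starts from a clean window.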

2 AIRQ Scores

The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. The composite score reflects a high attack surface driven by critical confirmed vulnerabilities partially offset by moderate vendor-implemented controls requiring deliberate configuration.

AIRQ Metrics

AIRQ Score: 5.95

Blast Radius: 5.5

Attack Surface: 6.62

Defense Controls: 8

Operators deploying ServiceNow AI Agents inherit a high attack surface from confirmed critical vulnerabilities in sandbox isolation and access control, and that exposure outpaces the available vendor controls. Role masking, supervised execution, and prompt injection detection exist, but demonstrated bypasses and default configuration weaknesses leave gaps that deliberate hardening must close before enterprise data flows through agent workflows.

Each score combines evidence from confirmed vulnerabilities, independent research, and vendor documentation to produce a composite risk posture.

Metric	Score	Comments
AIRQ Score	5.95	Moderate defense investments partially counterbalance a high attack surface with confirmed critical vulnerabilities against the platform itself, producing a composite in the lower band.
Blast Radius	5.5 / 10	Enterprise record access and external communication channels drive a moderate blast radius, with the network and credential factors in the upper band.
Attack Surface	6.62 / 10	Confirmed sandbox escape and platform access control failures push the composite into the high band alongside demonstrated prompt injection bypasses of vendor protections.
Defense Controls	8 / 15	Vendor documents role masking, supervised execution, prompt injection protection, and platform monitoring, but demonstrated bypasses limit the effective protection to the moderate band.

3 Attack Surface

Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Two surfaces reach the adjusted ceiling driven by critical confirmed vulnerabilities in sandbox isolation and default platform configuration, with demonstrated prompt injection bypasses elevating the input and inter-agent surfaces.

Attack Surface Metrics

User Input: 4

Tool Execution: 5

External Data: 3

Orchestration: 3

Memory: 1

Inter-Agent: 4

Reasoning: 2

Output Processing: 2

Planning: 3

Configuration: 5

Two surfaces sit at the adjusted ceiling of 5.0; three additional surfaces carry evidence penalties from confirmed vulnerabilities or demonstrated attacks.

Each row maps one attack surface to the structural exposure documented from confirmed vulnerabilities, independent research, and vendor architecture.

Surface	Score	Comments
User Input	4 / 4	Multiple input channels including chat interfaces, API triggers, and protocol-layer invocations carry untrusted content into the reasoning loop. CVE-2025-12420 enabled unauthenticated agent interaction via a hidden AI topic that bypassed normal access controls. [1]
External Data	3 / 4	Agents ingest record field data populated by other users and external agents as second-order input, creating an injection vector where attacker-controlled values reach the reasoning loop without content validation. [9]
Memory	1 / 4	Session-scoped conversation context with no persistent cross-session memory; knowledge retrieval operates as read-only augmentation from a managed knowledge base. [9]
Reasoning	2 / 4	Multi-step reasoning via the AI Orchestrator with chain-of-thought planning is constrained to the configured workflow scope and available tool set. [9]
Planning	3 / 4	Autonomous task decomposition with delegation to subagents; supervised execution is configurable per tool but setting sn_aia.enable_usecase_tool_execution_mode_override to true bypasses per-tool approval gates. [10]
Tool Execution	5 / 4	Agents execute within a sandbox environment where CVE-2026-0542 (CVSS 9.2) demonstrated unauthenticated remote code execution via improper isolation, granting attackers code execution within the platform boundary. [2]
Orchestration	3 / 4	Multi-agent workflows coordinate through the AI Agent Orchestrator with event-driven triggers; role masking enforces per-tool ACLs but coordination pathways expand the transitive attack surface. [10]
Inter-Agent	4 / 4	Agent-to-agent discovery via the protocol layer allows agents to recruit other agents for privileged operations. CVE-2025-12420 exploitation used a hidden AI topic to recruit agents for admin account creation. [1] [5]
Output Processing	2 / 4	Virtual Agent output includes a Data Privacy handler for regex-based sensitive data masking before content reaches the language model, but no documented exfiltration-channel blocking operates on agent action outputs. CVE-2025-11449 demonstrated reflected XSS enabling arbitrary code execution in user browsers via crafted links. [3] [7]
Configuration	5 / 4	Default team grouping creates unintended collaboration pathways between agents with different privilege levels. CVE-2025-12420 (CVSS 9.3) exploited a hidden AI topic and default settings to enable full tenant compromise. [1] [6]

The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Record fields populated by external users feed the reasoning loop whose outputs route through unrestricted email and REST channels carrying enterprise data.

Lethal Trifecta · Complete (3 of 3)

ServiceNow AI Agents exhibits all three of these conditions in its documented default configuration:

Untrusted input — Record fields, A2A protocol invocations, and Virtual Agent prompts deliver attacker-controlled content into the reasoning loop. [4]
Sensitive data — Agents operate on enterprise records spanning incidents, HR cases, customer data, and configuration items scoped by user permissions. [1]
External egress — External email, outbound REST API calls, MCP server connections, and A2A invocations provide egress channels with no content-level DLP blocking. [4]
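The three conditions above can be expressed as a simple per-session check. This is a minimal sketch: the `SessionProfile` type and its field names are hypothetical, and how each flag is derived from real telemetry is left as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionProfile:
    """Hypothetical summary of what an agent session touched."""
    untrusted_input: bool   # e.g. record fields written by other users
    sensitive_data: bool    # e.g. HR cases or customer records
    external_egress: bool   # e.g. external email or outbound REST calls


def trifecta_conditions_met(session: SessionProfile) -> int:
    """Count how many of the three conditions hold for this session."""
    return sum((session.untrusted_input,
                session.sensitive_data,
                session.external_egress))


def is_lethal_trifecta(session: SessionProfile) -> bool:
    """True only when all three conditions hold in the same session."""
    return trifecta_conditions_met(session) == 3
```

Removing any one condition, for example by blocking external egress for a workflow, breaks the full chain even if the other two remain.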

4 Blast Radius

The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. Network access and credential exposure drive the upper bands while the remaining factors occupy moderate positions reflecting sandbox containment and approval gate availability.

Blast Radius Metrics

Code execution: 2

Credential access: 3

File system access: 1

Autonomous action: 2

Network access: 3

Deployment access: 2

Two factors sit at the upper band of 3; no factors reach the maximum, reflecting partial containment from sandbox isolation and configurable approval gates.

Each factor maps a post-compromise capability to the specific platform mechanisms that scope or enable it.

Factor	Score	Comments
Code execution	2 / 4	Sandbox execution within the platform environment; CVE-2026-0542 demonstrated escape enabling code execution but scope remained within platform boundaries rather than the underlying host. [2]
File system access	1 / 4	Application-layer record access scoped by ACLs provides no raw file system interaction; agents read and write platform records rather than filesystem paths. [9]
Network access	3 / 4	External REST API calls, email send, MCP server connections, and A2A outbound invocations provide broad network access from agent actions without documented channel restrictions. [10]
Credential access	3 / 4	Agents operate with the invoking user's permissions or a dedicated AI user; CVE-2025-12420 demonstrated impersonation granting access to any entitled operations including credential-equivalent capabilities. [1] [11]
Autonomous action	2 / 4	Supervised execution mode is available per tool with the autonomous override disabled by default; the default posture is safe but operators must monitor for configuration drift that would remove approval gates. [10]
Deployment access	2 / 4	Agents trigger workflow automations including change management processes with approval gates; deployment access is indirect through workflow execution rather than direct infrastructure control. [9]

5 Defense Controls

Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Role masking and supervised execution ship as configurable controls while prompt injection protection and sensitive data masking operate at reduced effectiveness against demonstrated attack techniques.

Defense Controls Metrics

Input Guardrails: 1

Execution Isolation: 2

Action Controls: 2

Output Guardrails: 1

Monitoring: 2

Higher defense scores indicate stronger vendor-implemented safeguards; the total reflects moderate protection requiring deliberate operator configuration to reach full effectiveness.
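As a quick arithmetic check, the Defense Controls total of 8 / 15 reported in this profile is consistent with a plain sum of the five component scores, each out of 3. The sketch below assumes that simple summation; the report does not state the aggregation formula, and the other composite scores are not reproduced here.

```python
# Component scores as listed in this profile, each out of 3.
DEFENSE_SCORES = {
    "Input Guardrails": 1,
    "Execution Isolation": 2,
    "Action Controls": 2,
    "Output Guardrails": 1,
    "Monitoring": 2,
}
MAX_PER_COMPONENT = 3


def defense_total(scores: dict) -> tuple:
    """Return (total, maximum) assuming an unweighted sum of components."""
    return sum(scores.values()), MAX_PER_COMPONENT * len(scores)


total, maximum = defense_total(DEFENSE_SCORES)
print(f"Defense Controls: {total} / {maximum}")
```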

Each component is scored based on whether the vendor implements a control by default and whether independent testing has confirmed its effectiveness.

Component	Score	Comments
Input Guardrails	1 / 3	Prompt injection detection exists in configurable block or log mode but second-order injection via record field manipulation succeeded with protection enabled in a controlled test environment. [4]
Execution Isolation	2 / 3	Multi-instance, logically single-tenant architecture with dedicated databases and a sandbox execution environment; the sandbox bypass confirmed as CVE-2026-0542 is now patched. [9]
Action Controls	2 / 3	Supervised execution mode and role masking restrict agent permissions to the intersection of invoking user roles and admin-defined approved lists; autonomous override is disabled by default. [10]
Output Guardrails	1 / 3	Data Privacy handler provides regex-based sensitive data masking before language model calls but no documented DLP or exfiltration-channel blocking covers agent action outputs. [9]
Monitoring	2 / 3	AI Control Tower provides governance dashboards with audit log export capability for SIEM integration; behavioral anomaly detection requires supplemental tooling beyond what the platform provides by default. [8] [12]
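The Action Controls row describes role masking as an intersection of the invoking user's roles and an admin-approved list. A minimal sketch of that set logic follows; the `effective_roles` helper and the role names are illustrative assumptions, not the platform's implementation.

```python
def effective_roles(invoking_user_roles: set[str],
                    approved_roles: set[str]) -> set[str]:
    """Roles the agent may use: only those present in both sets."""
    return invoking_user_roles & approved_roles


# Illustrative role names only.
user_roles = {"admin", "itil"}
approved_for_workflow = {"itil", "knowledge"}
print(effective_roles(user_roles, approved_for_workflow))
```

Under this model an administrator invoking the agent does not pass the `admin` role through unless it is explicitly on the approved list, which is why an empty or overly broad approved list changes the outcome so sharply.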

6 Hardening Tips

Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Forcing supervised execution on all write tools, deploying outbound DLP on agent communication channels, and enabling prompt injection detection in blocking mode deliver the highest-leverage defense improvements.

Input Guardrails

Input guardrails intercept adversarial content before it reaches the reasoning loop.

Policy: Require prompt injection protection in block mode for all production agent workflows. Counters User Input, scored 4 / 4, where a protection bypass was demonstrated.
Configuration: Enable sensitivity detection with custom patterns covering enterprise-specific terminology and record field names. Counters External Data at the upper band, exposed via second-order injection.
Engineering: Deploy a dedicated prompt injection classifier upstream of the agent reasoning loop with independent detection logic. Counters User Input, where independent research bypassed the built-in protection.

Execution Isolation

Execution isolation contains what a compromised agent can do on the host.

Policy: Require role masking on all agentic workflows, with minimum-privilege role sets reviewed quarterly. Counters the Credential access blast radius factor at the upper band.
Configuration: Configure dedicated AI users with scoped roles for each agent workflow instead of dynamic user inheritance. Counters Configuration at the adjusted ceiling, driven by impersonation risk.
Engineering: Build a custom pre-execution hook that validates agent tool call parameters against an allowlist before sandbox dispatch. Counters Tool Execution at the adjusted ceiling, where sandbox isolation was bypassed.
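The pre-execution hook in the Engineering tip could take roughly the following shape. This is a sketch under stated assumptions: the tool name, parameter names, and allowlist contents are hypothetical, and wiring the check into the platform's dispatch path is not shown.

```python
# Hypothetical allowlist: tool name -> parameter name -> permitted values.
ALLOWED_TOOL_PARAMS = {
    "update_record": {
        "table": {"incident"},
        "field": {"state", "work_notes"},
    },
}


def validate_tool_call(tool: str, params: dict) -> list[str]:
    """Return a list of violations; an empty list means the call may proceed."""
    policy = ALLOWED_TOOL_PARAMS.get(tool)
    if policy is None:
        return [f"tool not allowlisted: {tool}"]
    violations = []
    for name, value in params.items():
        allowed = policy.get(name)
        if allowed is None:
            violations.append(f"unexpected parameter: {name}")
        elif value not in allowed:
            violations.append(f"value not allowed for {name}: {value!r}")
    return violations
```

The check fails closed: a tool or parameter that is absent from the allowlist is rejected rather than passed through, so newly added tools must be reviewed before agents can call them.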

Action Controls

Action controls govern which tools and actions the agent can invoke autonomously.

Policy: Require supervised execution mode for all tools that perform write operations or external communications. Counters autonomous action risk.
Configuration: Disable the autonomous override system property (sn_aia.enable_usecase_tool_execution_mode_override) and implement configuration drift monitoring for security-critical settings. Counters Configuration at the adjusted ceiling.
Engineering: Implement approval workflows for high-privilege agent actions, including account creation and role assignment. Counters Credential access, where impersonation grants full entitlements.
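Configuration drift monitoring can be as simple as comparing current property values against an expected baseline. In the sketch below the property name comes from this profile, while the string value `"false"`, the `detect_drift` helper, and the way current values are obtained are assumptions made for illustration.

```python
# Expected baseline for security-critical properties.
# The "false" string representation is an assumption.
EXPECTED_PROPERTIES = {
    "sn_aia.enable_usecase_tool_execution_mode_override": "false",
}


def detect_drift(current: dict[str, str]) -> dict[str, tuple]:
    """Map each drifted property to (expected, actual); missing counts as drift."""
    drift = {}
    for name, expected in EXPECTED_PROPERTIES.items():
        actual = current.get(name)
        if actual != expected:
            drift[name] = (expected, actual)
    return drift
```

Treating a missing property as drift is a deliberate choice here: a value that cannot be read should prompt review rather than be assumed safe.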

Output Guardrails

Output guardrails inspect what the agent sends to other systems and users.

Policy: Require DLP controls on all outbound email and REST API calls initiated by AI agent workflows. Counters Output Guardrails at the lower band, which has no channel blocking.
Configuration: Configure Data Privacy patterns to cover all sensitive record fields accessed by agents, including custom business-specific identifiers. Counters Output Guardrails, which relies on regex-only masking.
Engineering: Restrict agent email send capabilities to approved recipient domains, with allowlist enforcement at the platform level. Counters the external egress demonstrated via email exfiltration.
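A recipient domain allowlist reduces to an exact match on the domain part of each address. The following sketch assumes a hypothetical approved set and helper names; it does not describe a platform feature, and real enforcement would sit in the mail-sending path.

```python
# Hypothetical approved domains for agent-initiated email.
APPROVED_DOMAINS = {"example.com"}


def recipient_allowed(address: str) -> bool:
    """Allow only well-formed addresses whose domain is exactly approved."""
    local, sep, domain = address.rpartition("@")
    return bool(local) and bool(sep) and domain.lower() in APPROVED_DOMAINS


def blocked_recipients(recipients: list[str]) -> list[str]:
    """Return the recipients that must be rejected before sending."""
    return [r for r in recipients if not recipient_allowed(r)]
```

Matching the whole domain, rather than checking whether the address merely contains an approved string, avoids lookalike domains such as `example.com.attacker.test` slipping through.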

Monitoring

Monitoring captures what the agent did and surfaces anomalies for review.

Policy: Forward AI agent execution logs to a SIEM with correlation rules for anomalous CRUD patterns and privilege escalation sequences. Counters Monitoring, where anomaly detection is not a default capability.
Configuration: Enable real-time alerting on agent-to-agent discovery events and cross-team invocations via platform audit configuration. Counters Inter-Agent, scored 4 / 4.
Engineering: Deploy behavioral analytics tooling for AI agent action sequences to detect deviation from established workflow baselines. Counters Monitoring, which requires supplemental tooling for anomaly detection.
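Cross-team invocation alerting can be prototyped offline against exported audit events before it is built into a SIEM rule. The sketch below assumes events arrive as (caller, callee) pairs and that a team mapping is maintained separately; both the data shape and the agent names are hypothetical.

```python
def cross_team_invocations(events: list[tuple],
                           team_of: dict[str, str]) -> list[tuple]:
    """Return invocations where caller and callee teams differ or are unknown."""
    flagged = []
    for caller, callee in events:
        caller_team = team_of.get(caller)
        callee_team = team_of.get(callee)
        # Unknown agents are flagged too, so unmapped agents are not ignored.
        if caller_team is None or callee_team is None or caller_team != callee_team:
            flagged.append((caller, callee))
    return flagged
```

For instance, with `team_of = {"triage": "itsm", "notifier": "itsm", "hr_lookup": "hr"}`, an invocation from `triage` to `hr_lookup` is flagged while `triage` to `notifier` is not.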

7 References

The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [1] [5]) refer to entries below, listed in citation order.

Selected Vulnerabilities

[1] CVE-2025-12420: Privilege escalation via user impersonation in ServiceNow AI Platform (CVSS 9.3); patched Oct 2025
[2] CVE-2026-0542: Remote code execution in ServiceNow AI Platform sandbox (CVSS 9.2); patched Feb 2026
[3] CVE-2025-11449: Reflected XSS via crafted link enables arbitrary script execution in authenticated user browsers within the ServiceNow AI Platform (CVSS 4.0); patched Oct 2025

Selected Research

[4] When AI Turns on Its Team: AppOmni demonstrates second-order prompt injection exploiting agent-to-agent discovery with protection enabled
[5] ServiceNow AI Agents Can Be Tricked: The Hacker News coverage of agent-to-agent prompt injection attack chain against ServiceNow Now Assist
[6] ServiceNow patches critical AI platform flaw: CyberScoop reveals that default team grouping and a hidden AI topic enabled unauthenticated admin account creation via the AI agent layer

Vendor Documentation

[7] Security on the ServiceNow AI Platform: ServiceNow Trust Center security page documenting platform controls and instance separation
[8] ServiceNow AI Platform Compliance: Compliance page documenting ISO 42001, ISO 27001, SOC 2 Type 2, and FedRAMP High
[9] AI Agent security implementation: Vendor documentation for AI Agent security controls including role masking and supervised execution
[10] Access control enhancements for AI Agents: Documentation on role masking as required configuration and supervised mode for agent tools

Other Sources

[11] ServiceNow AI vulnerability let anyone be admin: The Stack technical writeup detailing the CVE-2025-12420 exploitation chain via hidden AI topic
[12] Agentic AI Security for ServiceNow: AppOmni AgentGuard overview describing real-time monitoring needs for ServiceNow AI agent security