1 Key Risks
The most critical security risks an operator inherits when deploying this agent in its documented default configuration. Box AI presents a constrained default risk surface with moderate input exposure through multiple channels but limited blast radius due to the absence of code execution and deployment capabilities.
2 AIRQ Scores
The four headline scores quantify how exposed the agent is, how damaging a successful attack would be, and how much the agent’s own controls reduce that risk. Box AI scores reflect a constrained content-scoped agent whose trifecta-complete configuration elevates attack surface above the raw per-component values.
Box AI lands in the Tight Operators quadrant with attack surface at 4.80, blast radius at 2.50, and defense controls at 7, indicating a low-blast agent with moderate exposure and partial vendor safeguards.
Each axis measures a distinct dimension of the agent's risk posture: attack surface out of 10, blast radius out of 10, defense controls out of 15, and the AIRQ composite out of 15.
| Metric | Score | Comments |
|---|---|---|
| AIRQ Score | 2.5 | Low composite reflecting a constrained agent where the trifecta floor drives attack surface above the raw component values. |
| Blast Radius | 2.5 / 10 | Low blast radius reflecting no code execution, no deployment access, and content operations scoped to the Box cloud platform. |
| Attack Surface | 4.8 / 10 | Moderate attack surface driven primarily by the trifecta floor applied when all three conditions converge on the default configuration. |
| Defense Controls | 7 / 15 | Partial defense posture with FedRAMP High platform controls but gaps in documented input classification and AI output inspection on the default configuration. |
3 Attack Surface
Attack surfaces are the entry points and interaction patterns through which adversarial input can reach the agent’s reasoning loop and steer its behavior. Box AI's reasoning loop processes user prompts and enterprise file content through RAG, with external access via MCP and Copilot extending the input surface beyond direct user interaction.
Higher scores indicate surfaces where adversarial input reaches the reasoning loop through more channels or with fewer documented controls.
Each row scores one of ten canonical attack surfaces, with the Comments column citing the agent-specific evidence anchoring the score.
| Surface | Score | Comments |
|---|---|---|
| User Input | 2 / 4 | Box AI accepts prompts up to 10,000 characters through four input channels documented in the API reference, with no prompt injection classifier on the default configuration [7]. |
| External Data | 2 / 4 | The RAG pipeline ingests file content up to 1 MB from user-accessible Box storage including files shared by external collaborators, relevant to the data poisoning patterns documented in class-level frameworks [4]. |
| Memory | 1 / 4 | Session-level context persists within AI Home work sessions, but the vendor states that prompts and outputs are deleted when the document or application closes [5]. |
| Reasoning | 2 / 4 | Pro Mode enables agentic loops with multi-step internal reasoning checks constrained to the declared task scope within user-accessible Box content [14]. |
| Planning | 2 / 4 | Multi-step planning within user-supervised sessions is documented with no autonomous background execution, cron scheduling, or unsupervised plan execution [14]. |
| Tool Execution | 1 / 4 | Tools are limited to AI Q&A, text generation, metadata extraction, search, and scoped file operations with no shell, code execution, or browser automation [7]. |
| Orchestration | 2 / 4 | Multi-step task execution occurs within user-initiated sessions using agentic loops without background daemons or cross-session autonomous orchestration [14]. |
| Inter-Agent | 1 / 4 | The Box MCP server enables external AI agents to access content through OAuth-authenticated connections that enforce the authenticated user's permission level [8]. |
| Output Processing | 1 / 4 | Box AI produces text-based responses with optional source citations and no rich rendering, auto-URL-fetch, or embedded executable content in its output [5]. |
| Configuration | 2 / 4 | Configuration changes require admin console access with a layered permission model where OAuth scopes, token permissions, and user-level content permissions must all align [9]. |
The Lethal Trifecta is triggered when an agent processes untrusted content, accesses private data, and communicates externally in the same session — the three conditions that turn an isolated prompt injection into full-chain exfiltration. Box AI reads untrusted file content from external collaborators, accesses enterprise data covered by FedRAMP High and HIPAA compliance, and sends that content to external LLM providers for inference on its default configuration.
Box AI exhibits all three of these conditions in its documented default configuration:
- Untrusted input — External collaborator files and MCP-relayed requests introduce adversarial bytes into the RAG reasoning loop without a documented prompt injection classifier [4].
- Sensitive data — The agent accesses all files the authenticated user can view, including enterprise content covered by FedRAMP High authorization for sensitive government data [12].
- External egress — File content and prompts leave the vendor's infrastructure boundary during LLM processing, while shared link management provides additional exposure paths [5].
4 Blast Radius
The blast radius is what an attacker who controls the agent can reach — which systems they touch, which credentials they read, and which actions they take without operator approval. Box AI's blast radius is constrained by the absence of code execution and deployment capabilities, with the primary damage surface limited to content exposure within the Box platform.
Higher blast scores indicate factors where a compromised agent reaches further into the operator's environment or triggers more damaging autonomous actions.
Each row ties a blast factor to the specific capability or access scope the agent holds, with Comments citing the evidence anchor.
| Factor | Score | Comments |
|---|---|---|
| Code execution | 0 / 4 | No shell, script, or browser automation capability exists; the vendor handles all model inference server-side through its own LLM infrastructure [7]. |
| File system access | 2 / 4 | The agent reads user-accessible files and creates documents in specified Box folders via Doc Gen, with historical research showing that Box shared links have exposed enterprise data at scale [8][13]. |
| Network access | 2 / 4 | Content crosses the Box infrastructure boundary when sent to external LLM providers for inference, and shared link brute-forcing has historically exposed enterprise documents [3][5]. |
| Credential access | 1 / 4 | The agent operates with user-scoped OAuth tokens requiring the ai.readwrite scope, and the underlying SDK has had a timing-based HMAC verification vulnerability [2][9]. |
| Autonomous action | 1 / 4 | Each interaction is user-initiated with no scheduled tasks, cron, or daemon operation documented in the agent's architecture [14]. |
| Deployment access | 0 / 4 | Box AI cannot deploy infrastructure, modify cloud resources, or publish packages; all operations are confined to the Box content management platform [7]. |
5 Defense Controls
Defense controls are what the agent’s own architecture does to detect, contain, and report attacks before they reach the operator’s systems. Box AI's defense posture combines FedRAMP High platform-level controls with vendor-documented event logging, but lacks publicly documented prompt injection detection and AI-specific output filtering on the default configuration.
Higher defense scores indicate stronger vendor-implemented safeguards that reduce the operator's hardening burden on the default configuration.
Each component is scored based on what the vendor implements by default versus what requires operator configuration or separate licensing.
| Component | Score | Comments |
|---|---|---|
| Input Guardrails | 1 / 3 | The vendor logs AI security detection events indicating some input monitoring exists, but no documented prompt shield runs by default; Box Shield content scanning is separately licensed [1][11]. |
| Execution Isolation | 2 / 3 | The agent runs within the vendor's multi-tenant cloud infrastructure assessed at FedRAMP High with SOC 2 Type II audit and contractual data deletion guarantees from LLM providers [6][12]. |
| Action Controls | 1 / 3 | A seven-role permission model constrains AI operations to the user's existing content access level, but no per-action approval gate for individual AI outputs is documented [9]. |
| Output Guardrails | 1 / 3 | No DLP or credential redaction is documented for AI outputs; the vendor's content scanning add-on covers file operations but not AI-generated text on the default configuration [11]. |
| Monitoring | 2 / 3 | Enterprise Events API provides near-real-time streaming with dedicated AI event types, and documented SIEM integration enables forwarding to external security operations platforms [10][15]. |
6 Hardening Tips
Concrete actions an operator can take to reduce the risks reported above, grouped by which defense control each tip strengthens. Operators should prioritize breaking the trifecta by restricting untrusted input to the RAG pipeline and adding output inspection before AI-generated content reaches downstream consumers.
Input Guardrails
Input guardrails intercept adversarial content before it reaches the reasoning loop.
- Policy Require all external collaborator uploads to pass through content scanning before AI processing to reduce untrusted content entering the RAG pipeline.
- Configuration Configure Box AI access to exclude folders containing externally shared content by restricting AI enablement to internal-only folder scopes in Admin Console.
- Engineering Deploy a prompt injection classifier as a preprocessing layer on Box AI API calls to detect adversarial patterns before they reach the model.
Execution Isolation
Execution isolation contains what a compromised agent can do on the host.
- Policy Require that all Box AI processing use customer-managed encryption keys via Box KeySafe to maintain cryptographic control over data sent to LLM providers.
- Configuration Enable Box Shield malicious content detection for all content types to add a scanning layer before files enter the AI reasoning pipeline.
- Engineering Implement a proxy between Box AI API endpoints and client applications to enforce additional input validation and rate limiting.
Action Controls
Action controls govern which tools and actions the agent can invoke autonomously.
- Policy Restrict Box AI Studio agent creation to a named list of approved administrators to prevent unauthorized custom agent proliferation.
- Configuration Disable Doc Gen document creation for all user roles except those with explicit business justification to limit autonomous write operations.
- Engineering Build an approval workflow using Box Relay that gates AI-generated document outputs through a human reviewer before they reach production folders.
Output Guardrails
Output guardrails inspect what the agent sends to other systems and users.
- Policy Require manual review of all AI-generated documents before external sharing to prevent sensitive content from leaking through AI-authored files.
- Configuration Configure classification labels on AI output folders and enable classification-based policies to trigger alerts when AI-generated content is shared externally.
- Engineering Integrate a DLP scanner on Box webhook events to inspect AI-generated text outputs for PII, credentials, and regulated data before downstream access.
Monitoring
Monitoring captures what the agent did and surfaces anomalies for review.
- Policy Require all Box AI event streams to be forwarded to the SIEM within 24 hours of Box AI enablement to close the monitoring blind spot.
- Configuration Enable admin_logs_streaming for near-real-time forwarding of AI security detection and AI user request events to the security operations center.
- Engineering Build automated alerting rules that flag anomalous Box AI usage patterns such as bulk extraction requests or unusual file access breadth.
7 References
The evidence base behind every score and finding in the profile, grouped by source type so the reader can verify any claim. Numbers in brackets throughout the report (e.g. [7, 13]) refer to entries below, listed in citation order.
Selected Vulnerabilities
- Box Vulnerability Reporting Policy Box partners with HackerOne for vulnerability disclosure but does not maintain a public list of externally reported issues
- AIKIDO-2025-10367 box-node-sdk timing vulnerability Observable timing discrepancy in HMAC verification in Box Node SDK 1.37.1-3.8.0; fixed in 3.8.1
Selected Research
- Pandora's Box shared link exposure Adversis demonstrated brute-forceable shared link URLs exposing enterprise data across hundreds of Box customers
- OWASP Top 10 for LLM Applications 2025 OWASP framework covering prompt injection and excessive agency relevant to RAG-based copilot architectures
Vendor Documentation
- Box AI Trust AI governance program with NIST AI RMF alignment and prompt/output deletion policy for LLM provider interactions
- Box Trust Center FedRAMP High and SOC 2 Type II and ISO 27001 certifications with enterprise risk management program
- Box AI developer documentation API endpoints for AI ask and text_gen and extract with ai_agent override system for per-request model customization
- Box MCP server tools MCP tools for AI Q&A and extraction and file operations enabling external AI agents to access Box content via OAuth
- Box API security model Layered permission model where application scopes and token permissions and user-level content permissions must all align
- Box Enterprise Events API Near-real-time event streaming with AI_SECURITY_DETECTION and BOX_AI_USER_REQUEST event types for SIEM integration
- Box Shield threat detection Separately licensed add-on with AI-powered malicious content scanning and anomaly detection for data theft
- Box FedRAMP High authorization FedRAMP High authorization in 2025 after assessment of 421+ security controls covering sensitive government data
Other Sources
- Box shared link data exposure coverage SiliconANGLE reported terabytes of data from 90+ companies exposed through publicly shared Box links
- Box Agent general availability announcement Box Agent with multi-step reasoning and Secure RAG grounding and AI Home work sessions
- Cortex XSIAM Box integration Palo Alto Cortex XSIAM Box SIEM integration for enterprise events and Box Shield alerts