December 4, 2025 • Elena Vasquez, CTO & Co-Founder
Anomaly detection without a baseline is just random alerting. To know that an agent is behaving strangely, you first need a concrete model of what it looks like when it behaves normally. For AI agents, building that baseline is harder than for traditional services, because the variability in agent behavior is higher and the feedback loop from "looks strange" to "confirmed malicious" is less established.
A behavioral baseline for an AI agent captures four dimensions:
Call volume. How many tool calls does a normal instance of this agent type make per session? A support agent handling a customer inquiry might make 3–8 CRM reads and 1–2 ticketing writes per session. The 99th percentile might be 20 reads and 5 writes on a complex multi-issue ticket. An agent making 200 reads in a session is 10× the p99, far outside the normal distribution.
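As a minimal sketch of the volume check, assuming you keep the per-session read counts from the baseline period as a plain list (the data shape and function name here are illustrative, not a specific product API):

```python
import statistics

def volume_anomaly(read_count: int, baseline_reads: list[int],
                   multiplier: float = 3.0) -> bool:
    """Flag a session whose read count exceeds a multiple of the baseline p99.

    baseline_reads: per-session read counts collected during the
    monitoring-only period.
    """
    # quantiles(n=100) returns the 1st..99th percentiles; index 98 is p99.
    p99 = statistics.quantiles(baseline_reads, n=100)[98]
    return read_count > multiplier * p99
```

With a baseline clustered around 3–8 reads and a p99 near 20, a 200-read session trips the check while a 5-read session does not.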
Call sequence. Normal agents follow predictable call patterns. A document-processing agent reads a document, calls an extraction tool, writes to an output queue. A normal sequence might be: document-read, extraction-call, output-write, done. An anomalous sequence might be: document-read, CRM-list (unexpected), customer-record-read × 50 (unexpected volume), external-HTTP (policy denied). Each step in the sequence that deviates from the established pattern is a signal.
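A simple version of the sequence check flattens the observed sequences into an allowed-call set and reports anything outside it. The endpoint names below mirror the example above; the structure is a sketch, not a prescribed schema:

```python
# Sequences observed during the baseline period (hypothetical names).
NORMAL_SEQUENCES = {
    ("document-read", "extraction-call", "output-write"),
}
ALLOWED_CALLS = {call for seq in NORMAL_SEQUENCES for call in seq}

def sequence_anomalies(session_calls: list[str]) -> list[str]:
    """Return every call in the session that never appears in any
    baseline sequence; each one is a deviation signal."""
    return [c for c in session_calls if c not in ALLOWED_CALLS]
```

A richer implementation would also score out-of-order calls within known sequences, but the out-of-set check alone catches the CRM-list and external-HTTP deviations in the example.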
Resource access patterns. Which specific resources does a normal instance access? A support agent handling tickets for Customer A should only access Customer A's records. If the agent starts accessing records for Customer B, C, and D in the same session, that is a cross-customer access anomaly even if each individual read is within the permitted endpoint scope.
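Cross-customer access reduces to a set check over the session's record reads. Assuming each audit event carries a customer identifier (the tuple shape here is an assumption):

```python
def customers_touched(accessed_records: list[tuple[str, str]]) -> set[str]:
    """Given (customer_id, record_id) pairs from one session,
    return the distinct customers accessed."""
    return {customer for customer, _ in accessed_records}

def is_cross_customer(accessed_records: list[tuple[str, str]]) -> bool:
    # More than one customer in a single session is the anomaly,
    # even when every individual read is within endpoint scope.
    return len(customers_touched(accessed_records)) > 1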
Denial patterns. How often does a normal agent hit policy denials? A well-scoped agent operating within its job description should almost never trigger a policy denial. If an agent starts hitting denials on endpoints it never tried before, the agent is probing its permission boundaries — which is itself a behavioral signal.
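The probing signal can be sketched as a lookup against the set of endpoints the agent type has ever called, allowed or denied, during the baseline period (set contents and names are illustrative):

```python
def novel_denials(session_denials: list[str],
                  baseline_endpoints: set[str]) -> list[str]:
    """Denials on endpoints the agent type has never tried before.

    A denial on a known endpoint may be a misconfiguration; a denial on
    an unseen endpoint suggests the agent is probing its boundaries.
    """
    return [e for e in session_denials if e not in baseline_endpoints]
```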
The practical approach is to run a new agent type in a monitoring-only mode for the first two weeks of production deployment, collecting full audit logs without enforcing behavioral alerts. At the end of the period, compute the distribution for each behavioral dimension: p50, p95, p99 for call volume; the set of observed call sequences; the observed resource access scope.
Set alert thresholds at 3× the p99 for volume metrics, and define a sequence anomaly rule that fires if the agent calls any endpoint not in the observed sequence set more than twice in a session. These thresholds are starting points, not final values. Tune them over the following weeks based on the false positive rate.
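Putting the monitoring-period computation and the two starting rules together, a sketch might look like this. The session dict keys ("calls", "resources") are assumptions about your audit-log shape, not a fixed format:

```python
import statistics
from dataclasses import dataclass

@dataclass
class Baseline:
    p50: float
    p95: float
    p99: float
    sequences: set   # observed call sequences, as tuples
    resources: set   # observed resource access scope

def build_baseline(sessions: list[dict]) -> Baseline:
    """Compute the baseline from monitoring-only-period sessions."""
    volumes = [len(s["calls"]) for s in sessions]
    q = statistics.quantiles(volumes, n=100)  # 1st..99th percentiles
    return Baseline(
        p50=q[49], p95=q[94], p99=q[98],
        sequences={tuple(s["calls"]) for s in sessions},
        resources=set().union(*(s["resources"] for s in sessions)),
    )

def volume_alert(b: Baseline, session_calls: list[str]) -> bool:
    # Starting threshold: 3x the observed p99.
    return len(session_calls) > 3 * b.p99

def sequence_alert(b: Baseline, session_calls: list[str]) -> bool:
    # Fires when more than two calls fall outside the observed call set.
    seen = {c for seq in b.sequences for c in seq}
    return len([c for c in session_calls if c not in seen]) > 2
```

Both thresholds (the 3× multiplier and the more-than-twice rule) should be treated as tunables, adjusted against the observed false positive rate.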
Baselines go stale. If you deploy a new tool that support agents now use for every ticket, the baseline from before that deployment will generate constant false positives for the new tool calls. Baseline maintenance needs to be part of your deployment process: when agent capabilities change significantly, queue a baseline re-establishment period.
More subtly, baselines can drift gradually if agents are slowly developing anomalous patterns that do not individually trigger alerts. An agent type that makes 5 CRM reads per session on average in January, 8 in February, and 12 in March may still be within the evolving normal distribution at any snapshot, but the trend is abnormal. Trend analysis over baseline history — flagging when the 30-day moving average is growing faster than expected — catches gradual drift that snapshot comparisons miss.
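One way to sketch the trend check: compare the latest 30-day moving average of per-session volume against the previous window and flag growth beyond an expected rate. The 10% growth bound here is an illustrative assumption, not a recommended constant:

```python
def drift_alert(daily_means: list[float], window: int = 30,
                max_growth: float = 0.10) -> bool:
    """Flag gradual drift: the latest moving average has grown more than
    max_growth (fractional) relative to the previous window."""
    if len(daily_means) < 2 * window:
        return False  # not enough history to compare two windows
    prev = sum(daily_means[-2 * window:-window]) / window
    curr = sum(daily_means[-window:]) / window
    return curr > prev * (1 + max_growth)
```

The January-to-March example above (5, then 8, then 12 mean reads) passes every snapshot check but fails this window-over-window comparison.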
The goal is not to generate the most alerts — it is to generate alerts where the signal-to-noise ratio is high enough that responders take action. In practice, most teams running AI agents should start with three high-confidence alert rules: (1) any policy denial on an endpoint the agent type has never tried before, (2) call volume more than 5× the p99 for the agent type, and (3) cross-customer resource access within a single session. These three rules catch most known attack patterns with low false positive rates, and can be tuned from there as the detection program matures.
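The three starter rules can be evaluated together in one triage pass. Field names on `session` and `baseline` are hypothetical placeholders for whatever your audit pipeline produces:

```python
def triage_alerts(session: dict, baseline: dict) -> list[str]:
    """Evaluate the three high-confidence starter rules for one session."""
    alerts = []
    # Rule 1: policy denial on an endpoint never tried before.
    if any(e not in baseline["tried_endpoints"]
           for e in session["denied_endpoints"]):
        alerts.append("novel-denial")
    # Rule 2: call volume more than 5x the baseline p99.
    if session["call_count"] > 5 * baseline["p99"]:
        alerts.append("volume-5x-p99")
    # Rule 3: cross-customer resource access in a single session.
    if len(session["customers_accessed"]) > 1:
        alerts.append("cross-customer")
    return alerts
```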
Elena Vasquez is CTO & Co-Founder of Riptides. Questions: hello@riptidesio.com