Anomaly Detection & UEBA

Defensive AI

Statistical and ML-based approaches to spotting deviations from baseline behavior in users, entities, and network traffic.

Traditional security detection is built around known threats. A signature matches a file hash. A rule fires when a sequence of events matches a known attack pattern. These mechanisms work well against adversaries who use the same techniques across many targets, but they fail reliably against two categories of threat: insiders who are misusing legitimate access they already possess, and external attackers who have compromised valid credentials and are operating within normal-looking parameters. Anomaly detection takes a fundamentally different approach. Rather than asking "does this match a known bad pattern," it asks "does this deviate from established normal behavior?" The answer does not require knowing what the attacker is doing. It only requires knowing what the user or system normally does, and noticing when that changes.

What you'll learn

Key takeaways from this topic.

Explain the conceptual difference between signature-based detection and behavioral anomaly detection, and why both are necessary.
Describe how UEBA builds behavioral baselines and uses them to surface insider threats and compromised accounts.
Recognize the practical limitations of anomaly detection and understand how to use it alongside other controls.

At a glance

Fast mental model before you dive in.

Core concepts

Behavioral baselines
Risk scoring
Insider threat detection

Techniques

Statistical anomaly detection
Peer group analysis
ML-based pattern recognition

Tools

Microsoft Sentinel UEBA
Exabeam
Securonix

Core idea

The insight behind behavioral anomaly detection is that attackers, regardless of their sophistication, have to do things. They have to access files, authenticate to systems, move data, and communicate with external infrastructure. If they are using compromised credentials, they are doing those things as that user. And the user has a history. The user logs in at certain times from certain locations, accesses certain file shares, communicates with certain systems, generates certain amounts of network traffic. When the behavior changes significantly enough, the change is detectable even if the specific actions taken do not match any known attack signature.

This is why anomaly detection is particularly valuable against two threat categories that signature systems handle poorly. The first is the malicious insider: an employee who already has legitimate access to systems and data, whose actions generate the same kind of log entries as normal business activity. There is no malware to detect, no external C2 traffic, no unusual process execution. There is just a trusted employee doing things that look superficially like work, except that the scale, timing, or pattern has changed in ways that suggest something other than normal business activity. The second is the attacker who has compromised valid credentials, often through phishing or credential stuffing, and is now operating within the network as that user. Again, no signatures match, no rules fire, because the authentication was legitimate. What is detectable is that the stolen credentials are being used in ways the real user never used them.

UEBA (User and Entity Behavior Analytics) formalizes this approach into a systematic capability. It builds behavioral baselines for users, endpoints, applications, and other entities, tracks deviations from those baselines, and produces risk scores that help analysts focus on the people and systems whose behavior has changed most significantly.

How it works

The baseline-building process is the foundation. UEBA ingests data from identity systems, endpoint telemetry, network logs, cloud access logs, and application activity records, building a statistical model of what each user and entity normally does. This model is not static. It updates continuously as behavior changes over time, accounting for things like role changes, project cycles, and seasonal patterns. A salesperson who always accesses the customer database during business hours should not receive a high anomaly score on the Monday after quarter-end when they are pulling reports more intensively than usual. A well-tuned UEBA system understands contextual patterns and scores deviations relative to what is normal for that specific user at that specific time.

Anomaly detection algorithms range from relatively simple statistical methods to complex machine learning models. At the simpler end, a system tracks the mean and standard deviation of an entity's behavior over a rolling window and flags observations that fall more than a threshold number of standard deviations from the mean. At the more complex end, clustering algorithms group users with similar roles and behavioral profiles into peer groups, then flag a user whose behavior diverges significantly from their peer group even if it does not exceed any individual threshold. A finance analyst whose behavior suddenly resembles that of a system administrator should score high even if neither profile is intrinsically suspicious.
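The simpler end of that range can be sketched in a few lines. The following is a minimal illustration, not any vendor's implementation: the function name, the 14-observation window, the threshold of 3 standard deviations, and the download volumes are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def zscore_flags(values, window=14, threshold=3.0):
    """Flag each observation that lies more than `threshold` standard
    deviations from the mean of the preceding `window` observations."""
    flags = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history to judge yet
            continue
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is a deviation.
            flags.append(value != mu)
        else:
            flags.append(abs(value - mu) / sigma > threshold)
    return flags

# Hypothetical daily download volumes (MB) for one user.
daily_mb = [100, 110, 95, 105, 98, 102, 107, 99, 101, 104, 96, 103, 100, 108, 900]
flags = zscore_flags(daily_mb)
print(flags[-1])  # the 900 MB day is the only one with enough history to score
```

The first 14 days are never flagged here because the sketch refuses to score an observation until a full window of history exists, which is one simple way to avoid judging an entity before it has a baseline.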

Risk scoring synthesizes multiple weak signals into a coherent assessment. A single anomaly, such as a first-time login to a new system, generates a low risk score on its own, because first-time events are common. But if that same user also downloaded an unusually large volume of files in the same session, the login came from a location they have never used before, and the session occurred outside their normal working hours, the combination of weak signals produces a high risk score that surfaces the case for investigation.
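One simple way to see how weak signals compound is a noisy-OR combination, where each signal carries a small standalone weight and the combined score grows as signals stack up. This is an illustrative assumption, not how any particular UEBA product computes risk, and the signal names and weights below are hypothetical.

```python
def combined_risk(signal_weights):
    """Noisy-OR combination: each weight is a hypothetical 0..1 estimate of
    how strongly a signal suggests misuse on its own."""
    remaining = 1.0
    for w in signal_weights:
        remaining *= (1.0 - w)
    return 1.0 - remaining

# Illustrative weights, not taken from any product.
SIGNALS = {
    "first_login_to_system": 0.10,
    "large_file_download": 0.30,
    "new_location": 0.35,
    "outside_working_hours": 0.20,
}

single = combined_risk([SIGNALS["first_login_to_system"]])
session = combined_risk(SIGNALS.values())
print(round(single, 4), round(session, 4))
```

With these weights the first-time login alone scores about 0.10, while all four signals in one session score about 0.67, even though no individual signal exceeds 0.35.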

Real-world impact

The insider threat numbers justify the investment. Ponemon Institute's 2024 Cost of Insider Threats report put the average annual cost of insider-related incidents at $16.2 million per organization, with incidents involving malicious insiders averaging $701,500 per incident. By 2026, Ponemon and DTEX's joint Cost of Insider Risks Global Report found average annual costs had risen to $19.5 million per organization across the full insider threat category. More than 70% of insider threat incidents involve employees who have been with the organization for more than a year, which is precisely the profile where behavioral baselining is most effective: the system has had time to build a detailed picture of normal behavior.

The compromised-credential scenario is equally concrete. IBM's X-Force Threat Intelligence Index consistently identifies credential abuse as a top initial access vector, and the 2025 edition found that valid credentials were used in around 30% of all intrusions. Detecting that abuse through behavioral analysis rather than waiting for signature matches gives defenders the chance to contain the incident before the attacker has established deeper persistence.

Warning signs

Patterns worth investigating further.

A user account accesses systems, files, or data repositories that the account has never touched in its entire history, and the access does not coincide with any known business event like a role change or project assignment.
An entity's data transfer volume spikes significantly above its historical baseline, particularly when the destination is cloud storage, external email, or removable media.
Authentication events occur from geographic locations or at times of day that are inconsistent with the account's established usage pattern, without any corresponding travel or remote-work notification.

DEEP DIVE

Why signatures alone are insufficient

Signature-based detection is conceptually straightforward: collect a fingerprint of known-bad activity, compare all observed activity against that fingerprint, alert on matches. This model works extremely well when the attacker uses known malware, known infrastructure, or known exploitation techniques. It fails completely when none of those conditions are met.

The failure is not a technical limitation so much as a logical one. A signature can only match something it has seen before, or something sufficiently similar to what it has seen before. Novel malware families with no prior samples produce no signature matches. Living-off-the-land attacks that exclusively use legitimate Windows tools leave no malware footprint whatsoever. A compromised account logging in with valid credentials and doing things the account is authorized to do generates no detection events at all on a signature-only system.

This is not a hypothetical gap. The 2024 CrowdStrike Global Threat Report found that the average breakout time for eCrime actors, the time between initial access and lateral movement, had fallen to 62 minutes. The fastest observed breakout time was two minutes and seven seconds. In that window, a system that waits for signature matches to fire before generating any alert cannot respond in time even if the attacker is eventually detected. An anomaly detection system that has been watching baseline behavior for weeks or months may surface unusual activity from the compromised account within the first few minutes of unauthorized use, long before any lateral movement has occurred.

Behavioral baseline construction

Building a useful behavioral baseline is harder than it sounds, and the quality of the baseline determines the quality of the detection. The most common failure mode is baselining over too short a window, which captures recent behavior but not the full range of legitimate activity. An analyst who works a project that requires access to a server they normally never touch, then returns to their normal work, should have that server access incorporated into their baseline. If the baseline window is only 30 days and the project was 60 days ago, the next time they access that server it will score as anomalous even though it is entirely legitimate.
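The window-length problem is easy to reproduce. The sketch below assumes a baseline that is nothing more than the set of resources a user touched within a lookback window; the function name, server name, and dates are hypothetical.

```python
from datetime import date, timedelta

def is_first_seen(access_log, resource, today, window_days):
    """Return True if `resource` does not appear in the user's access
    history within the `window_days` days before `today`."""
    cutoff = today - timedelta(days=window_days)
    return not any(
        res == resource and cutoff <= day < today
        for day, res in access_log
    )

today = date(2025, 6, 30)
access_log = [
    (today - timedelta(days=60), "build-server-07"),  # hypothetical project access
    (today - timedelta(days=3), "file-share-a"),
]

flagged_30d = is_first_seen(access_log, "build-server-07", today, window_days=30)
flagged_90d = is_first_seen(access_log, "build-server-07", today, window_days=90)
print(flagged_30d, flagged_90d)
```

With a 30-day window the project server looks like a first-time access and is flagged; with a 90-day window the earlier project work is still part of the baseline and the access is treated as known.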

The second common failure mode is failing to account for peer groups. Individual baselines are powerful but incomplete, because some legitimate behavior is rare for any individual but common within a role. A newly hired security analyst might access threat intelligence platforms, malware sandboxes, and vulnerability scanning tools from day one, even though none of those appear in their personal baseline yet. If the system flags all of this as anomalous, the analyst's first month generates constant false-positive noise that undermines confidence in the system. Peer group baselines, where the comparison is against other security analysts rather than against the individual's own history, handle this correctly from the start.
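A peer group fallback can be sketched as follows, assuming the role label is the only grouping criterion and that an access counts as expected when at least half of a user's peers have made it. The user names, resource names, and the 0.5 share are illustrative assumptions; real systems typically derive peer groups from richer data.

```python
def access_is_expected(user, resource, user_resources, user_role, min_peer_share=0.5):
    """Treat an access as expected if the user has done it before, or if at
    least `min_peer_share` of peers in the same role have."""
    if resource in user_resources.get(user, set()):
        return True
    role = user_role[user]
    peers = [u for u, r in user_role.items() if r == role and u != user]
    if not peers:
        return False  # no peers to compare against
    share = sum(resource in user_resources.get(p, set()) for p in peers) / len(peers)
    return share >= min_peer_share

user_role = {
    "alice": "sec_analyst", "bob": "sec_analyst", "carol": "sec_analyst",
    "dave": "finance", "erin": "finance",
}
user_resources = {
    "alice": {"malware-sandbox", "threat-intel"},
    "bob": {"malware-sandbox", "vuln-scanner"},
    "carol": set(),  # new hire with no personal history yet
    "dave": {"ledger"},
    "erin": {"ledger"},
}

new_analyst_ok = access_is_expected("carol", "malware-sandbox", user_resources, user_role)
finance_user_ok = access_is_expected("dave", "malware-sandbox", user_resources, user_role)
print(new_analyst_ok, finance_user_ok)
```

The new analyst's sandbox access is accepted because both of her peers use it, while the same access by a finance user is not, since nobody in that peer group has touched the sandbox.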

The third factor is seasonal and cyclical behavior. Finance teams behave very differently during reporting periods than at other times. Sales teams work differently around quarter-end. IT teams behave differently when running a major infrastructure project. A UEBA system that treats any deviation from a flat statistical mean as anomalous will generate elevated noise during every normal business cycle. Good implementations learn these cycles and adjust expectations accordingly.
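The effect of ignoring business cycles can be shown by scoring the same observation against two baselines: a flat one built from recent ordinary days, and a phase-aware one built from earlier quarter-end days. The numbers below are invented for illustration, and labelling days by cycle phase is an assumption about how such a system might be organized.

```python
from statistics import mean, stdev

def zscore(value, history):
    """Standard score of `value` against `history` (0.0 if history is flat)."""
    sigma = stdev(history)
    return 0.0 if sigma == 0 else (value - mean(history)) / sigma

# Hypothetical daily report pulls for one salesperson.
recent_ordinary_days = [4, 5, 6, 5, 4, 6, 5, 5, 5, 4, 6, 5, 5, 6, 4, 5, 5, 6, 4, 5]
prior_quarter_end_days = [38, 42, 40, 45, 41]

today_pulls = 43  # today falls in the quarter-end period

flat_z = zscore(today_pulls, recent_ordinary_days)
phase_z = zscore(today_pulls, prior_quarter_end_days)
print(round(flat_z, 1), round(phase_z, 1))
```

Against the flat baseline the quarter-end workload looks like an extreme outlier; against previous quarter-end days it is well within the normal range.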

Risk scoring methodology

Risk scoring is the mechanism that converts behavioral anomalies into analyst-actionable intelligence. The simplest approach is a binary threshold: if a metric exceeds N standard deviations from the mean, generate an alert. This is easy to implement and easy to explain, but it misses the most important threat patterns, which typically involve multiple weak signals rather than a single large deviation.

The more sophisticated approach assigns a risk contribution to each anomalous observation, with the contribution weighted by how significant the deviation is, how rarely that type of deviation occurs in the population, and how much the deviation aligns with known threat actor behavior. These weighted contributions aggregate into a risk score for the entity over a rolling time window. The score rises as anomalies accumulate and falls as behavior returns to baseline. The analyst sees not a binary alert but a prioritized queue of entities with elevated risk scores, each accompanied by the specific anomalies that contributed to the score.
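A minimal sketch of this weighted, windowed scoring follows. It assumes each of the three factors has already been normalized to a 0..1 value and that contributions are simply multiplied and summed over a 7-day window; the class, field names, scale, and example anomalies are all hypothetical rather than drawn from any product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Anomaly:
    when: datetime
    description: str
    deviation: float         # how far from baseline, scaled 0..1
    rarity: float            # how rare this deviation is in the population, 0..1
    threat_alignment: float  # resemblance to known attacker behavior, 0..1

    @property
    def contribution(self):
        return 100 * self.deviation * self.rarity * self.threat_alignment

def entity_risk(anomalies, now, window=timedelta(days=7)):
    """Sum the contributions of anomalies inside the rolling window and
    return the score with its supporting evidence, largest first."""
    recent = [a for a in anomalies if now - window <= a.when <= now]
    recent.sort(key=lambda a: a.contribution, reverse=True)
    return sum(a.contribution for a in recent), recent

now = datetime(2025, 6, 30, 12, 0)
anomalies = [
    Anomaly(now - timedelta(days=1), "bulk file download", 0.8, 0.5, 0.5),
    Anomaly(now - timedelta(days=2), "login to out-of-scope system", 0.5, 0.4, 0.5),
    Anomaly(now - timedelta(days=20), "off-hours login", 0.5, 0.4, 0.5),
]

score, evidence = entity_risk(anomalies, now)
later_score, _ = entity_risk(anomalies, now + timedelta(days=10))
print(round(score, 1), [a.description for a in evidence], later_score)
```

The 20-day-old anomaly has already aged out of the window, and ten days later with no new anomalies the score falls back to zero, which mirrors the rise-and-fall behavior described above. Returning the evidence alongside the score is what lets an analyst see why an entity is in the queue.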

Microsoft Sentinel UEBA illustrates the dual-score approach well. Each anomalous event receives both an investigation priority score and an anomaly score, and the two are deliberately kept separate. A first-time Azure operation has a high investigation priority because it is a first-time event worth understanding, but a low anomaly score because first-time events for a user transitioning to the cloud are extremely common. An unusual lateral authentication from a service account has a low investigation priority when taken alone, but a high anomaly score because service accounts almost never authenticate laterally. By surfacing both dimensions, the system helps analysts understand whether they are looking at something that simply needs to be understood or something that is genuinely statistically unusual.

UEBA in practice: insider threat investigation

An insider threat investigation driven by UEBA typically proceeds through several stages. The first alert might be relatively low-confidence: on a Friday afternoon, a user downloaded a larger volume of files than their baseline suggested was normal. That alone does not warrant immediate escalation. But the UEBA system continues monitoring, and over the following week it accumulates additional observations: the user is authenticating to systems outside their normal scope, sending emails to personal email accounts, and accessing HR-related files they have no obvious business reason to view. Each event is explainable individually. Together, they form a pattern that rises above the investigation threshold.

The analyst who receives the case does not start from scratch. The UEBA system has assembled the timeline of anomalous events, highlighted the specific deviations from baseline, and cross-referenced the behavior against the user's employment status (in this case, HR system data shows they gave notice two weeks ago). The analyst can evaluate whether the observed behavior constitutes a policy violation or a security incident, engage HR and legal as appropriate, and take the necessary technical containment actions, all informed by a complete behavioral history that would have taken hours to reconstruct manually.

This is the UEBA value proposition stated plainly: it does not catch an insider the moment they do something wrong. It builds the evidentiary record that makes the investigation both possible and defensible, and it surfaces the case at a threshold where human investigation can still make a difference, ideally before the most serious damage is done.

Limitations and failure modes

UEBA is a powerful tool with genuine limitations that practitioners need to understand clearly. The first and most important is the dependency on data quality. UEBA is only as good as the logs it ingests. If authentication logs are incomplete, if cloud access events are not forwarded to the SIEM, or if endpoint telemetry is missing from a significant portion of the fleet, the behavioral baseline is built on a partial picture. Gaps in the data mean gaps in the baseline, and gaps in the baseline mean that the attacker can operate in those dark spaces without generating anomaly scores.
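A basic coverage check makes these dark spaces visible before they undermine the baseline. The sketch below assumes an asset inventory is available as a list of host names and that each log source can report which hosts it has heard from recently; the function, source names, and host names are hypothetical.

```python
def coverage_gaps(expected_hosts, hosts_seen_by_source):
    """For each log source, return the share of expected hosts that are
    reporting and the list of hosts that are missing."""
    if not expected_hosts:
        raise ValueError("expected_hosts must not be empty")
    expected = set(expected_hosts)
    report = {}
    for source, seen in hosts_seen_by_source.items():
        missing = sorted(expected - set(seen))
        report[source] = {
            "coverage": 1 - len(missing) / len(expected),
            "missing": missing,
        }
    return report

expected_hosts = ["host-a", "host-b", "host-c", "host-d"]
hosts_seen_by_source = {
    "auth_logs": ["host-a", "host-b", "host-c", "host-d"],
    "endpoint_telemetry": ["host-a", "host-c"],
}

report = coverage_gaps(expected_hosts, hosts_seen_by_source)
print(report["endpoint_telemetry"])
```

Here authentication logs cover the whole fleet, but endpoint telemetry covers only half of it, so any baseline built on endpoint data says nothing about the two missing hosts.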

The second limitation is that the baseline adapts to whatever behavior it observes, including malicious behavior. A sophisticated insider who operates just inside their normal behavioral boundaries over a long period can gradually shift their baseline toward the malicious activity. A data thief who starts by downloading slightly more than usual, waits for that to become the new baseline, then downloads slightly more again, can avoid anomaly detection through incremental normalization. This is why UEBA is a complement to data loss prevention controls, access reviews, and personnel risk indicators, not a replacement for them.
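Incremental normalization can be demonstrated with a small simulation. It assumes a rolling 30-day z-score detector with a threshold of 3 standard deviations and a synthetic user whose daily download volume wobbles around 100 MB; the growth rate, wobble pattern, and all other numbers are invented for the example.

```python
from statistics import mean, stdev

def first_flag(values, window=30, threshold=3.0):
    """Index of the first value more than `threshold` standard deviations
    above the mean of the preceding `window` values, or None."""
    for i in range(window, len(values)):
        history = values[i - window:i]
        sigma = stdev(history)
        if sigma > 0 and (values[i] - mean(history)) / sigma > threshold:
            return i
    return None

# Hypothetical daily download volume (MB): about 100 MB with a repeating wobble.
wobble = [0.90, 1.00, 1.10, 1.05, 0.95]
normal = [100 * wobble[d % 5] for d in range(30)]

# Sudden theft: volume triples overnight.
sudden = normal + [300 * wobble[d % 5] for d in range(120)]

# Gradual theft: volume grows by 0.5% per day for 120 days.
gradual = normal + [100 * (1.005 ** (d + 1)) * wobble[d % 5] for d in range(120)]

print(first_flag(sudden), first_flag(gradual), round(gradual[-1], 1))
```

In this sketch the sudden jump is flagged on its first day, while the slow ramp ends well above the original daily level without ever crossing the threshold, because the rolling window keeps absorbing the growth into the baseline. A control that compares against a fixed reference, such as an absolute volume limit or a periodic access review, is not fooled in the same way.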

The third limitation is the challenge of new users and accounts. There is no meaningful baseline for a user in their first weeks at an organization. Everything they do is genuinely new behavior, which means the system cannot reliably distinguish between legitimate first-time access and unauthorized activity. Most UEBA implementations handle this with a supervised grace period for new accounts, applying lighter scrutiny until a baseline has formed, and relying on other controls during that window.