Detection Engineering Program - Part 4 - Detection Testing & Validation

brencronin
Apr 19

Detection Testing & Validation

Detection Measurement Concepts

Understanding how to evaluate a detection's performance is essential for tuning rules, prioritizing engineering efforts, and managing SOC efficiency. Key measurement concepts include:

1. Precision (a.k.a. Positive Predictive Value / Confidence): This measures how many of the triggered alerts are actually true positives.

Formula: True Positives / (True Positives + False Positives)
High precision means few false positives: alerts that fire are likely real.
Often referred to as confidence in a detection.
Noisy detections (those where a large share of alerts are false positives) have low precision and low confidence.

2. Recall (a.k.a. Detection Coverage): This measures how many actual malicious events the detection successfully identifies.

Formula: True Positives / (True Positives + False Negatives)
High recall means few threats go undetected.
There is often a tradeoff between precision and recall: tightening a detection to reduce noise can cause it to miss true positives.
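The two formulas above can be sketched in a few lines of Python. The counts below are hypothetical triage numbers for a single rule, not data from any real SOC, and the function names are chosen here only for illustration.

```python
def precision(tp: int, fp: int) -> float:
    """Share of fired alerts that were real: TP / (TP + FP)."""
    fired = tp + fp
    return tp / fired if fired else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of real malicious events that were caught: TP / (TP + FN)."""
    actual = tp + fn
    return tp / actual if actual else 0.0


# Hypothetical counts for one rule over a review period (assumed values).
tp, fp, fn = 18, 42, 2

print(f"precision={precision(tp, fp):.2f}")  # 18 / 60 -> precision=0.30
print(f"recall={recall(tp, fn):.2f}")        # 18 / 20 -> recall=0.90
```

In this made-up case the rule catches most of the malicious activity (high recall), but most of its alerts are noise (low precision). Note that false negatives can only be counted when some other source, such as a red team exercise or an incident review, reveals what was missed.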

3. Robustness (a.k.a. Detection Durability or Breadth): Robustness describes how resilient and wide-reaching a detection is.

A robust detection can withstand evasion attempts and catch varied forms of a technique or behavior.
It reflects breadth of coverage: how many different variations of a technique it can detect.
Increasing robustness often reduces precision, as broader logic may catch more benign activity.
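A small sketch can make the robustness versus precision tension concrete. Everything here is an assumption for illustration: the sample command lines, the labels, and both rule functions are hypothetical and are not taken from any real rule set.

```python
# Hypothetical labeled events: (command_line, is_malicious)
events = [
    ("powershell.exe -EncodedCommand SQBFAFgA", True),
    ("powershell.exe -enc SQBFAFgA", True),
    ("powershell.exe -e SQBFAFgA", True),
    ("powershell.exe -ExecutionPolicy Bypass -File backup.ps1", False),
    ("powershell.exe -File report.ps1", False),
]


def narrow_rule(cmd: str) -> bool:
    """Matches only the fully spelled-out flag (easy to evade)."""
    return "-encodedcommand" in cmd.lower()


def broad_rule(cmd: str) -> bool:
    """Matches any argument starting with '-e' (covers abbreviated variants)."""
    return any(tok.startswith("-e") for tok in cmd.lower().split())


def score(rule, labeled_events):
    """Return (precision, recall) of a rule over labeled events."""
    tp = sum(1 for cmd, bad in labeled_events if bad and rule(cmd))
    fp = sum(1 for cmd, bad in labeled_events if not bad and rule(cmd))
    fn = sum(1 for cmd, bad in labeled_events if bad and not rule(cmd))
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r


print("narrow:", score(narrow_rule, events))
print("broad: ", score(broad_rule, events))
```

On this toy data the narrow rule is perfectly precise but catches only one of the three variants, while the broad rule catches all three at the cost of also firing on the benign `-ExecutionPolicy` command.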

4. Severity: This refers to the impact level of what’s being detected.

5. Detection Efficacy (a.k.a. Detection Value or Worth): This is a holistic measure of how effective a detection is, factoring in both its utility and its cost.

Balances precision, recall, robustness, and severity against the resources needed to build, tune, and triage it.
A strong detection catches meaningful threats, minimizes false positives, and doesn’t overwhelm your analysts or budget.
Even highly accurate detections can become unsustainable if they are expensive to maintain or too complex to operate.
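One possible way to put a number on the cost side is analyst minutes spent per true positive. This is an illustrative metric assumed for this sketch, not a standard formula or the one used in the linked article, and all field names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DetectionStats:
    """Hypothetical per-rule numbers for one review period."""
    name: str
    true_positives: int
    false_positives: int
    triage_minutes_per_alert: float
    maintenance_minutes: float  # tuning and upkeep during the period

    def minutes_per_true_positive(self) -> float:
        """Total analyst and engineering minutes divided by real findings."""
        alerts = self.true_positives + self.false_positives
        total = alerts * self.triage_minutes_per_alert + self.maintenance_minutes
        if self.true_positives == 0:
            return float("inf")  # all cost, no value
        return total / self.true_positives


noisy = DetectionStats("noisy_rule", 5, 95, 10.0, 60.0)
quiet = DetectionStats("quiet_rule", 4, 1, 10.0, 30.0)

print(noisy.name, noisy.minutes_per_true_positive())  # (100*10 + 60) / 5 = 212.0
print(quiet.name, quiet.minutes_per_true_positive())  # (5*10 + 30) / 4 = 20.0
```

The noisy rule finds one more true positive than the quiet rule but costs roughly ten times as much per finding. A fuller efficacy measure would also weigh severity and robustness, which this sketch leaves out.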

A “good” detection isn’t just accurate; it’s also actionable, efficient, and sustainable.

Detection testing and validation means continuously testing detections with adversary emulation tools (e.g., MITRE CALDERA, Atomic Red Team) to reduce false positives and false negatives.

False positives (FPs):

https://www.rapid7.com/blog/post/2020/05/15/moving-toward-a-better-signature-metric-in-socs-detection-efficacy/

Testing detections:

https://detect.fyi/attackrulemap-bridging-open-source-detections-and-atomic-tests-93420708a70f

Each alert or detection strategy must have true positive validation: a testing process designed to prove that true positives are actually detected.

True positive validation relies on generating the scenario that the detection strategy is designed to detect, and then confirming in the tool that the detection fired.
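The generate-then-confirm loop might be automated along these lines. This is a minimal sketch: `trigger` and `fetch_alert_rule_ids` are hypothetical callables that you would wire to your own emulation tool and to your SIEM or EDR alert query. No real product API is assumed, and the demo uses in-memory stand-ins.

```python
import time
from typing import Callable, Iterable


def validate_true_positive(
    trigger: Callable[[], None],
    fetch_alert_rule_ids: Callable[[], Iterable[str]],
    rule_id: str,
    timeout_s: float = 60.0,
    poll_s: float = 5.0,
) -> bool:
    """Run the test scenario, then poll until the expected rule fires or time runs out."""
    trigger()  # generate the activity the detection is meant to catch
    deadline = time.monotonic() + timeout_s
    while True:
        if rule_id in set(fetch_alert_rule_ids()):
            return True  # the detection fired: true positive validated
        if time.monotonic() >= deadline:
            return False  # no alert in time: investigate logging or rule logic
        time.sleep(poll_s)


# Demo with in-memory stand-ins (assumptions, not a real alert pipeline).
fake_alerts: list[str] = []


def fake_trigger() -> None:
    fake_alerts.append("rule-123")  # pretend the scenario raised this alert


ok = validate_true_positive(
    fake_trigger, lambda: fake_alerts, "rule-123", timeout_s=1.0, poll_s=0.1
)
print("validated:", ok)
```

Polling with a timeout matters because alert pipelines usually have ingestion delay. A `False` result means the alert did not appear within the window, which may point to a logging gap as easily as to a broken rule.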

To perform positive validation: