Nobody Reads the Diagram
I've sat through more threat modeling sessions than I care to count, and here's what actually happens: someone shares their screen, pulls up a data flow diagram with thirty boxes and sixty arrows, and then proceeds to walk through it like they're reading a phone book. Thirty minutes later, everyone has agreed that the admin panel should require authentication and the database connection should be encrypted in transit. The "threat model" gets saved to Confluence, never to be opened again.
That is not threat modeling. That is a documentation exercise with better marketing.
The diagram isn't the point. The diagram is a prop — a shared artifact that forces people to articulate what they actually built versus what they thought they built. The real work is the argument that happens around it. The confusion, the corrections, the moment someone says "wait, you're sending that where?" — that's the threat model. Adam Shostack, who literally wrote the book on this (and I mean that literally — Threat Modeling: Designing for Security, 2014), frames it around four questions: What are we building? What can go wrong? What are we going to do about it? Did we do a good enough job? Most teams are living on questions three and four while completely skipping the first two. They're prescribing treatment before finishing the diagnosis.
The "What Are We Building" Problem Is Worse Than You Think
I worked with a team that had a mature security program — bug bounty, quarterly pentests, a dedicated AppSec engineer. They came to me wanting to formalize their threat modeling process. First thing I asked: walk me through the trust boundaries in your authentication flow. Silence. Not because they didn't understand trust boundaries conceptually, but because nobody had ever sat down and mapped where their identity provider's token ended and their internal session management began. Three different engineers on the same team had three different mental models of how their own system worked.
This is where the choice of diagramming approach actually matters, and it's not a trivial decision. DFDs (data flow diagrams) are the traditional tool — they're what Microsoft's Threat Modeling Tool was built around, and they force you to think in terms of data stores, processes, external entities, and data flows. That structure matters because trust boundaries cut across those elements in ways that are immediately visible. But DFDs are terrible at capturing temporal behavior — the sequence of events that create a window for attack. A race condition in an OAuth token exchange won't show up cleanly in a DFD. For those scenarios, sequence diagrams are genuinely better. They show state transitions and timing dependencies that DFDs flatten into a single arrow.
The tooling argument — Microsoft Threat Modeling Tool versus OWASP Threat Dragon versus Threagile — mostly misses the point, but it's worth addressing briefly because teams waste real time on it. Threat Dragon is open-source and integrates decently into a pipeline; Threagile generates risk findings from YAML-defined architectures and is the most interesting option for teams doing infrastructure-as-code threat modeling because you can version it alongside your Terraform. The Microsoft tool is fine if your stack is MSFT-heavy and you want STRIDE categories surfaced automatically. None of them are going to do the thinking for you. The tool is a whiteboard with better export functionality.
STRIDE Is Not Your Enemy (But You're Probably Using It Wrong)
There's a backlash against STRIDE in some corners of the security community, and I partially get it. STRIDE — Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege — is a threat categorization mnemonic, not a methodology. Teams treat it like a checklist. Go through each category, confirm it exists as a concern, check the box, done. That's not analysis, that's theater.
But here's the thing: the categories are actually well-chosen. They map loosely to the security properties you care about — authentication, integrity, non-repudiation, confidentiality, availability, authorization. When you apply STRIDE per-element on a DFD — meaning you evaluate each threat category against each process, data store, data flow, and external entity separately — you get substantive coverage. The problem isn't STRIDE, it's that people apply it to the diagram as a whole instead of systematically working through it element by element. Microsoft's own guidance on this is actually pretty solid if you go back to the original Loren Kohnfelder and Praerit Garg paper from 1999.
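The per-element discipline is mechanical enough to sketch in a few lines. This is a minimal illustration (the element names are invented), with the category-to-element mapping following Microsoft's published STRIDE-per-element chart — it's the systematic enumeration that matters, not the code:

```python
# STRIDE-per-element: evaluate each threat category against each DFD
# element type separately, instead of eyeballing the diagram as a whole.
# The APPLICABLE mapping follows Microsoft's STRIDE-per-element chart.
STRIDE = {
    "S": "Spoofing", "T": "Tampering", "R": "Repudiation",
    "I": "Information Disclosure", "D": "Denial of Service",
    "E": "Elevation of Privilege",
}

APPLICABLE = {
    "external_entity": "SR",      # spoofing, repudiation
    "process":         "STRIDE",  # all six
    "data_store":      "TRID",
    "data_flow":       "TID",
}

def enumerate_threats(elements):
    """Yield (element, category) pairs to work through one by one."""
    for name, kind in elements:
        for letter in APPLICABLE[kind]:
            yield name, STRIDE[letter]

# A toy DFD for a login flow (names are illustrative):
dfd = [
    ("browser",       "external_entity"),
    ("auth service",  "process"),
    ("session db",    "data_store"),
    ("login request", "data_flow"),
]

for element, threat in enumerate_threats(dfd):
    print(f"{element}: {threat}")
```

Even this toy four-element diagram produces fifteen element/category pairs to argue about — which is the point. Coverage comes from working the grid, not from staring at the picture.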
PASTA (Process for Attack Simulation and Threat Analysis) takes a completely different approach — it's risk-centric and attack simulation-focused, seven stages that move from defining business objectives through attack modeling to residual risk quantification. It's more rigorous and significantly more resource-intensive. Realistically, it's appropriate for high-value systems where you're trying to connect threat likelihood to actual business impact. I've seen it done well exactly twice. Most organizations that claim they're doing PASTA are doing an expensive version of the STRIDE checklist problem.
LINDDUN is the one that doesn't get enough attention. It's specifically designed for privacy threat modeling — Linkability, Identifiability, Non-repudiation, Detectability, Disclosure of information, Unawareness, Non-compliance. If you're building anything that handles personal data and you're not running LINDDUN alongside STRIDE, you have a gap. GDPR isn't going to care that your authentication is solid if your logging infrastructure creates a behavioral profile of every user action. LINDDUN was developed by KU Leuven researchers and there's solid academic backing for it; it's not just someone's blog post that got popular.
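To make the logging example concrete: one LINDDUN-style control is pseudonymizing direct identifiers before log events leave your trust boundary, so the analytics vendor can't link actions back to a person (the Linkability and Identifiability categories). This is a minimal sketch — the field names are hypothetical, and salted hashing is pseudonymization, not anonymization:

```python
# Strip direct identifiers from log events before shipping them to a
# third party, so the vendor can't build a per-user behavioral profile.
# Field names are hypothetical; salted hashing is pseudonymization,
# NOT anonymization -- whoever holds the salt can still re-link.
import hashlib

IDENTIFYING_FIELDS = {"user_id", "email", "ip_address"}

def pseudonymize(event: dict, salt: str) -> dict:
    """Replace identifying fields with salted hashes; keep the rest."""
    out = {}
    for key, value in event.items():
        if key in IDENTIFYING_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            out[key] = digest[:12]  # stable pseudonym, raw value dropped
        else:
            out[key] = value
    return out

event = {"user_id": "u-829", "email": "a@example.com", "action": "login"}
print(pseudonymize(event, salt="rotate-me-quarterly"))
```

The pseudonyms are stable per salt, so the vendor can still count distinct users without knowing who they are — and rotating the salt breaks longitudinal profiles on your schedule, not theirs.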
The Annual Threat Model Is a Liability
Here's a scenario that plays out constantly: a team does a threat model during the design phase of a major feature. It's actually pretty good — they found real issues, they documented mitigations, someone filed tickets. The document gets filed under "Security" in Confluence. Eighteen months later, the architecture has been refactored twice, a new microservice was spun up to handle the payment processing, the original DFD is completely wrong, and nobody updated the threat model. When a security incident occurs, the threat model is cited as evidence that the team "had controls in place." It describes a system that no longer exists.
The shift to treating threat models as living documents that live in version control — not as PDFs in a wiki — is one of the more meaningful operational changes I've seen teams make. Threagile's approach of defining your threat model in YAML that gets committed alongside your application code is the right instinct here. When your IaC changes — when you add an S3 bucket, when you modify a security group rule, when you introduce a new service-to-service call — the threat model should be part of that review. Not separately. Not in a different ticket. Same PR.
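To make the "same PR" idea concrete, here's roughly what a versioned threat model entry looks like. This fragment is loosely modeled on Threagile's YAML format but is illustrative only — the field names are approximations, not Threagile's exact schema:

```yaml
# threat-model.yaml -- lives next to the Terraform it describes.
# Illustrative fragment loosely modeled on Threagile's format;
# field names are approximate, not the tool's actual schema.
title: payments-service
technical_assets:
  payment-api:
    type: process
    communication_links:
      to-ledger-db:
        target: ledger-db
        protocol: https
  ledger-db:
    type: datastore
trust_boundaries:
  payment-vpc:
    technical_assets_inside:
      - payment-api
      - ledger-db
# Review rule of thumb: a PR that adds an S3 bucket or opens a
# security-group rule must also touch this file.
```

The value isn't the format — it's that `git blame` on this file tells you when a trust boundary changed and who reviewed it, which a Confluence PDF never will.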
Some organizations are running automated checks against IaC using tools like Checkov or tfsec, and while those are valuable, they're checking for known misconfigurations against a ruleset. They don't understand your architecture. They don't know that you opened that port for a specific integration, or whether that integration has compensating controls. Automated scanning and threat modeling are complementary, not substitutes for each other.
Who Should Actually Be in the Room
Segment published something a while back about their approach to developer-led threat modeling, and the core insight was that the people closest to the code need to own the threat model — not have it done to them by a security team. This is directionally correct. When threat modeling is something AppSec does and then hands to developers as a list of findings, you get compliance behavior. Developers implement the specific mitigations on the list and nothing else. When developers are doing the threat modeling themselves, they build intuition about attack surface. They start asking "what's the trust boundary here?" during design discussions before anyone from security is in the room.
The security team's role in that model shifts to facilitation, calibration, and coverage verification — making sure the developers aren't missing categories, making sure findings are tied to an actual threat library rather than vague concerns. CAPEC (Common Attack Pattern Enumeration and Classification) and ATT&CK are both useful for this. ATT&CK is best known for adversary simulation and detection engineering, but it's underutilized in threat modeling, where it can tie a "spoofing" finding to a specific technique (T1078 — Valid Accounts, for example) and from there to realistic attacker behavior. That connection between abstract threat category and concrete attack technique is what separates a useful threat model from an academic exercise.
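That category-to-technique step can live in a shared lookup that the security team maintains and developers consume. A minimal sketch — the one-technique-per-category mapping below is an illustrative pick, not an official STRIDE-to-ATT&CK crosswalk (all technique IDs are real ATT&CK entries):

```python
# Attach a concrete ATT&CK technique to an abstract STRIDE finding so
# "spoofing" points at real attacker behavior. One illustrative pick
# per category -- a real threat library would map each to several.
STRIDE_TO_ATTACK = {
    "Spoofing":               ("T1078", "Valid Accounts"),
    "Tampering":              ("T1565", "Data Manipulation"),
    "Repudiation":            ("T1070", "Indicator Removal"),
    "Information Disclosure": ("T1040", "Network Sniffing"),
    "Denial of Service":      ("T1498", "Network Denial of Service"),
    "Elevation of Privilege": ("T1068", "Exploitation for Privilege Escalation"),
}

def concretize(finding: dict) -> dict:
    """Enrich a raw threat-model finding with a candidate technique."""
    tid, name = STRIDE_TO_ATTACK[finding["category"]]
    return {**finding, "attack_technique": f"{tid} ({name})"}

finding = {"element": "auth service", "category": "Spoofing"}
print(concretize(finding)["attack_technique"])  # T1078 (Valid Accounts)
```

Once a finding carries a technique ID, it also plugs into the detection side of the house — the SOC can answer "do we even have coverage for T1078?" instead of guessing what "spoofing" meant.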
There's also a real distinction between a threat model and a security architecture review that teams conflate constantly. A security architecture review is evaluating whether your architecture meets security requirements — it's compliance-oriented, it checks controls, it produces a yes/no against a framework. A threat model is adversarially focused — it's asking what a motivated attacker would do. They should both exist, they serve different purposes, and treating them as the same activity produces a hybrid that does neither well.
Abuse Cases vs. Misuse Cases (This Distinction Actually Matters)
Most teams write use cases for their features. Occasionally, teams will write abuse cases — descriptions of how a legitimate feature can be used for illegitimate purposes. Credential stuffing against your login endpoint is an abuse case. Scraping your public API to aggregate competitor pricing is an abuse case. These are functionally important because they describe attackers using your system as designed, just not for the purpose you intended.
Misuse cases are different — they describe attacks that go against the intended operation of the system. SQL injection, buffer overflows, authentication bypass. The distinction isn't just semantic; it drives different defensive thinking. Abuse cases usually require business logic controls — rate limiting, behavioral analytics, fraud detection. Misuse cases require technical controls — input validation, parameterized queries, memory-safe languages. If you're only thinking about one category, you're leaving a gap.
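The canonical abuse-case control for the credential-stuffing example is rate limiting. Here's a minimal token-bucket sketch — the parameters are illustrative, and a production limiter would live in shared state (Redis or similar), not in-process:

```python
# Business-logic control for an abuse case: token-bucket rate limiting
# on a login endpoint. (A misuse case like SQL injection needs a
# technical control -- parameterized queries -- instead.)
# Parameters are illustrative; production state belongs in Redis etc.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # max burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Spend one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per (IP, username) pair throttles stuffing without locking
# the legitimate user out from a different address.
bucket = TokenBucket(rate=0.5, capacity=5)  # 5-attempt burst, then 1 per 2s
results = [bucket.allow() for _ in range(7)]
print(results)  # first 5 allowed, then denied
```

Note the keying choice in the comment: limiting per (IP, username) rather than per username alone is itself a threat-model decision — per-username limits turn the limiter into a denial-of-service lever against your own users.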
Guttorm Sindre and Andreas Opdahl formalized misuse cases in the early 2000s — their 2005 Requirements Engineering journal paper, "Eliciting security requirements with misuse cases," is the canonical reference, and it's worth going back to if you want the original framing. The threat modeling community has largely absorbed the concept without necessarily maintaining the distinction, which is a shame because the operational implications for what kind of control you implement are genuinely different.
What a Good Session Actually Looks Like
The best threat modeling session I've ever been part of lasted four hours and produced a diagram that was wrong by the end of it. We started with what we thought the architecture was, and by the time we'd actually traced every data flow and labeled every trust boundary, we'd discovered three undocumented service-to-service calls, an internal API that had no authentication because "it's internal" (it was accessible from the DMZ), and a logging pipeline that was shipping full request bodies to a third-party analytics vendor in a way nobody had consciously decided to do. The diagram was wrong. The conversation was invaluable.
That's the version of threat modeling worth doing. Not the annual compliance artifact. Not the diagram review where everyone agrees the database should be encrypted. The actual, uncomfortable, occasionally embarrassing conversation about what you built, whether it does what you think it does, and what a competent attacker would do with the gaps you just found.
The diagram is just a way to make that conversation possible. Don't confuse it for the work.


