When is defense in depth unhelpful?

By OscarD🔸 @ 2025-10-29T21:42 (+25)

This is a linkpost to https://oscardelaney.substack.com/p/when-is-defense-in-depth-unhelpful

Defense in depth (DiD) is the idea that the best way to avoid a bad outcome is to spread your resources across several independent ‘layers’ of defense, rather than just building one layer of defense you think is super secure. I only ever hear DiD referred to positively, as something we should implement. This makes me suspicious: is it always a good idea? What are the relevant tradeoffs? In this post, I 1) more formally introduce DiD, 2) make the case for DiD, 3) consider when it might be counterproductive or less applicable, and 4) apply this to AI safety.

1. What is defense in depth?

Defense in depth originates from military strategy, so let’s start with that example. Suppose you don’t want your country’s capital to be invaded. The defense in depth thesis is that you are best off investing some resources from your limited military budget in many different defenses (e.g. nuclear deterrence; intelligence gathering and early warning systems; an air force, navy and army; command and communication bunkers; diplomacy and allies) rather than specialising heavily in just one.

In the case of AI safety, Hendrycks et al. propose such a multilayered defense approach.[1]
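To put the core intuition in symbols (a toy model of my own, assuming the layers fail independently): if layer $i$ fails with probability $p_i$, a breach requires all $k$ layers to fail at once, so

$$P(\text{breach}) = \prod_{i=1}^{k} p_i.$$

Three mediocre layers that each fail 10% of the time give a breach probability of $0.1^3 = 0.001$, beating a single excellent layer that fails 1% of the time. The independence assumption is doing all the work here, which is where the caveats in section 3 come in.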

2. Why defense in depth?

There are several reasons to favour a DiD approach:

3. Why not defense in depth?

Here are some reasons not to favour DiD in some cases:

4. Application to AI safety

I think this has a few implications for AI safety:

I don’t think there are any specific things I want people to stop/start working on as a result of this. I just think it is useful to remember that DiD is not always the right frame or a strong consideration, and that its value relies heavily on the defenses failing independently, especially against an intelligent adversary who will seek out correlated weaknesses.
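As a minimal sketch of that last point (my own toy model, not anything from the literature): suppose each layer can also be defeated by a shared failure mode, e.g. an adversary who finds one trick that beats every layer at once.

```python
import numpy as np

rng = np.random.default_rng(0)

def breach_prob(k, p_indep, p_common, trials=1_000_000):
    """Estimate the chance that all k layers fail at once.

    Each trial: with probability p_common a shared failure mode
    (a correlated weakness) defeats every layer simultaneously;
    otherwise each layer fails independently with p_indep.
    """
    common = rng.random(trials) < p_common
    indep = rng.random((trials, k)) < p_indep
    return (common | indep.all(axis=1)).mean()

for k in (1, 3, 6):
    print(f"{k} layers: independent {breach_prob(k, 0.10, 0.00):.6f}, "
          f"correlated {breach_prob(k, 0.10, 0.02):.6f}")
```

With no shared mode the breach probability falls geometrically ($0.1$, $0.001$, $10^{-6}$), but a 2% shared mode puts a hard floor of about $0.02$ under it: past a couple of layers, extra depth buys almost nothing.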

---

Thanks to Anders Sandberg, Christopher Covino, Max Daniel, Mia Taylor, Oliver Guest, and Owen Cotton-Barratt for helpful comments on a draft.

  1. ^

     More generally, one application of DiD to X-risk suggests investing in each of three defensive buckets:

    • Preventing catastrophes from starting in the first place.
    • Responding in the early stages to prevent escalation to a global catastrophe.
    • Resiliency, such that even global catastrophes do not lead to extinction.


Chris Leong @ 2025-10-30T09:03 (+11)

An analogy: let's suppose you're trying to stop a tank. You can't just place a line of 6 kids in front of it and call it "defense in depth".

Also, it would be somewhat weird to call it "defense in depth" if most of the protection came from a few layers.

niplav @ 2025-10-30T23:01 (+1)

I've been confused about the "defense-in-depth" Swiss cheese analogy. The analogy is drawn in two dimensions, and we can visualize how constructing multiple barriers with holes would block any straight path from a point out of a three-dimensional sphere.

(What follows is me trying to think through the mathematics, but I lack most of the knowledge to evaluate it properly. Johnson-Lindenstrauss may be involved in solving this? (it's not, GPT-5 informs me))

But plans in the real world are very high-dimensional, right? So we're imagining a point (let's say at $0$) in a high-dimensional space (let's say $\mathbb{R}^n$ for large $n$, as an example), and an $n$-sphere around that point. Our goal is that there is no straight path from $0$ to somewhere outside the sphere. Our possible actions are that we can block off sub-spaces within the sphere, or construct $n$-dimensional barriers with "holes" inside the sphere, to prevent any such straight paths. Do we know the scaling properties of how many of such barriers we have to create, given such-and-such "moves" with some number of dimensions/porosity?

My purely guessed intuition is that, at least if you're given porous $(n-1)$-dimensional "sheets" you can place inside of the $n$-sphere, you need many of them, with the number growing with the dimensionality $n$. Nevermind, I was confused about this.
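A quick toy Monte Carlo gestures at why, under a big simplification I'm assuming (each barrier's "hole" is a spherical cap of fixed angular radius around a random direction, and a straight ray escapes only if its direction lies in every cap):

```python
import numpy as np

rng = np.random.default_rng(1)

def escape_fraction(n, k, cos_theta=0.5, rays=100_000):
    """Fraction of random rays from the origin in R^n that pass
    through the 'hole' in all k barriers, where each hole is a
    spherical cap of half-angle arccos(cos_theta) around a random
    direction (a crude stand-in for a porous sheet)."""
    def random_unit(rows):
        v = rng.standard_normal((rows, n))
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    holes = random_unit(k)    # one cap centre per barrier
    dirs = random_unit(rays)  # candidate escape directions
    in_cap = (dirs @ holes.T) > cos_theta
    return in_cap.all(axis=1).mean()

for n in (2, 3, 10, 50):
    print(f"n={n}: escape fraction {escape_fraction(n, k=3):.5f}")
```

For $n=2$ a fair fraction of directions slip through all three holes; by $n=10$ essentially none do, since random caps of fixed angular size overlap less and less as the dimension grows. So at least under this simplification, higher dimension makes blocking easier, not harder.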

zeshen🔸 @ 2025-10-30T20:31 (+1)

The defense in depth thesis is that you are best off investing some resources from your limited military budget in many different defenses (e.g. nuclear deterrence; intelligence gathering and early warning systems; an air force, navy and army; command and communication bunkers; diplomacy and allies) rather than specialising heavily in just one.

I'm not familiar with how this concept is used in the military, but in safety engineering I've never heard it framed as a tradeoff between 'many layers, many holes' and 'one layer, few holes'. The Swiss cheese model is usually meant to illustrate that barriers are rarely 100% effective, so even if you think you have a great barrier, you should have more than one. From this perspective, having multiple barriers is straightforwardly good and doesn't justify using weaker barriers.

OscarD🔸 @ 2025-10-30T20:37 (+1)

Right, but because we have limited resources, we need to choose between investing more in just a few stronger layers, or investing less in each of a larger number of layers. Of course in an ideal world we would have heaps of really strong layers, but that may be cost-prohibitive.
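To make that concrete with a toy returns curve (purely illustrative, my own assumption): say a layer funded with budget $x$ fails with probability $1/(1+x)$, layers fail independently, and we split a fixed budget evenly across $k$ layers.

```python
def layer_fail_prob(x):
    """Hypothetical diminishing-returns curve: a layer funded
    with budget x fails with probability 1 / (1 + x)."""
    return 1.0 / (1.0 + x)

budget = 9.0
for k in (1, 3, 9):
    breach = layer_fail_prob(budget / k) ** k  # independent layers
    print(f"{k} layers: breach probability {breach:.4f}")
```

Under this curve, splitting wins: one layer gives 0.1, three give about 0.016, nine give about 0.002. But with an exponential curve $p(x) = e^{-x}$ the split is exactly neutral ($p(B/k)^k = e^{-B}$ for every $k$), and any correlation between layers pushes back toward fewer, stronger ones, so the answer hinges on the returns curve and on independence.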