DOSing the TPM to leak the rootfs encryption key

Adrian Vovk <adrianvovk@xxxxxxxxx> · Mon, 10 Mar 2025 15:04:38 -0400

Hello all,
This is spawned from another recent thread on this list, with the subject "Is tpm2-measure-pcr really an additional security?", started by Yann Diorcet. There's some confusion about exactly which scenario is being discussed in that thread, and in an attempt to clarify I think I came up with my own scenario that might be similarly vulnerable. So, I'm starting this thread so that we can have a clean slate for discussing my scenario.

Presuming a system like this:
- We've got a Linux desktop system
- We have two dm-verity protected /usr partitions
- We have one encrypted rootfs
- We're using systemd-repart to create the rootfs on first boot
- The rootfs is automatically unlocked by the initrd, at boot, using the TPM. We're using systemd-pcrlock with a PCR 11 signature
- We're making use of systemd-pcrphase, and the rootfs is unlockable _only_ during `enter-initrd`. Once we hit `leave-initrd`, the rootfs is no longer unlockable. This means that the rootfs is only unlockable in the initrd.

Here's a potential attack:
- Attacker makes a clone of the real rootfs, then marks the partition as empty on the real disk. Attacker puts the disk back into the machine, and boots.
- The initrd measures the `enter-initrd` pcrphase
- systemd-repart notices that there's no rootfs, and starts creating one. This (in my experience) takes up to a couple of seconds, so it should be relatively feasible to time the attack, especially if the attacker can see systemd's status output.
- Meanwhile, the attacker starts DOSing the TPM (e.g. by physically interrupting the TPM's LPC bus)
- systemd-repart finishes, and so the initrd boots into the newly-created rootfs
- initrd tries to measure the `leave-initrd` pcrphase, or the new rootfs's volume key. Both fail, due to the DOS on the TPM.
- The system boots to the new rootfs
- Attacker goes through initial-setup, creating a new administrator user for themself. Then they log in, open a terminal, and run `run0`. They type in the password they just set up. The attacker is now at a root shell.
- Attacker stops DOSing the TPM
- Since the TPM never received the measurements for `leave-initrd`, or the new rootfs's volume key, it is still in the same state as it was in the initrd.
- Attacker asks the TPM to unlock the real rootfs
- The PCRs are all in the correct state.
- The TPM gives the attacker the real rootfs's encryption key.
- Attacker gains access to the real rootfs's data
- Attacker cleans up all traces of their presence by putting back the real rootfs into its old partition
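
To make the core of the attack concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: a single SHA-256 PCR that starts at zero, a policy bound to the state right after `enter-initrd`, and a made-up `ToyTPM` class (not a real TPM API). Real PCR 11 also contains earlier measurements, and the real policy is more involved.

```python
import hashlib

PCR_SIZE = 32  # size of a SHA-256 bank PCR


def extend(pcr: bytes, data: bytes) -> bytes:
    """TPM-style extend: new = SHA256(old || SHA256(data))."""
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()


class ToyTPM:
    """Hypothetical stand-in for a TPM with one PCR."""

    def __init__(self):
        self.pcr11 = bytes(PCR_SIZE)
        self.blocked = False  # True while the attacker interrupts the bus

    def measure(self, data: bytes) -> bool:
        if self.blocked:
            return False  # the command never reaches the TPM
        self.pcr11 = extend(self.pcr11, data)
        return True

    def unseal(self, policy_pcr11: bytes) -> bool:
        # The key is released only if the PCR matches the policy.
        return not self.blocked and self.pcr11 == policy_pcr11


def boot(tpm: ToyTPM, attack: bool) -> None:
    tpm.measure(b"enter-initrd")
    tpm.blocked = attack          # DoS starts while repart is running
    tpm.measure(b"leave-initrd")  # result ignored, boot continues anyway
    tpm.blocked = False           # attacker stops the DoS after login


# Policy: rootfs is unlockable only in the enter-initrd state.
policy = extend(bytes(PCR_SIZE), b"enter-initrd")

honest = ToyTPM()
boot(honest, attack=False)
attacked = ToyTPM()
boot(attacked, attack=True)

print(honest.unseal(policy))    # False
print(attacked.unseal(policy))  # True
```

The only difference between the two runs is that one extend was silently dropped, and nothing in the boot path noticed.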

What's protecting against this, if anything?

In theory, this should all be detectable. The pcrphase measurements should fail if the TPM fails to reply that it has handled the command, and also pcrphase should probably _read back_ the PCRs and make sure they're at the expected value. Unfortunately, I don't think we actually care at the moment if these measurements fail - they can fail and we'll keep booting.
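
A rough sketch of what such a check could look like, in Python. The `read_pcr` and `extend_pcr` callables are hypothetical placeholders for whatever actually talks to the TPM, and I'm assuming a SHA-256 bank. This is not how systemd currently behaves, just an illustration of "fail hard unless the extend is confirmed":

```python
import hashlib


class MeasurementError(RuntimeError):
    """Raised when a PCR extend cannot be confirmed."""


def expected_after_extend(old: bytes, data: bytes) -> bytes:
    # SHA-256 bank: new = SHA256(old || SHA256(data))
    return hashlib.sha256(old + hashlib.sha256(data).digest()).digest()


def measure_checked(read_pcr, extend_pcr, data: bytes) -> None:
    """Extend a PCR and abort unless the read-back value is as expected."""
    before = read_pcr()
    want = expected_after_extend(before, data)
    try:
        extend_pcr(data)
    except OSError as err:
        # The TPM never acknowledged the command.
        raise MeasurementError("TPM did not acknowledge the extend") from err
    if read_pcr() != want:
        # The command "succeeded" but the PCR did not change as expected.
        raise MeasurementError("PCR read-back does not match expected value")
```

The caller would treat `MeasurementError` as fatal and refuse to continue booting. Note that the read-back travels over the same bus as the extend, so this only catches dropped commands, not an attacker who can forge replies.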

Detecting the situation and causing boot to fail, as described above, would force the attacker to not only DOS the TPM but actually completely MITM it. Is this possible? Is this something that parameter encryption defends against?

This attack isn't specific to the assumptions I've made at the start. For example: instead of relying on systemd-repart to create the rootfs on first boot and then going through the legitimate first-boot setup process, an attacker can simply bring their own maliciously-crafted rootfs that boots straight to a root shell. systemd's security model allows an attacker to completely replace the rootfs with a maliciously-crafted one.

PS: We should probably be locking the rootfs down to the initial state of PCR 15 too. Since we can assume that PCR 15 must be zeroed out before we unlock the rootfs, we can actually check that. This adds an extra layer of protection: the rootfs can only be unlocked if it's the first rootfs being unlocked. But still: this doesn't defend against the DOS scenario I describe here, since the attack completely bypasses the measurement into PCR 15.
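
As a small illustration of the PCR 15 idea (again a toy model in Python, assuming a SHA-256 bank; the `may_unseal` helper and the measured byte string are made up for the example):

```python
import hashlib

ZERO = bytes(32)  # reset value of a SHA-256 PCR


def extend(pcr: bytes, data: bytes) -> bytes:
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()


def may_unseal(pcr15: bytes) -> bool:
    # Hypothetical policy: PCR 15 must still hold its reset value.
    return pcr15 == ZERO


# Normal boot: unlocking the first rootfs extends PCR 15.
pcr15 = ZERO
first_unlock = may_unseal(pcr15)
pcr15 = extend(pcr15, b"value-derived-from-volume-key")
second_unlock = may_unseal(pcr15)

# DoS case: the extend never reaches the TPM, so PCR 15 stays zero.
pcr15_attacked = ZERO
unlock_after_dos = may_unseal(pcr15_attacked)

print(first_unlock, second_unlock, unlock_after_dos)  # True False True
```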

Best,
Adrian