Hi Lennart,

Thanks to you and Mikko for providing this background.

On Thu, 25 Apr 2024 at 11:59, Lennart Poettering <mzxreary@xxxxxxxxxxx> wrote:
>
> On Mi, 24.04.24 19:15, Ard Biesheuvel (ardb@xxxxxxxxxx) wrote:
>
> > > > > > > > [ 0.000000] efi: ESRT=0xf0ea5040 TPMFinalLog=0xf0ea9040
> > > > > > > > RTPROP=0xf0ea7040 SMBIOS=0xf0ea3000 TPMEventLog=0xeb3b3040
> > > > > > > > INITRD=0xeb3b2040 RNG=0xe5c0f040 MEMRESERVE=0xe5c0e040
> > > > > > > >
> > > > > > > > Different boards use different TPM HW and drivers so compiling
> > > > > > > > all these in is possible but a bit ugly. systemd recently
> > > > > > > > gained support for a specific tpm2.target which makes TPM
> > > > > > > > support modular and also works with kernel modules for some TPM
> > > > > > > > use cases but not rootfs encryption.
> > > > > > > >
> > > > > > > > In my test case we have a kernel+initramfs uki binary which is
> > > > > > > > loaded by EFI firmware as a secure boot binary. TPM support on
> > > > > > > > various boards is visible in devicetree but not as ACPI table
> > > > > > > > entries. systemd currently detects TPM2 support either via ACPI
> > > > > > > > table /sys/firmware/acpi/tables/TPM2 or TPM entry or via
> > > > > > > > firmware measurement via
> > > > > > > > /sys/kernel/security/tpm0/binary_bios_measurements.
> > > > > > >
> > > > > > > One corner case worth noting here is that scanning the device
> > > > > > > tree won't always work for non-ACPI systems... The reason is that
> > > > > > > a firmware TPM (running in OP-TEE) might or might not have a DT
> > > > > > > entry, since OP-TEE can discover the device dynamically and
> > > > > > > doesn't always rely on a DT entry.
> > > > > > >
> > > > > > > I don't particularly love the idea that an EventLog existence
> > > > > > > automatically means a TPM will be there, but it seems that
> > > > > > > systemd already relies on that and it does solve the problem we
> > > > > > > have.
> > > > > > >
> > > > > > Well, quite. That's why the question I was interested in, perhaps
> > > > > > not asked as clearly as it could be is: since all the TPM devices
> > > > > > rely on discovery mechanisms like ACPI or DT or the like which are
> > > > > > ready quite early, should we simply be auto loading the TPM drivers
> > > > > > earlier?
> > > > >
> > > > > This would be an elegant way to solve this and on top of that, we
> > > > > have a single discovery mechanism from userspace -- e.g. ls /dev/tpmX.
> > > > > But to answer that we need more feedback from systemd. What does
> > > > > 'earlier' mean? Autoload it from the kernel before we go into
> > > > > launching the initrd?
> > > >
> > > > Right, so this is another timing problem: we can't autoload modules
> > > > *before* they appear in the filesystem and presumably they're on the
> > > > initrd, so auto loading must be post initrd mount (and init execution)
> > > > but otherwise quite early?
> > >
> > > Exactly. But is that enough?
>
> General purpose distros typically don't build all TPM drivers into the
> kernel, but ship some in the initrd instead. Then, udev is responsible
> for iterating all buses/devices and auto-loading the necessary
> drivers. Each loaded bus driver might make more devices available for
> which more drivers then need to be loaded, and so on. Some of the
> buses are "slow" in the sense that we don't really know a precise
> time when we know that all devices have now shown up, there might
> always be slow devices that haven't popped up yet. Iterating through
> the entire tree of devices in sysfs is often quite slow in itself too,
> it's one of the most time consuming parts of the boot in fact. This
> all is done asynchronously hence: we enumerate/trigger/kmod all
> devices as quickly as we can, but we continue doing other stuff at the
> same time.
>
> Of course that means that other stuff sometimes has to *wait* for
> devices to show up.
> For example, if a harddisk shall be mounted, it
> needs to be found/probed/kmod'ed first. Hence that's what we do: the
> fscking/mounting of a file system is delayed exactly as long as it
> takes for the block device it is for to show up.
>
> systemd these days makes use of the TPM, if available, for various
> purposes, such as disk encryption, measuring boot phases and system
> identity and various other things. Now, for the purpose of disk
> encryption, we need to wait for two things: the hard drive, and the
> TPM to be probed/driver loaded/accessible. /etc/fstab tells us pretty
> explicitly what block device to wait for, hence it's easy. But
> waiting for a TPM is harder: we might need it for disk encryption, but
> we don't know right away if there actually *is* a TPM device to show
> up, and hence don't know whether to wait for it or not.
>

I take it this means that the LUKS metadata lacks a 'this key is
sealed into the TPM' bit? Could you elaborate a bit on how the early
boot code manages this?

...

> > Exposing random firmware assets directly to user space to make guesses
> > about this doesn't seem like a very robust approach to this issue.
>
> If you give us a generic flag file that says "firmware found and used
> a tpm" somewhere in sysfs that abstracts the details how it detects
> that is enough for us. i.e. i don't care if the kernel abstracts this
> or if we do more explicit checks in userspace. All i care is that it's
> just a few access(F_OK) checks away for us.
>

So exposing the physical address of the TPM event log is probably not
what we want here.

Note that the TPM event log table is a Linux/efistub construct,
whereas the TPM final log table actually comes from the firmware
directly. So the former only exists if the EFI stub executed first,
and managed to invoke the TCG protocol etc.
OTOH, the TPM final log is TPM2 only, so it doesn't exist on TPM 1.2.

Another thing we need to consider is TDX, which exposes a pseudo-TPM
which does not support sealing, along with a CC protocol similar to
the TCG2 protocol. This code will use the event log infrastructure as
well: there are discussions going on at the moment whether we can
improve the way these protocols are combined.

So we should define a scope here:
- do we need TPM1.2 support?
- do we need non-EFI boot support?
- do we need to do anything in particular for FDE on TDX, which has a
  TPM event log but where no TPM is likely to appear?

I am fine with adding a sysfs node under /sys/firmware/efi that
exposes some of this information, e.g.,
linux_efi_tpm_eventlog::version, but not the physical address of the
table.
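
For reference, the "few access(F_OK) checks" that userspace relies on
today could look roughly like the sketch below. It probes only the two
paths quoted earlier in this thread. The function names and the `root`
parameter are illustrative assumptions, and this is not systemd's
actual code. No path for the proposed /sys/firmware/efi node is
included, since none has been defined yet.

```python
import os

# Paths quoted earlier in this thread as hints that firmware saw a TPM.
# Relative so that an alternative root can be supplied for testing.
HINT_PATHS = (
    "sys/firmware/acpi/tables/TPM2",
    "sys/kernel/security/tpm0/binary_bios_measurements",
)


def firmware_tpm_hints(root="/"):
    """Return the hint paths that exist under `root` (hypothetical helper).

    Only existence is checked, mirroring access(F_OK); nothing is read.
    """
    found = []
    for rel in HINT_PATHS:
        path = os.path.join(root, rel)
        if os.access(path, os.F_OK):
            found.append(path)
    return found


def should_wait_for_tpm(root="/"):
    """Heuristic only: a hint says a TPM may show up, not that it will."""
    return bool(firmware_tpm_hints(root))


if __name__ == "__main__":
    print("wait for TPM device:", should_wait_for_tpm())
```

As discussed above, a positive result is only a hint: on TDX, for
example, an event log can exist without a TPM device ever appearing, so
any wait based on it probably wants a timeout.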