On Wed, 18 Sept 2024 at 05:14, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
> Ard Biesheuvel <ardb@xxxxxxxxxx> writes:
>
> > On Tue, 17 Sept 2024 at 17:24, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> >>
> >> Ard Biesheuvel <ardb@xxxxxxxxxx> writes:
> >>
> >> > Hi Eric,
> >> >
> >> > Thanks for chiming in.
> >>
> >> It just looked like after James gave some expert input the
> >> conversation got stuck, so I am just trying to move it along.
> >>
> >> I don't think anyone knows what this whole elephant looks like,
> >> which makes solving the problem tricky.
> >>
> >> > On Mon, 16 Sept 2024 at 22:21, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> >> >>
> > ...
> >> >>
> >> >> This leaves two practical questions if I have been following everything
> >> >> correctly.
> >> >>
> >> >> 1) How to get kexec to avoid picking that memory for the new kernel to
> >> >>    run in before it initializes itself. (AKA the getting stomped by
> >> >>    relocate kernel problem).
> >> >>
> >> >> 2) How to point the new kernel to preserved tpm_log.
> >> >>
> >> >>
> >> >> This recommendation is from memory so it may be a bit off but
> >> >> the general structure should work. The idea is as follows.
> >> >>
> >> >> - Pass the information between kernels.
> >> >>
> >> >>   It is probably simplest for the kernel to have a command line option
> >> >>   that tells the kernel the address and size of the tpm_log.
> >> >>
> >> >>   We have a couple of mechanisms here. Assuming you are loading a
> >> >>   bzImage with kexec_file_load you should be able to have the in kernel
> >> >>   loader to add those arguments to the kernel command line.
> >> >>
> >> >
> >> > This shouldn't be necessary, and I think it is actively harmful to
> >> > keep inventing special ways for the kexec kernel to learn about these
> >> > things that deviate from the methods used by the first kernel.
> >> > This is how we ended up with 5 sources of truth for the physical
> >> > memory map (EFI memory map, memblock and 3 different versions of the
> >> > e820 memory map).
> >> >
> >> > We should try very hard to make kexec idempotent, and reuse the
> >> > existing methods where possible. In this case, the EFI configuration
> >> > table is already being exposed to the kexec kernel, which describes
> >> > the base of the allocation. The size of the allocation can be derived
> >> > from the table header.
> >> >
> >> >> - Ensure that when the loader is finding an address to load the new
> >> >>   kernel it treats the address of the tpm_log as unavailable.
> >> >>
> >> >
> >> > The TPM log is a table created by the EFI stub loader, which is part
> >> > of the kernel. So if we need to tweak this for kexec's benefit, I'd
> >> > prefer changing it in a way that can accommodate the first kernel too.
> >> > However, I think the current method already has that property so I
> >> > don't think we need to do anything (modulo fixing the bug)
> >>
> >> I am fine with not inventing a new mechanism, but I think we need
> >> to reuse whatever mechanism the stub loader uses to pass it's
> >> table to the kernel. Not the EFI table that disappears at
> >> ExitBootServices().
> >>
> >
> > Not sure what you mean here - the EFI table that gets clobbered by
> > kexec *is* the table that is created by the stub loader to pass the
> > TPM log to the kernel. Not sure what alternative you have in mind
> > here.
>
> I was referring to whatever the EFI table that James Bottomley mentioned
> that I presume the stub loader reads from when the stub loader
> constructs the tpm_log that is passed to the kernel.
>

There is no such table. The event log is exposed by the firmware via a
TCG2 protocol interface, which is no longer available after boot.
So the stub loader (which is the last kernel component that has access
to this interface) invokes this protocol and copies the output into a
table in memory which is exposed to the kernel proper as an EFI
configuration table.

So the main issue here is that EFI configuration tables are passed on
to kexec kernels, and we have to ensure (in the general case) that the
associated memory is not reused. The implication is that the stub
loader should always use EFI_ACPI_RECLAIM_MEMORY for allocations that
are referenced via EFI configuration tables, and doing so for this
table makes the bug go away.

> So I believe we are in agreement of everything except terminology.
>

Sure.

> >> > That said, I am doubtful that the kexec kernel can make meaningful use
> >> > of the TPM log to begin with, given that the TPM will be out of sync
> >> > at this point. But it is still better to keep it for symmetry, letting
> >> > the higher level kexec/kdump logic running in user space reason about
> >> > whether the TPM log has any value to it.
> >>
> >> Someone seems to think so or there would not be a complaint that it is
> >> getting corrupted.
> >>
> >
> > No. The problem is that the size of the table is *in* the table, and
> > so if it gets corrupted, the code that attempts to memblock_reserve()
> > it goes off into the weeds. But that does not imply there is a point
> > to having access to this table from a kexec kernel in the first place.
>
> If there is no point to having access to it then we should just not
> pass anything to the loaded kernel, so the kernel does not think there
> is anything there.
>
> >> This should not be the kexec-on-panic kernel as that runs in memory
> >> that is reserved solely for it's own use. So we are talking something
> >> like using kexec as a bootloader.
> >>
> >
> > kexec as a bootloader under TPM based measured boot will need to do a
> > lot more than pass the firmware's event log to the next kernel.
> > I'd expect a properly engineered kexec to replace this table entirely,
> > and include the hashes of the assets it has loaded and measured into
> > the respective PCRs.
> >
> > But let's stick to solving the actual issue here, rather than
> > philosophize on how kexec might work in this context.
>
>
> I am fine with that. The complaint I had seen was that the table was
> being corrupted and asking how to solve that. It seems I haven't read
> the part of the conversation where it was made clear that no one wants
> the tpm_log after kexec.
>

It was not made clear, which is why I raised the question. I argued
that the TPM log has limited utility after a kexec, given that we will
be in one of two situations:
- the kexec boot chain cares about the TPM and measured boot, and will
  therefore have extended the TPM's PCRs and the TPM log will be out of
  sync;
- the kexec boot chain does not care, and so there is no point in
  forwarding the TPM log.

Perhaps there is a third case where kdump wants to inspect the TPM log
that the crashed kernel accessed? But this is rather speculative.

> If someone wants the tpm_log then we need to solve this problem.
>

Agreed.
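
To make the memory-type point above concrete, here is a toy model in C.
It is not kernel code, and every toy_* name is made up for
illustration. It assumes a simplified rule: loader-data regions are
treated as ordinary RAM once the kernel is up, so kexec may place the
new image there, while ACPI reclaim regions are left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy memory types; the names echo the UEFI ones, the values are arbitrary. */
enum toy_mem_type {
	TOY_CONVENTIONAL,	/* ordinary RAM */
	TOY_LOADER_DATA,	/* assumed to be reused as ordinary RAM */
	TOY_ACPI_RECLAIM,	/* assumed to stay reserved across kexec */
};

struct toy_region {
	uint64_t base;
	uint64_t size;
	enum toy_mem_type type;
};

/* Assumption of this model: only ACPI reclaim memory is off limits. */
static bool toy_region_is_free(const struct toy_region *r)
{
	return r->type != TOY_ACPI_RECLAIM;
}

/*
 * Return true if [base, base + size) overlaps any region that the
 * model would hand out as a destination for the next kernel image.
 */
static bool toy_may_be_overwritten(const struct toy_region *map, size_t n,
				   uint64_t base, uint64_t size)
{
	assert(map != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct toy_region *r = &map[i];
		bool overlaps = base < r->base + r->size &&
				r->base < base + size;

		if (overlaps && toy_region_is_free(r))
			return true;
	}
	return false;
}
```

In this model, a table placed in a TOY_LOADER_DATA region is reported
as overwritable, and the same table in a TOY_ACPI_RECLAIM region is
not, which is the property the allocation type is meant to provide.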