On Thu, Aug 04, 2022 at 03:56:54PM +0200, Laszlo Ersek wrote: > On 08/04/22 15:28, Jason A. Donenfeld wrote: > > On Thu, Aug 4, 2022 at 3:25 PM Laszlo Ersek <lersek@xxxxxxxxxx> wrote: > >> > >> On 08/04/22 14:16, Ard Biesheuvel wrote: > >>> On Thu, 4 Aug 2022 at 14:11, Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote: > >>>> > >>>> On Thu, Aug 04, 2022 at 02:03:29PM +0200, Jason A. Donenfeld wrote: > >>>>> Hi Daniel, > >>>>> > >>>>> On Thu, Aug 04, 2022 at 10:25:36AM +0100, Daniel P. Berrangé wrote: > >>>>>> Yep, and ultimately the inability to distinguish UEFI vs other firmware > >>>>>> is arguably correct by design, as the QEMU <-> firmware interface is > >>>>>> supposed to be arbitrarily pluggable for any firmware implementation > >>>>>> not limited to merely UEFI + seabios. > >>>>> > >>>>> Indeed, I agree with this. > >>>>> > >>>>>> > >>>>>>> For now I suggest either reverting the original patch, or at least not > >>>>>>> enabling the knob by default for any machine types. In particular, when > >>>>>>> using MicroVM, the user must leave the knob disabled when direct booting > >>>>>>> a kernel on OVMF, and the user may or may not enable the knob when > >>>>>>> direct booting a kernel on SeaBIOS. > >>>>>> > >>>>>> Having it opt-in via a knob would defeat Jason's goal of having the seed > >>>>>> available automatically. > >>>>> > >>>>> Yes, adding a knob is absolutely out of the question. > >>>>> > >>>>> It also doesn't actually solve the problem: this triggers when QEMU > >>>>> passes a DTB too. It's not just for the new RNG seed thing. This bug > >>>>> isn't new. > >>>> > >>>> In the other thread I also mentioned that this RNG Seed addition has > >>>> caused a bug with AMD SEV too, making boot measurement attestation > >>>> fail because the kernel blob passed to the firmware no longer matches > >>>> what the tenant expects, due to the injected seed. > >>>> > >>> > >>> I was actually expecting this to be an issue in the > >>> signing/attestation department as well, and you just confirmed my > >>> suspicion. > >>> > >>> But does this mean that populating the setup_data pointer is out of > >>> the question altogether? Or only that putting the setup_data linked > >>> list nodes inside the image is a problem? > >> > >> QEMU already has to inject a whole bunch of stuff into confidential > >> computing guests. The way it's done (IIRC) is that the non-compressed, > >> trailing part of pflash (basically where the reset vector code lives > >> too) is populated at OVMF build time with a chain of GUID-ed structures, > >> and fields of those structures are filled in (at OVMF build time) from > >> various fixed PCDs. The fixed PCDs in turn are populated from the FD > >> files, using various MEMFD regions. When QEMU launches the guest, it can > >> parse the GPAs from the on-disk pflash image (traversing the list of > >> GUID-ed structs), at which addresses the guest firmware will then expect > >> the various crypto artifacts to be injected. > >> > >> The point is that "who's in control" is reversed. The firmware exposes > >> (at build time) at what GPAs it can accept data injections, and QEMU > >> follows that. Of course the firmware ensures that nothing else in the > >> firmware will try to own those GPAs. > >> > >> The only thing that needed to be hard-coded when this feature was > >> introduced was the "entry point", that is, the flash offset at which > >> QEMU starts the GUID-ed structure traversal. > >> > >> AMD and IBM developers can likely much better describe this mechanism, > >> as I've not dealt with it in over a year. The QEMU side code is in > >> "hw/i386/pc_sysfw_ovmf.c". > >> > >> We can make setup_data chaining work with OVMF, but the whole chain > >> should be located in a GPA range that OVMF dictates. > > > > It sounds like what you describe is pretty OVMF-specific though, > > right? > > Yes, completely OVMF specific. > > > Do we want to tie things together so tightly like that? > > It depends on what the goal is: > > - do we want setup_data chaining work generally? > > - or do we want only the random seed injection to stop crashing OVMF guests? > > In the latter case, we only need to teach QEMU to reliably recognize > OVMF from the on-disk firmware binary (regardless of pflash use), and > then not expose any setup_data in guest RAM. The UEFI guest kernel will > use EFI_RNG_PROTOCOL (when available, from virtio-rng-pci), and no > particular seed otherwise. > > (This is the "papering over" case, which I don't think is necessarily > wrong; only it should be robust. I thought pflash would be usable for > that; turns out it isn't. Thus, we could perhaps try identifying a > firmware binary as "OVMF" by contents.) > > In the former case though, I think the "tight coupling" is unavoidable. > As long as setup_data is needed by the kernel extremely early, no > "firmware hiding" or "firmware independence" is in place yet. > > > Given we only need 48 bytes or so, isn't there a more subtle place we > > could just throw this in ram that doesn't need such complex > > coordination? > > These tricks add up and go wrong after a while. The pedantic > reservations in the firmware have proved necessary. > > IIUC, with v2, the setup_data_base address would (most frequently) be 96 > KB. edk2 does have uses for very low memory. If OVMF's PlatformPei does > not reserve away the area, UefiCpuPkg or other drivers might allocate an > overlapping chunk, even if only temporarily. That might not break the > firmware, but it could overwrite the random seed. If the guest kernel > somehow consumed that seed (rather than getting one from > EFI_RNG_PROTOCOL -- what if no virtio-rng-pci device was configured?), > that could potentially destroy the guest's security, without any > obvious/immediate signs. The guest kernel shouldn't load this random seed even if it is provided by the hypervisor, when running in a confidential virutalization environment like SEV/TDX or equiv for other arches. It can't trust the hypervisor to have actually used random data for populating the seed value. On x86 it would have to rely on the hardware RDRAND/RDSEED for an initial seed. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|