On Fri, 20 Dec 2019 at 09:17, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote: > > > > > On Dec 20, 2019, at 3:07 PM, Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote: > > > > On Thu, 19 Dec 2019 at 22:05, Matthew Garrett <mjg59@xxxxxxxxxx> wrote: > >> > >>> On Wed, Dec 18, 2019 at 9:03 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote: > >>> + status = efi_call_early(create_event, EVT_SIGNAL_EXIT_BOOT_SERVICES, > >>> + TPL_CALLBACK, handle_exit_boot_services_event, > >>> + NULL, &exit_boot_services_event); > >>> + if (status != EFI_SUCCESS) { > >>> + pr_efi_err("Failed to register for EBS() event\n"); > >>> + goto free_handle; > >>> + } > >> > >> OVMF's SEV implementation appears to tear down AMD's IOMMU mappings at > >> EVT_SIGNAL_EXIT_BOOT_SERVICES. How are we ensuring that this happens > >> first? > > > > It doesn't, and that is kind of the point. The only guarantee you have > > is that this runs before ExitBootServices() returns, but after any > > other callbacks that have been registered. I know this is not 100% > > what you're after, but it is the only way we can avoid poking devices > > behind the backs of their drivers. > > > > Can you clarify (in the changelog or a comment perhaps) why you’re doing this instead of turning off busmastering before calling ExitBootServices()? Maybe this was covered in this thread, but I missed it. > Sure. The problem is that EBS() is the place where drivers tear down rings etc and gracefully take down the device. So killing DMA for all of them by clearing the BM bit of every bridge is likely to cause problems, because the teardown code wasn't written with the idea in mind that DMA is no longer possible. On arm64 at least, it may result in the kernel being entered with a pending SError which will kill the OS as soon as they are unmasked. But the UEFI drivers themselves may simply hang or timeout on some DMA access. > Also, surely this whole mess is a a design error in EFI, at least when SEV is involved, and there should be an EFI extension to keep IOMMU enabled. Or a specified way to *guarantee* that DMA is off when we exit boot services without hackery. The UEFI spec requires that all devices stop doing DMA at ExitBootServices() time, which is why they register for this event and disable DMA in their drivers. So from this pov, the state of the IOMMU is irrelevant since no device should be doing DMA anyway. The UEFI spec does not reason at all about IOMMUs. But indeed, the whole idea that the IOMMU is a 'feature' that you can ignore if you want since it will be in passthrough mode otherwise is misguided, but I'm not sure this is a firmware problem. The desire to be able to run yesterday's OS on today's hardware, especially in the Windows world, resulted in a situation where many security and hardening features are opt-in rather than opt-out, with the adoption you might expect.