On Tue, Feb 11, 2020 at 02:12:04PM -0800, Andy Lutomirski wrote:
> On Tue, Feb 11, 2020 at 7:43 AM Joerg Roedel <joro@xxxxxxxxxx> wrote:
> >
> > On Tue, Feb 11, 2020 at 03:50:08PM +0100, Peter Zijlstra wrote:
> >
> > > Oh gawd; so instead of improving the whole NMI situation, AMD went and
> > > made it worse still ?!?
> >
> > Well, depends on how you want to see it. Under SEV-ES an IRET will not
> > re-open the NMI window, but the guest has to tell the hypervisor
> > explicitly when it is ready to receive new NMIs via the NMI_COMPLETE
> > message. NMIs stay blocked even when an exception happens in the
> > handler, so this could also be seen as a (slight) improvement.
>
> I don't get it. VT-x has a VMCS bit "Interruptibility
> state"."Blocking by NMI" that tracks the NMI masking state. Would it
> have killed AMD to solve the problem the same way to retain
> architectural behavior inside a SEV-ES VM?

No, but that alone wouldn't solve the problem. Inside an NMI handler
there can be #VC exceptions, which do an IRET on their own. Hardware
NMI-state tracking would re-enable NMIs as soon as such a #VC exception
returns to the NMI handler, which is not what every OS is comfortable
with.

Yes, there are many ways to hack around this. The GHCB spec mentions
the single-stepping-over-IRET idea, which I also prototyped in a
previous version of this patch-set. I gave up on it when I discovered
that NMIs which hit while executing in kernel mode but on the entry
stack cause the #VC handler to call into C code while still on the
entry stack, because neither paranoid_entry nor error_entry handles the
from-kernel-with-entry-stack case. This could of course also be fixed,
but it would further complicate code that is already complicated enough
by the PTI changes and nested-NMI support.

My patch for using the NMI_COMPLETE message is certainly not perfect
and needs changes, but having the message specified in the protocol
gives the guest the best flexibility in deciding when it is ready to
receive new NMIs, imho.
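To make the protocol point a bit more concrete, here is a rough sketch
(not the actual patch; the helper name, struct layout and wrapping of
the exit-code value are illustrative only) of what the guest-side
notification could look like. It would presumably run late in the NMI
handler, and only after it has been sent would the hypervisor inject
NMIs into the guest again:

/*
 * Sketch only: tell the hypervisor that the guest is done handling the
 * current NMI and is ready to receive new ones. GHCB setup and
 * registration via the GHCB MSR are omitted here.
 */
#define GHCB_NMI_COMPLETE	0x80000003ULL	/* "NMI Complete" exit code */

struct ghcb_sketch {
	unsigned long long sw_exit_code;	/* request type for the hypervisor */
	unsigned long long sw_exit_info_1;	/* unused for this request */
	unsigned long long sw_exit_info_2;	/* unused for this request */
};

static void nmi_complete(struct ghcb_sketch *ghcb)
{
	ghcb->sw_exit_code   = GHCB_NMI_COMPLETE;
	ghcb->sw_exit_info_1 = 0;
	ghcb->sw_exit_info_2 = 0;

	/* VMGEXIT is encoded as REP; VMMCALL */
	asm volatile("rep; vmmcall" ::: "memory");
}

Regards,

	Joerg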