On Wed, 2022-06-29 at 09:31 -0700, Jim Mattson wrote: > On Tue, Jun 21, 2022 at 8:09 AM Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote: > > When #SMI is asserted, the CPU can be in interrupt shadow > > due to sti or mov ss. > > > > It is not mandatory in Intel/AMD prm to have the #SMI > > blocked during the shadow, and on top of > > that, since neither SVM nor VMX has true support for SMI > > window, waiting for one instruction would mean single stepping > > the guest. > > > > Instead, allow #SMI in this case, but both reset the interrupt > > window and stash its value in SMRAM to restore it on exit > > from SMM. > > > > This fixes rare failures seen mostly on windows guests on VMX, > > when #SMI falls on the sti instruction which mainfest in > > VM entry failure due to EFLAGS.IF not being set, but STI interrupt > > window still being set in the VMCS. > > I think you're just making stuff up! See Note #5 at > https://sandpile.org/x86/inter.htm. > > Can you reference the vendors' documentation that supports this change? > First of all, just to note that the actual issue here was that we don't clear the shadow bits in the guest interruptability field in the vmcb on SMM entry, that triggered a consistency check because we do clear EFLAGS.IF. Preserving the interrupt shadow is just nice to have. That what Intel's spec says for the 'STI': "The IF flag and the STI and CLI instructions do not prohibit the generation of exceptions and nonmaskable inter- rupts (NMIs). However, NMIs (and system-management interrupts) may be inhibited on the instruction boundary following an execution of STI that begins with IF = 0." Thus it is likely that #SMI are just blocked when in shadow, but it is easier to implement it this way (avoids single stepping the guest) and without any user visable difference, which I noted in the patch description, I noted that there are two ways to solve this, and preserving the int shadow in SMRAM is just more simple way. As for CPUS that neither block SMI nor preserve the int shadaw, in theory they can, but that would break things, as noted in this mail https://lore.kernel.org/lkml/1284913699-14986-1-git-send-email-avi@xxxxxxxxxx/ It is possible though that real cpu supports HLT restart flag, which makes this a non issue, still. I can't rule out that a real cpu doesn't preserve the interrupt shadow on SMI, but I don't see why we can't do this to make things more robust. Best regards, Maxim Levitsky