On Wed, Dec 06, 2023 at 12:48:35PM +0100, Philippe Mathieu-Daudé wrote: > Hi Michael, > > (Cc'ing Lara, Vitaly and Maxim) > > On 5/12/23 23:28, Michael Roth wrote: > > Commit 7191f24c7fcf ("accel/kvm/kvm-all: Handle register access errors") > > added error checking for KVM_SET_SREGS/KVM_SET_SREGS2. In doing so, it > > exposed a long-running bug in current KVM support for SEV-ES where the > > kernel assumes that MSR_EFER_LMA will be set explicitly by the guest > > kernel, in which case EFER write traps would result in KVM eventually > > seeing MSR_EFER_LMA get set and recording it in such a way that it would > > be subsequently visible when accessing it via KVM_GET_SREGS/etc. > > > > However, guests kernels currently rely on MSR_EFER_LMA getting set > > automatically when MSR_EFER_LME is set and paging is enabled via > > CR0_PG_MASK. As a result, the EFER write traps don't actually expose the > > MSR_EFER_LMA even though it is set internally, and when QEMU > > subsequently tries to pass this EFER value back to KVM via > > KVM_SET_SREGS* it will fail various sanity checks and return -EINVAL, > > which is now considered fatal due to the aforementioned QEMU commit. > > > > This can be addressed by inferring the MSR_EFER_LMA bit being set when > > paging is enabled and MSR_EFER_LME is set, and synthesizing it to ensure > > the expected bits are all present in subsequent handling on the host > > side. > > > > Ultimately, this handling will be implemented in the host kernel, but to > > avoid breaking QEMU's SEV-ES support when using older host kernels, the > > same handling can be done in QEMU just after fetching the register > > values via KVM_GET_SREGS*. Implement that here. > > > > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx> > > Cc: Marcelo Tosatti <mtosatti@xxxxxxxxxx> > > Cc: Tom Lendacky <thomas.lendacky@xxxxxxx> > > Cc: Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> > > Cc: kvm@xxxxxxxxxxxxxxx > > Fixes: 7191f24c7fcf ("accel/kvm/kvm-all: Handle register access errors") > Hi Philippe, > This 'Fixes:' tag is misleading, since as you mentioned this commit > only exposes the issue. That's true, a "Workaround-for: " tag or something like that might be more appropriate. I just wanted to make it clear that SEV-ES support is no longer working with that patch applied, so I used Fixes: and elaborated on the commit message. I can change it if there's a better way to convey this though. > > Commit d499f196fe ("target/i386: Added consistency checks for EFER") > or around it seems more appropriate. Those checks seem to be more for TCG. The actual bug is in the host kernel, and it seems to have been there basically since the original SEV-ES host support went in in 2020. I've also sent a patch to address this in KVM: https://lore.kernel.org/lkml/20231205234956.1156210-1-michael.roth@xxxxxxx/T/#u but in the meantime it means that QEMU 8.2+ SEV-ES support would no longer work for any current/older host kernels, so I'm hoping a targeted workaround is warranted to cover that gap. > > Is this feature easily testable on our CI, on a x86 runner with KVM > access? SEV-ES support was introduced with EPYC Zen2 architecture (EPYC 7002 series processors, aka "Rome"). If there are any systems in the test pool that are Zen2 or greater, then a simple boot of a SEV-ES linux guest would be enough to trigger the QEMU crash. I'm not sure if there are any systems of that sort in the pool though. Thanks, Mike