Quoting Stefan Hajnoczi (2023-12-05 16:27:51) > On Tue, 5 Dec 2023 at 17:12, Michael Roth <michael.roth@xxxxxxx> wrote: > > > > Commit 7191f24c7fcf ("accel/kvm/kvm-all: Handle register access errors") > > added error checking for KVM_SET_SREGS/KVM_SET_SREGS2. In doing so, it > > exposed a long-running bug in current KVM support for SEV-ES where the > > kernel assumes that MSR_EFER_LMA will be set explicitly by the guest > > kernel, in which case EFER write traps would result in KVM eventually > > seeing MSR_EFER_LMA get set and recording it in such a way that it would > > be subsequently visible when accessing it via KVM_GET_SREGS/etc. > > > > However, guests kernels currently rely on MSR_EFER_LMA getting set > > automatically when MSR_EFER_LME is set and paging is enabled via > > CR0_PG_MASK. As a result, the EFER write traps don't actually expose the > > MSR_EFER_LMA even though it is set internally, and when QEMU > > subsequently tries to pass this EFER value back to KVM via > > KVM_SET_SREGS* it will fail various sanity checks and return -EINVAL, > > which is now considered fatal due to the aforementioned QEMU commit. > > > > This can be addressed by inferring the MSR_EFER_LMA bit being set when > > paging is enabled and MSR_EFER_LME is set, and synthesizing it to ensure > > the expected bits are all present in subsequent handling on the host > > side. > > > > Ultimately, this handling will be implemented in the host kernel, but to > > avoid breaking QEMU's SEV-ES support when using older host kernels, the > > same handling can be done in QEMU just after fetching the register > > values via KVM_GET_SREGS*. Implement that here. > > Hi Mike, > I am holding off on tagging 8.2.0-rc3 for one day so agreement can be > reached on how to proceed with this fix. Thanks Stefan. Sorry for the late fix, but without it SEV-ES is completely busted, so we're hoping it's simple/specific enough to justify for hard-freeze. Also, I just sent a v2 that adds similar handling for older kernels that don't support KVM_SET_SREGS2. -Mike > > Stefan > > > > > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx> > > Cc: Marcelo Tosatti <mtosatti@xxxxxxxxxx> > > Cc: Tom Lendacky <thomas.lendacky@xxxxxxx> > > Cc: Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> > > Cc: kvm@xxxxxxxxxxxxxxx > > Fixes: 7191f24c7fcf ("accel/kvm/kvm-all: Handle register access errors") > > Signed-off-by: Michael Roth <michael.roth@xxxxxxx> > > --- > > target/i386/kvm/kvm.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c > > index 11b8177eff..0e9e4c1beb 100644 > > --- a/target/i386/kvm/kvm.c > > +++ b/target/i386/kvm/kvm.c > > @@ -3654,6 +3654,7 @@ static int kvm_get_sregs2(X86CPU *cpu) > > { > > CPUX86State *env = &cpu->env; > > struct kvm_sregs2 sregs; > > + target_ulong cr0_old; > > int i, ret; > > > > ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_SREGS2, &sregs); > > @@ -3676,12 +3677,18 @@ static int kvm_get_sregs2(X86CPU *cpu) > > env->gdt.limit = sregs.gdt.limit; > > env->gdt.base = sregs.gdt.base; > > > > + cr0_old = env->cr[0]; > > env->cr[0] = sregs.cr0; > > env->cr[2] = sregs.cr2; > > env->cr[3] = sregs.cr3; > > env->cr[4] = sregs.cr4; > > > > env->efer = sregs.efer; > > + if (sev_es_enabled() && env->efer & MSR_EFER_LME) { > > + if (!(cr0_old & CR0_PG_MASK) && env->cr[0] & CR0_PG_MASK) { > > + env->efer |= MSR_EFER_LMA; > > + } > > + } > > > > env->pdptrs_valid = sregs.flags & KVM_SREGS2_FLAGS_PDPTRS_VALID; > > > > -- > > 2.25.1 > > > >