Hi Sean,

> -----Original Message-----
> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Sent: Wednesday, April 5, 2023 8:29 PM
> To: Hou Wenlong <houwenlong.hwl@xxxxxxxxxxxx>
> Cc: kvm@xxxxxxxxxxxxxxx; Paolo Bonzini <pbonzini@xxxxxxxxxx>; Thomas
> Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; Borislav
> Petkov <bp@xxxxxxxxx>; Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>;
> x86@xxxxxxxxxx; H. Peter Anvin <hpa@xxxxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; Kechen Lu <kechenl@xxxxxxxxxx>
> Subject: Re: [PATCH 2/3] KVM: x86: Don't update KVM PV feature CPUID
> during vCPU running
>
> +Kechen
>
> On Thu, Mar 30, 2023, Hou Wenlong wrote:
> > __kvm_update_cpuid_runtime() may be called while the vCPU is running,
> > in which case the KVM PV feature CPUID is updated too, but the cached
> > KVM PV feature bitmap is not. Actually, the KVM PV feature CPUID
> > shouldn't be updated at runtime at all; otherwise, KVM PV features
> > would be broken in the guest. Currently, only KVM_FEATURE_PV_UNHALT
> > is updated, and that can no longer happen now that disabling HLT
> > exits after vCPU creation is disallowed. So update the KVM PV feature
> > CPUID only in the KVM_SET_CPUID{,2} ioctls.
> >
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@xxxxxxxxxxxx>
> > ---
> >  arch/x86/kvm/cpuid.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 6972e0be60fa..af92d3422c79 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -222,6 +222,17 @@ static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcp
> >  					     vcpu->arch.cpuid_nent);
> >  }
> >
> > +static void kvm_update_pv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > +				int nent)
> > +{
> > +	struct kvm_cpuid_entry2 *best;
> > +
> > +	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> > +	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> > +	    (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> > +		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> > +}
> > +
> >  void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
> > @@ -280,11 +291,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
> >  		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
> >  		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> >
> > -	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> > -	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> > -	    (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> > -		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> > -
> >  	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
> >  		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> >  		if (best)
> > @@ -402,6 +408,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> >  	int r;
> >
> >  	__kvm_update_cpuid_runtime(vcpu, e2, nent);
> > +	kvm_update_pv_cpuid(vcpu, e2, nent);
>
> Hrm, this will silently conflict with the proposed per-vCPU controls[*].
> Though arguably that patch is buggy and "needs" to toggle PV_UNHALT
> when userspace messes with HLT passthrough.  But that doesn't really
> make sense either because no guest will react kindly to
> KVM_FEATURE_PV_UNHALT disappearing.

Yes, agreed: toggling PV_UNHALT with the per-vCPU control doesn't make
sense to me either.
And since the PV features are per-VM, having the current per-vCPU
control touch PV feature toggling would probably cause a lot of mess.

>
> I really wish this code didn't exist, i.e. that KVM let/forced userspace
> deal with correctly defining guest CPUID.
>
> Kechen, is it feasible for your userspace to clear PV_UNHALT when it
> (might) use the per-vCPU control?  I.e. can KVM do as this series
> proposes and update guest CPUID only on KVM_SET_CPUID{2}?  Dropping the
> behavior for the per-VM control is probably not an option as I gotta
> assume that'd break userspace, but I would really like to avoid carrying
> that over to the per-vCPU control, which would get quite messy and
> probably can't work anyways.

Yes, in our use cases it's feasible to clear PV_UNHALT when using the
per-vCPU control. I think it makes sense to put the responsibility on
userspace to clear the PV_UNHALT bit when it wants to use the per-vCPU
control for HLT passthrough. We could add a note/requirement after this
line in Documentation/virt/kvm/api.rst:

  "Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits."
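Roughly, the userspace side could look something like the sketch below
(hypothetical and untested; strip_pv_unhalt() and the fd/variable names
are made up for illustration, and it assumes "cpuid" was populated via
KVM_GET_SUPPORTED_CPUID):

#include <sys/ioctl.h>
#include <linux/kvm.h>
#include <asm/kvm_para.h>	/* KVM_CPUID_FEATURES, KVM_FEATURE_PV_UNHALT */

/* Drop PV_UNHALT from the KVM PV feature leaf (0x40000001, EAX). */
static void strip_pv_unhalt(struct kvm_cpuid2 *cpuid)
{
	for (unsigned int i = 0; i < cpuid->nent; i++) {
		struct kvm_cpuid_entry2 *e = &cpuid->entries[i];

		if (e->function == KVM_CPUID_FEATURES)
			e->eax &= ~(1u << KVM_FEATURE_PV_UNHALT);
	}
}

/*
 * Call after disabling HLT exits, e.g. per-VM via KVM_ENABLE_CAP with
 * KVM_CAP_X86_DISABLE_EXITS / KVM_X86_DISABLE_EXITS_HLT (the proposed
 * per-vCPU control would be used analogously).
 */
static int set_cpuid_without_pv_unhalt(int vcpu_fd, struct kvm_cpuid2 *cpuid)
{
	strip_pv_unhalt(cpuid);
	return ioctl(vcpu_fd, KVM_SET_CPUID2, cpuid);
}

The point being that userspace clears the bit before any vCPU runs, so
the guest never observes KVM_FEATURE_PV_UNHALT flipping underneath it.
The api.rst note could then say something along the lines of (wording
just a sketch): "Likewise, when disabling HLT exits with the per-vCPU
control, userspace is responsible for clearing KVM_FEATURE_PV_UNHALT
itself; KVM will not toggle it."

Best Regards,
Kechen

>
> [*] https://lkml.kernel.org/r/20230121020738.2973-6-kechenl%40nvidia.com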