Hi Sean,

> -----Original Message-----
> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Sent: Wednesday, April 5, 2023 8:29 PM
> To: Hou Wenlong <houwenlong.hwl@xxxxxxxxxxxx>
> Cc: kvm@xxxxxxxxxxxxxxx; Paolo Bonzini <pbonzini@xxxxxxxxxx>; Thomas
> Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; Borislav
> Petkov <bp@xxxxxxxxx>; Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>;
> x86@xxxxxxxxxx; H. Peter Anvin <hpa@xxxxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; Kechen Lu <kechenl@xxxxxxxxxx>
> Subject: Re: [PATCH 2/3] KVM: x86: Don't update KVM PV feature CPUID
> during vCPU running
>
> +Kechen
>
> On Thu, Mar 30, 2023, Hou Wenlong wrote:
> > __kvm_update_cpuid_runtime() may be called while the vCPU is running,
> > in which case the KVM PV feature CPUID is updated too, but the cached
> > KVM PV feature bitmap is not. Actually, the KVM PV feature CPUID
> > shouldn't be updated at runtime at all; otherwise, KVM PV features
> > would be broken in the guest. Currently, only KVM_FEATURE_PV_UNHALT
> > is updated, and that can no longer happen now that disabling HLT
> > exits after vCPU creation is disallowed. So update the KVM PV feature
> > CPUID only in the KVM_SET_CPUID{,2} ioctls.
> >
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@xxxxxxxxxxxx>
> > ---
> >  arch/x86/kvm/cpuid.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 6972e0be60fa..af92d3422c79 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -222,6 +222,17 @@ static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcp
> >  					     vcpu->arch.cpuid_nent);
> >  }
> >
> > +static void kvm_update_pv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > +				int nent)
> > +{
> > +	struct kvm_cpuid_entry2 *best;
> > +
> > +	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> > +	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> > +	    (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> > +		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> > +}
> > +
> >  void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
> > @@ -280,11 +291,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
> >  		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
> >  		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> >
> > -	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> > -	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> > -	    (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> > -		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> > -
> >  	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
> >  		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> >  		if (best)
> > @@ -402,6 +408,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> >  	int r;
> >
> >  	__kvm_update_cpuid_runtime(vcpu, e2, nent);
> > +	kvm_update_pv_cpuid(vcpu, e2, nent);
>
> Hrm, this will silently conflict with the proposed per-vCPU controls[*].
> Though arguably that patch is buggy and "needs" to toggle PV_UNHALT
> when userspace messes with HLT passthrough.  But that doesn't really
> make sense either because no guest will react kindly to
> KVM_FEATURE_PV_UNHALT disappearing.

Yes, agreed: toggling PV_UNHALT with the per-vCPU control doesn't make
sense to me either.
And since the PV features are per-VM, having the current per-vCPU
control touch PV feature toggling would probably cause a lot of mess.

>
> I really wish this code didn't exist, i.e. that KVM let/forced userspace
> deal with correctly defining guest CPUID.
>
> Kechen, is it feasible for your userspace to clear PV_UNHALT when it
> (might) use the per-vCPU control?  I.e. can KVM do as this series
> proposes and update guest CPUID only on KVM_SET_CPUID{2}?  Dropping the
> behavior for the per-VM control is probably not an option as I gotta
> assume that'd break userspace, but I would really like to avoid carrying
> that over to the per-vCPU control, which would get quite messy and
> probably can't work anyways.

Yes, in our use cases it's feasible to clear PV_UNHALT when using the
per-vCPU control. I think it makes sense to put the responsibility on
userspace to clear the PV_UNHALT bit when it wants to use the per-vCPU
control for HLT passthrough. We could add a note/requirement after this
line in Documentation/virt/kvm/api.rst:

  "Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits."
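Roughly, the userspace side could look something like the sketch below
(hypothetical and untested; strip_pv_unhalt() and the fd/variable names
are made up for illustration, and it assumes "cpuid" was populated via
KVM_GET_SUPPORTED_CPUID):

#include <sys/ioctl.h>
#include <linux/kvm.h>
#include <asm/kvm_para.h>	/* KVM_CPUID_FEATURES, KVM_FEATURE_PV_UNHALT */

/* Drop PV_UNHALT from the KVM PV feature leaf (0x40000001, EAX). */
static void strip_pv_unhalt(struct kvm_cpuid2 *cpuid)
{
	for (unsigned int i = 0; i < cpuid->nent; i++) {
		struct kvm_cpuid_entry2 *e = &cpuid->entries[i];

		if (e->function == KVM_CPUID_FEATURES)
			e->eax &= ~(1u << KVM_FEATURE_PV_UNHALT);
	}
}

/*
 * Call after disabling HLT exits, e.g. per-VM via KVM_ENABLE_CAP with
 * KVM_CAP_X86_DISABLE_EXITS / KVM_X86_DISABLE_EXITS_HLT (the proposed
 * per-vCPU control would be used analogously).
 */
static int set_cpuid_without_pv_unhalt(int vcpu_fd, struct kvm_cpuid2 *cpuid)
{
	strip_pv_unhalt(cpuid);
	return ioctl(vcpu_fd, KVM_SET_CPUID2, cpuid);
}

The point being that userspace clears the bit before any vCPU runs, so
the guest never observes KVM_FEATURE_PV_UNHALT flipping underneath it.
The api.rst note could then say something along the lines of (wording
just a sketch): "Likewise, when disabling HLT exits with the per-vCPU
control, userspace is responsible for clearing KVM_FEATURE_PV_UNHALT
itself; KVM will not toggle it."

Best Regards,
Kechen

>
> [*] https://lkml.kernel.org/r/20230121020738.2973-6-kechenl%40nvidia.com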