On Fri, 2022-03-11 at 21:28 +0800, Zeng Guang wrote:
>
> On 3/11/2022 12:26 PM, Sean Christopherson wrote:
> > On Wed, Mar 09, 2022, Maxim Levitsky wrote:
> > > On Wed, 2022-03-09 at 06:01 +0000, Sean Christopherson wrote:
> > > > > Could you share the links?
> > > >
> > > > Doh, sorry (they're both in this one).
> > > >
> > > > https://lore.kernel.org/all/20220301135526.136554-5-mlevitsk@xxxxxxxxxx
> > > >
> > > >
> > >
> > > My opinion on this subject is very simple: we need to draw the line somewhere.
> >
> > ...
> >
> >
> > Since the goal is to simplify KVM, can we try the inhibit route and see what the
> > code looks like before making a decision?  I think it might actually yield a less
> > awful KVM than the readonly approach, especially if the inhibit is "sticky", i.e.
> > we don't try to remove the inhibit on subsequent changes.
> >
> > Killing the VM, as proposed, is very user unfriendly as the user will have no idea
> > why the VM was killed.  WARN is out of the question because this is user triggerable.
> > Returning an emulation error would be ideal, but getting that result up through
> > apic_mmio_write() could be annoying and end up being more complex.
> >
> > The touchpoints will all be the same, unless I'm missing something the difference
> > should only be a call to set an inhibit instead killing the VM.
>
> Introduce an inhibition - APICV_INHIBIT_REASON_APICID_CHG to deactivate
> APICv once KVM guest would try to change APIC ID in xapic mode, and same
> sanity check in KVM_{SET,GET}_LAPIC for live migration. KVM will keep
> alive but obviously lose benefit from hardware acceleration in this way.
>
> So how do you think the proposal like this ?
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6dcccb304775..30d825c069be 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1046,6 +1046,7 @@ struct kvm_x86_msr_filter {
>  #define APICV_INHIBIT_REASON_X2APIC	5
>  #define APICV_INHIBIT_REASON_BLOCKIRQ	6
>  #define APICV_INHIBIT_REASON_ABSENT	7
> +#define APICV_INHIBIT_REASON_APICID_CHG 8
>
>  struct kvm_arch {
>  	unsigned long n_used_mmu_pages;
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 22929b5b3f9b..66cd54fa4515 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2044,10 +2044,19 @@ static int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>
>  	switch (reg) {
>  	case APIC_ID:		/* Local APIC ID */
> -		if (!apic_x2apic_mode(apic))
> -			kvm_apic_set_xapic_id(apic, val >> 24);
> -		else
> +		if (apic_x2apic_mode(apic)) {
>  			ret = 1;
> +			break;
> +		}
> +		/*
> +		 * If changing APIC ID with any APIC acceleration enabled,
> +		 * deactivate APICv to avoid unexpected issues.
> +		 */
> +		if (enable_apicv && (val >> 24) != apic->vcpu->vcpu_id)
> +			kvm_request_apicv_update(apic->vcpu->kvm,
> +				false, APICV_INHIBIT_REASON_APICID_CHG);
> +
> +		kvm_apic_set_xapic_id(apic, val >> 24);
>  		break;
>
>  	case APIC_TASKPRI:
> @@ -2628,11 +2637,19 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
>  static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
>  		struct kvm_lapic_state *s, bool set)
>  {
> -	if (apic_x2apic_mode(vcpu->arch.apic)) {
> -		u32 *id = (u32 *)(s->regs + APIC_ID);
> -		u32 *ldr = (u32 *)(s->regs + APIC_LDR);
> -		u64 icr;
> +	u32 *id = (u32 *)(s->regs + APIC_ID);
> +	u32 *ldr = (u32 *)(s->regs + APIC_LDR);
> +	u64 icr;
> +	if (!apic_x2apic_mode(vcpu->arch.apic)) {
> +		/*
> +		 * If APIC ID changed with any APIC acceleration enabled,
> +		 * deactivate APICv to avoid unexpected issues.
> +		 */
> +		if (enable_apicv && (*id >> 24) != vcpu->vcpu_id)
> +			kvm_request_apicv_update(vcpu->kvm,
> +				false, APICV_INHIBIT_REASON_APICID_CHG);
> +	} else {
>  		if (vcpu->kvm->arch.x2apic_format) {
>  			if (*id != vcpu->vcpu_id)
>  				return -EINVAL;
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 82d56f8055de..f78754bdc1d0 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -931,7 +931,8 @@ bool svm_check_apicv_inhibit_reasons(ulong bit)
>  			  BIT(APICV_INHIBIT_REASON_IRQWIN) |
>  			  BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
>  			  BIT(APICV_INHIBIT_REASON_X2APIC) |
> -			  BIT(APICV_INHIBIT_REASON_BLOCKIRQ);
> +			  BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
> +			  BIT(APICV_INHIBIT_REASON_APICID_CHG);
>
>  	return supported & BIT(bit);
>  }
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7beba7a9f247..91265f0784bd 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7751,7 +7751,8 @@ static bool vmx_check_apicv_inhibit_reasons(ulong bit)
>  	ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) |
>  			  BIT(APICV_INHIBIT_REASON_ABSENT) |
>  			  BIT(APICV_INHIBIT_REASON_HYPERV) |
> -			  BIT(APICV_INHIBIT_REASON_BLOCKIRQ);
> +			  BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
> +			  BIT(APICV_INHIBIT_REASON_APICID_CHG);
>
>  	return supported & BIT(bit);
>  }
>
>
>

This won't work with nested AVIC - we can't just inhibit a nested guest using its own AVIC,
because migration happens.

Best regards,
	Maxim Levitsky
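
For readers following the thread, the "sticky" inhibit semantics discussed above can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions, not KVM code: the two reason numbers are copied from the quoted patch, while `struct toy_vm` and every `toy_*` function are hypothetical stand-ins invented for illustration. It models the bitmask behaviour only and says nothing about the nested AVIC and migration concern raised in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reason numbers as they appear in the quoted patch. */
#define APICV_INHIBIT_REASON_BLOCKIRQ   6
#define APICV_INHIBIT_REASON_APICID_CHG 8

/* Hypothetical stand-in for per-VM state; not the real struct kvm_arch. */
struct toy_vm {
	uint64_t inhibit_reasons; /* one bit per reason; 0 means APICv may run */
};

/* APICv is considered active only while no inhibit reason is set. */
static bool toy_apicv_active(const struct toy_vm *vm)
{
	return vm->inhibit_reasons == 0;
}

/* Clear one reason (activate == true) or set it (activate == false). */
static void toy_request_apicv_update(struct toy_vm *vm, bool activate,
				     unsigned int reason)
{
	assert(reason < 64);
	if (activate)
		vm->inhibit_reasons &= ~(UINT64_C(1) << reason);
	else
		vm->inhibit_reasons |= UINT64_C(1) << reason;
}

/*
 * Model of an xAPIC ID register write: the ID lives in bits 31:24.
 * The inhibit is "sticky": it is set when the ID differs from vcpu_id
 * and is deliberately never cleared, even if the guest restores the ID.
 */
static void toy_set_xapic_id(struct toy_vm *vm, uint32_t vcpu_id,
			     uint32_t reg_val)
{
	if ((reg_val >> 24) != vcpu_id)
		toy_request_apicv_update(vm, false,
					 APICV_INHIBIT_REASON_APICID_CHG);
}
```

In this model, writing a mismatched ID once leaves the APICID_CHG bit set for the lifetime of the `toy_vm`, which is the simplification the sticky variant is meant to buy: no code path ever has to decide when it is safe to lift the inhibit.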