Re: [PATCH 3/3] KVM: x86: Decouple APICv activation state from apicv_inhibit_reasons

Naveen N Rao <naveen@xxxxxxxxxx> · Tue, 11 Feb 2025 20:07:02 +0530

On Tue, Feb 04, 2025 at 03:08:57PM +0100, Paolo Bonzini wrote:
> On Tue, Feb 4, 2025 at 12:15 PM Naveen N Rao <naveen@xxxxxxxxxx> wrote:
> > As a separate change, I have been testing a patch that moves the
> > PIT_REINJ inhibit from PIT creation to the point at which the guest
> > actually programs it so that default guest configurations can utilize
> > AVIC:
> 
> In-kernel PIC and PIT is sort of a legacy setup; the so-called
> "split irqchip" (LAPIC in KVM, PIC/PIT in userspace) is strongly
> preferred.  So I don't think it's particularly important to cater
> for PIT_REINJ.

Sure, though it would be nice if we can enable AVIC to function in wider 
configurations especially if the guest doesn't use the PIT :)

> 
> > If it is, or if we choose to delay PIT_REINJ inhibit to vcpu creation time,
> > then making PT_REINJ or IRQWIN inhibits sticky will prevent AVIC from being
> > enabled later on. I can see in my tests that BIOS (both seabios and edk2)
> > programs the PIT though Linux guest itself doesn't (unless -no-hpet is used).
> 
> Even with -no-hpet, Linux should turn off the PIT relatively soon
> and only rely on the local APIC's timer.

I am not seeing that. With -no-hpet, I see that the guest continues to 
use the PIT, as well as the local APIC timer.

> 
> > You're right -- APICv isn't actually being toggled, but IRQWIN inhibit is
> > constantly being set and cleared while trying to inject device interrupts into
> > the guests. The places where we set/clear IRQWIN inhibit has comments
> > indicating that it is only required for ExtINT, though that's not actually the
> > case here.
> >
> > What is actually happening is that since the PIT is in reinject mode, APICv is
> > not active in the guest. When that happens, kvm_cpu_has_injectable_intr()
> > returns true when any interrupt is pending:
> >
> >     /*
> >      * check if there is injectable interrupt:
> >      * when virtual interrupt delivery enabled,
> >      * interrupt from apic will handled by hardware,
> >      * we don't need to check it here.
> >      */
> >     int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v)
> >     {
> >             if (kvm_cpu_has_extint(v))
> >                     return 1;
> >
> >             if (!is_guest_mode(v) && kvm_vcpu_apicv_active(v))
> >                     return 0;
> >
> >             return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
> >     }
> >
> > The second if condition fails since APICv is not active. So,
> > kvm_check_and_inject_events() calls enable_irq_window() to request for an IRQ
> > window to inject those interrupts.
> 
> Ok, that's due solely to the presence of *another* active inhibit.
> Since sticky inhibits cannot work, making the IRQWIN inhibit per-CPU
> will still cause vCPUs to pound on the apicv_update_lock, but only on
> the read side of the rwsem so that should be more tolerable.
> 
> Using atomics is considerably more complicated and I'd rather avoid it.

Understood, thanks!

- Naveen