On 8/10/22 7:17 PM, Dong, Eddie wrote:
>>>
>>>> However, with KVM + vfio (or whatever is listening on the resamplefd)
>>>> we don't check that the interrupt is still masked in the guest at the
>>>> moment of EOI.
>>>> Resamplefd is notified regardless, so vfio prematurely unmasks the
>>>> host physical IRQ, thus a new (unwanted) physical interrupt is
>>>> generated in the host and queued for injection to the guest."
>>>>
>>>
>>> Emulation of level triggered IRQ is a pain point ☹ I read we need to
>>> emulate the "level" of the IRQ pin (connecting from device to IRQchip,
>>> i.e. ioapic here).
>>> Technically, the guest can change the polarity of vIOAPIC, which will
>>> lead to a new virtual IRQ even w/o host side interrupt.
>>
>> Thanks, interesting point. Do you mean that this behavior (a new vIRQ as a
>> result of polarity change) may already happen with the existing KVM code?
>>
>> It doesn't seem so to me. AFAICT, KVM completely ignores the vIOAPIC polarity
>> bit, in particular it doesn't handle change of the polarity by the guest (i.e.
>> doesn't update the virtual IRR register, and so on), so it shouldn't result in a
>> new interrupt.
>
> Correct, KVM doesn't handle polarity now. Probably because unlikely the commercial OSes
> will change polarity.
>
>>
>> Since commit 100943c54e09 ("kvm: x86: ignore ioapic polarity") there seems to
>> be an assumption that KVM interpretes the IRQ level value as active (asserted)
>> vs inactive (deasserted) rather than high vs low, i.e.
>
> Asserted/deasserted vs. high/low is same to me, though asserted/deasserted hints more for event rather than state.
>
>> the polarity doesn't matter to KVM.
>>
>> So, since both sides (KVM emulating the IOAPIC, and vfio/whatever emulating
>> an external interrupt source) seem to operate on a level of abstraction of
>> "asserted" vs "de-asserted" interrupt state regardless of the polarity, and that
>> seems not a bug but a feature, it seems that we don't need to emulate the IRQ
>> level as such. Or am I missing something?
>
> YES, I know current KVM doesn't handle it. Whether we should support it is another story which I cannot speak for.
> Paolo and Alex are the right person 😊
> The reason I mention this is because the complexity to adding a pending event vs. supporting a interrupt pin state is same.
> I am wondering if we need to revisit it or not. Behavior closing to real hardware helps us to avoid potential issues IMO, but I am fine to either choice.

I guess that would imply revisiting the KVM irqfd interface, since its design
is based on events rather than states, even for level-triggered interrupts:

- trigger event (from vfio to KVM) to assert an IRQ
- resample event (from KVM to vfio) to de-assert an IRQ

>
>>
>> OTOH, I guess this means that the existing KVM's emulation of level-triggered
>> interrupts is somewhat limited (a guest may legitimately expect an interrupt
>> fired as a result of polarity change, and that case is not supported by KVM). But
>> that is rather out of scope of the oneshot interrupts issue addressed by this
>> patchset.
>
> Agree.
> I didn't know any commercial OSes change polarity either. But I know Xen hypervisor uses polarity under certain condition.
> One day, we may see the issue when running Xen as a L1 hypervisor. But this is not the current worry.
>
>
>>
>>> "pending" field of kvm_kernel_irqfd_resampler in patch 3 means more an
>>> event rather than an interrupt level.
>
> I know. I am fine either.
>
> Thanks Eddie
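For illustration only, the event-based flow discussed in this thread (a trigger event asserts the IRQ; on guest EOI a resample event de-asserts it and lets the source unmask the host IRQ and re-check the device) can be sketched as a toy Python model. This is not KVM or vfio code: the class ToyLevelIrq and all of its fields and methods are hypothetical names made up for this sketch, and the source's masking behaviour is a simplification.

```python
from dataclasses import dataclass


@dataclass
class ToyLevelIrq:
    """Toy model of one level-triggered line driven by trigger/resample events."""

    device_wants_service: bool = False  # physical device still asserting its line
    host_irq_masked: bool = False       # source masks the host IRQ after forwarding it
    asserted: bool = False              # virtual line state seen by the emulated irqchip
    injections: int = 0                 # interrupts delivered to the guest so far

    def trigger(self) -> None:
        """Trigger event (source -> irqchip): assert the virtual line."""
        self.host_irq_masked = True
        if not self.asserted:
            self.asserted = True
            self.injections += 1

    def eoi(self) -> None:
        """Guest EOI: de-assert, then send the resample event (irqchip -> source)."""
        self.asserted = False
        self.resample()

    def resample(self) -> None:
        """Source unmasks the host IRQ; a device that is still pending fires again."""
        self.host_irq_masked = False
        if self.device_wants_service:
            self.trigger()


# Hypothetical usage: the device stays pending across the first EOI only.
irq = ToyLevelIrq(device_wants_service=True)
irq.trigger()                     # first interrupt is injected
irq.eoi()                         # not serviced yet, so the line is re-asserted
irq.device_wants_service = False  # guest driver has now serviced the device
irq.eoi()                         # line stays de-asserted
```

Note that the model carries no pin level or polarity at all, only the two events; that is the property of the interface that would have to change if the real pin state were to be emulated.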