Re: [PATCH] kvm: Move kvm_allows_irq0_override() to target-i386

Jan Kiszka <jan.kiszka@xxxxxx> · Sat, 21 Jul 2012 11:44:34 +0200

On 2012-07-21 11:30, Peter Maydell wrote:
> On 21 July 2012 10:14, Jan Kiszka <jan.kiszka@xxxxxx> wrote:
>> On 2012-07-21 10:54, Peter Maydell wrote:
>>> On 21 July 2012 07:57, Jan Kiszka <jan.kiszka@xxxxxx> wrote:
>>>> On 2012-07-20 21:14, Peter Maydell wrote:
>>>>> I'm sure this isn't the only x86ism in the KVM generic source
>>>>> files. However the thing I'm specifically trying to do is
>>>>> nuke all the uses of kvm_irqchip_in_kernel() in common code,
>>>>
>>>> No, "irqchip in kernel" is supposed to be a generic concept. We will
>>>> also have it on Power. Not sure what your plans are for ARM, maybe it
>>>> will always be true there.
>>>
>>> I agree that "irqchip in kernel?" is generic (though as you'll see
>>> below there's disagreement about what that ought to mean or imply).
>>> "irq0_override" though seems to me to be absolutely x86 specific.
>>
>> The naming is x86-specific, the semantics are not. It means that KVM
>> doesn't prevent remapping of IRQs. Granted, I really hope you don't make
>> such mistakes in your arch.
> 
> What does "remapping of IRQs" mean here? This is still sounding

It means that the QEMU model of the board can define interrupt routes in
an unconfined way, which is obviously always true when the irqchips are
all in userspace but not necessarily when KVM support is in the loop.
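To make that condition concrete, here is a minimal standalone sketch. The
struct and the function name are made up for illustration and are not the
actual QEMU code; the only assumption is that the kernel either accepts
user-defined interrupt routes or it doesn't.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant KVM state; not QEMU's KVMState. */
struct irq_caps {
    bool irqchip_in_kernel;  /* irqchips are emulated by KVM */
    bool user_routing;       /* kernel accepts routes defined by userspace */
};

/*
 * The board model may define routes freely if all irqchips live in
 * userspace, or if the in-kernel irqchip takes user-defined routes.
 */
static bool allows_irq_remapping(const struct irq_caps *c)
{
    return !c->irqchip_in_kernel || c->user_routing;
}
```

With everything in userspace the answer is trivially "yes"; it only
becomes "no" for an in-kernel irqchip with fixed routing.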

> not very generic to me, in that I really don't know what it would
> mean in an ARM context. The fact that the only caller of this is
> in hw/pc.c is also a big red flag that this isn't exactly generic.

x86 is also still the only arch with full in-kernel irqchip support. And
even if only one arch is using it, that doesn't mean the test needs to
be moved around, as long as the test itself is generic and simply always
true for other archs.

> 
>>>> That said, maybe there is room for discussion about what it means for
>>>> the general KVM code and its users if the irqchip is in the kernel. Two
>>>> things that should be common for every arch:
>>>>  - VCPU idle management is done inside the kernel
>>>
>>> The trouble is that at the moment QEMU assumes that "is the
>>> irqchip in kernel?" == "is VCPU idle management in kernel", for
>>> instance. For ARM, VCPU idle management is in kernel whether
>>> we're using the kernel's model of the VGIC or not. Alex tells
>>> me PPC is the same way. It's only x86 that has tied these two
>>> concepts together.
>>
>> Hmm, and why does Power work despite this mismatch?
> 
> I think because hw/ppc.c:ppc_set_irq() both calls cpu_interrupt()
> and also kvmppc_set_interrupt(), so we end up with a sort of odd
> mix of both models... Alex?
> 
>> If cpu_thread_is_idle doesn't work for you, define something like
>> kvm_idle_in_kernel() to replace kvm_irqchip_in_kernel here.
> 
> Yes, this is kind of where I'm headed. I thought I'd start with this
> patch as the easiest one first, though...

Before moving anything, let's refine/break up the semantics of
kvm_irqchip_in_kernel first.
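
Roughly what I have in mind, as a standalone sketch only. The caps struct
and the simplified cpu_thread_is_idle() signature are assumptions made for
illustration, and kvm_idle_in_kernel() is just the name suggested above,
nothing that exists yet:

```c
#include <stdbool.h>

/* Hypothetical per-arch capability flags, filled in at KVM init time. */
struct kvm_caps {
    bool irqchip_in_kernel;  /* interrupt controller model lives in KVM */
    bool idle_in_kernel;     /* KVM blocks halted VCPUs itself */
};

static struct kvm_caps caps;

static bool kvm_irqchip_in_kernel(void) { return caps.irqchip_in_kernel; }
static bool kvm_idle_in_kernel(void)    { return caps.idle_in_kernel; }

/*
 * Simplified idle check: common code asks about idle handling directly
 * instead of using kvm_irqchip_in_kernel() as a proxy for it.
 */
static bool cpu_thread_is_idle(bool halted, bool has_work)
{
    if (!halted || has_work || kvm_idle_in_kernel()) {
        return false;
    }
    return true;
}
```

x86 would set both flags together, while an arch like the ARM case
described above could set idle_in_kernel independently of the irqchip.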

> 
>>> The reason I want to get rid of common-code uses of kvm_irqchip_in_kernel()
>>> is because I think they're all similar to this -- the common code is
>>> using the check as a proxy for something else, and it should be directly
>>> asking about that something else. The only bits of code that should
>>> care about "is the irqchip in kernel?" are:
>>>  * target-specific device/machine setup code which needs to know
>>>    which apic/etc to instantiate
>>>  * target-specific x86 code which has this weird synchronous IRQ
>>>    delivery model for irqchip-not-in-kernel
>>> (Obviously I might have missed something, I'm flailing around
>>> trying to understand this code :-))
>>>
>>>>  - in-kernel KVM helpers like vhost or VFIO can inject IRQs directly
>>>>
>>>> The latter point implies that irqfd is available and that interrupt
>>>> routes from virtual IRQs (*) (like the one associated with an irqfd) to
>>>> the in-kernel IRQ controller have to be established. That's pretty generic.
>>>
>>> But you can perfectly well have an in-kernel-irqchip that doesn't
>>> support irqfd
>>
>> You could, though this doesn't make much sense.
> 
> Why doesn't it make sense? On ARM, in-kernel-irqchip means you can take
> advantage of the hardware support for a virtual GIC, and you can use
> the virtual timer support too. These are both big performance advantages
> even if QEMU never does anything with irqfds. (In fact the current
> ARM KVM VGIC code doesn't support irqfds as far as I can see from
> a quick scan of the kernel code.)

It doesn't make sense as it means your in-kernel irqchip model is
semi-finished. If you didn't consider how to support direct in-kernel
IRQ injections, you risk designing something that requires userspace
quirk handling later on when extending it to full-featured in-kernel
irqchip support. Of course, you are free to do this stepwise, but your
digging through QEMU for x86 specifics in the KVM layer should warn you
about the risks. ;)

Jan
