Re: in-kernel interrupt controller steering

On 06.03.2013, at 13:14, Paolo Bonzini wrote:

> 
>>>>>>> So what is the difference between calling this special ioctl
>>>>>>> before
>>>>>>> creating vcpus and calling create device ioctl instead and
>>>>>>> create
>>>>>>> QEMU proxy device at whatever point in time QEMU wants to
>>>>>>> create
>>>>>>> it?
>>>>>> 
>>>>>> Because you'd have to stash the handle that KVM_CREATE_DEVICE
>>>>>> returns somewhere, waiting for the QEMU device to be created.
>>>>> 
>>>>> OK, we try not to add interfaces for one userspace convenience
>>>>> though. Is this such insurmountable problem for QEMU?
>>>> 
>>>> Nothing is insurmountable.  However, forcing a particular order
>>>> of device creation is not very nice on userspace.  If the
>>>> hypervisor
>>>> wants to do that, it can do userspace the favor of keeping the id
>>>> in kernel.  :)
>>>> 
>>>>>> Perhaps it's just a problem of naming, and KVM_CREATE_DEVICE is
>>>>>> simply
>>>>>> not the right name for the interface.  Once both
>>>>>> KVM_CREATE_IRQCHIP_ARGS
>>>>>> and KVM_CREATE_DEVICE are added, it really will not create the
>>>>>> device anymore.
>>>>>> Devices will be created by KVM_CREATE_IRQCHIP_ARGS, and
>>>>>> possibly by
>>>>>> KVM_CREATE_VCPU.  KVM_CREATE_DEVICE is really only returning an
>>>>>> id.
>>>>>> 
>>>>>> So we can have this instead:
>>>>>> - KVM_CREATE_IRQCHIP_ARGS becomes KVM_SET_IRQCHIP_TYPE (and
>>>>>> "none"
>>>>>> can be a valid irqchip type).
>>>>>> 
>>>>>> - KVM_CREATE_DEVICE becomes KVM_GET_IRQCHIP_DEVICE, and you
>>>>>> pass it
>>>>>> a device type and possibly a VCPU number.
>>>>>> 
>>>>>> It's mostly about names, but one important property is that
>>>>>> KVM_GET_IRQCHIP_DEVICE can be called at any time and, in fact,
>>>>>> multiple times.  Gleb, do you like this more?
>>>>> 
>>>>> If you put it like this it sounds better (well you've just
>>>>> stashed
>>>>> the handle in kernel for QEMU convenience :)), but you've made
>>>>> the
>>>>> interface irqchips specific again and this is what we are trying
>>>>> to avoid.
>>>> 
>>>> Yes, KVM_GET_IRQCHIP_DEVICE is specific to irqchips because
>>>> (following
>>>> the model of x86) the irqchip type is chosen before creating
>>>> VCPUs.
>>>> I don't see an alternative unless we stop having irqchip as an
>>>> all-or-nothing choice.
>>>> 
>>>> I'm not saying KVM_CREATE_DEVICE is a bad interface, but I'm not
>>>> sure it is really what is needed in this case.  KVM_CREATE_DEVICE
>>>> would be perfect as a replacement for KVM_CREATE_PIT2, for
>>>> example.
>>>> But in this case creating a device is not what we're really
>>>> doing;
>>>> the creation is done magically by the hypervisor by virtue of
>>>> the previous KVM_CREATE_IRQCHIP_ARGS.
>>> 
>>> No, it's not and it shouldn't be. To speak in x86 terms:
>>> 
>>>  KVM_SET_IRQCHIP_TYPE spawns LAPICs (indirectly, they only get
>>>  spawned on vcpu creation)
>>>  KVM_CREATE_DEVICE spawns IOAPICs.
> 
> Ok, that makes sense.
> 
>> Agree. Lumping up in-kernel LAPIC and IRQCHIPS under one in-kernel
>> irqchip umbrella was a mistake on x86. The one we should not force on
>> others.
> 
> Alex, would the PPC patches let you run with in-kernel "LAPICs"
> and userspace "IOAPICs"?  If so, the new model would not be a
> problem with QEMU at all.

The split on PPC isn't that clean. The MPIC, for example, isn't split at all: there we only have an "IOAPIC" without a "LAPIC", so setting the irqchip type to MPIC would be a nop.

For XICS, we would have something similar to a LAPIC. We would, however, have to communicate with that piece to tell it whether interrupts are pending. I suppose this might be doable through the ONE_REG interface that Paul implemented, but I'm not sure.

I don't really think doing such a split makes sense though :).
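
To make the comparison concrete, here is a rough toy model (plain Python, not kernel code) of which irqchip types would have a per-vcpu "LAPIC"-like piece. The table, the class, and the helper name are all made up for illustration; none of them are part of any real or proposed KVM ABI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IrqchipType:
    """Hypothetical description of an irqchip type, for illustration only."""
    name: str
    per_vcpu_part: bool  # True if a "LAPIC"-like piece comes with each vcpu


# Assumed table, following the thread: x86 has LAPICs, MPIC has no
# per-vcpu piece, XICS has something similar to a LAPIC.
TYPES = {
    "none": IrqchipType("none", False),
    "apic": IrqchipType("apic", True),
    "mpic": IrqchipType("mpic", False),
    "xics": IrqchipType("xics", True),
}


def per_vcpu_parts_spawned(type_name: str, n_vcpus: int) -> int:
    """How many per-vcpu controller pieces vcpu creation would spawn."""
    return n_vcpus if TYPES[type_name].per_vcpu_part else 0
```

In this model, choosing "mpic" changes nothing at vcpu creation time, which is what I mean by it being a nop.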

> The problem would only start if KVM_SET_IRQCHIP_TYPE (new name of
> KVM_CREATE_IRQCHIP_ARGS) forced you to later call KVM_CREATE_DEVICE.

Ah, I see. I don't see why it would. The fact that there is a "LAPIC" doesn't mean that the per-vcpu SET_INTERRUPT ioctl stops working. So if SET_IRQCHIP_TYPE(!none) breaks user-space interrupt controller emulation, I would consider that a bug.
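
The behaviour I would expect can be sketched as a toy model (plain Python, not kernel code). The `Vm` and `Vcpu` classes and their method names are hypothetical stand-ins for the ioctls discussed here, not a real interface. The only points it illustrates are that the irqchip type is chosen before vcpus exist, and that per-vcpu interrupt injection is not gated on that choice.

```python
class Vcpu:
    """Hypothetical vcpu; models only the externally driven interrupt line."""

    def __init__(self, has_kernel_lapic: bool):
        self.has_kernel_lapic = has_kernel_lapic
        self.external_pending = False

    def set_interrupt(self, asserted: bool) -> None:
        # Stand-in for the per-vcpu SET_INTERRUPT ioctl. Deliberately not
        # conditional on has_kernel_lapic: user-space interrupt controller
        # emulation must keep working either way.
        self.external_pending = asserted


class Vm:
    """Hypothetical VM; models only the ordering rule from the thread."""

    def __init__(self):
        self.irqchip_type = "none"
        self.vcpus = []

    def set_irqchip_type(self, irqchip_type: str) -> None:
        # Stand-in for the proposed SET_IRQCHIP_TYPE: must precede vcpus.
        if self.vcpus:
            raise RuntimeError("irqchip type must be set before creating vcpus")
        self.irqchip_type = irqchip_type

    def create_vcpu(self) -> Vcpu:
        # Simplification: any type other than "none" yields a "LAPIC".
        vcpu = Vcpu(has_kernel_lapic=(self.irqchip_type != "none"))
        self.vcpus.append(vcpu)
        return vcpu
```

A VM set up with a non-"none" type would still accept `set_interrupt(True)` on its vcpus; if it did not, that would be the bug I am describing.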


Alex

> 
> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
