On 06.03.2013, at 13:14, Paolo Bonzini wrote:

>
>>>>>>> So what is the difference between calling this special ioctl
>>>>>>> before creating vcpus and calling create device ioctl instead
>>>>>>> and create QEMU proxy device at whatever point in time QEMU
>>>>>>> wants to create it?
>>>>>>
>>>>>> Because you'd have to stash the handle that KVM_CREATE_DEVICE
>>>>>> returns somewhere, waiting for the QEMU device to be created.
>>>>>
>>>>> OK, we try not to add interfaces for one userspace convenience
>>>>> though. Is this such insurmountable problem for QEMU?
>>>>
>>>> Nothing is insurmountable. However, forcing a particular order
>>>> of device creation is not very nice on userspace. If the
>>>> hypervisor wants to do that, it can do userspace the favor of
>>>> keeping the id in kernel. :)
>>>>
>>>>>> Perhaps it's just a problem of naming, and KVM_CREATE_DEVICE is
>>>>>> simply not the right name for the interface. Once both
>>>>>> KVM_CREATE_IRQCHIP_ARGS and KVM_CREATE_DEVICE are added, it
>>>>>> really will not create the device anymore.
>>>>>> Devices will be created by KVM_CREATE_IRQCHIP_ARGS, and
>>>>>> possibly by KVM_CREATE_VCPU. KVM_CREATE_DEVICE is really only
>>>>>> returning an id.
>>>>>>
>>>>>> So we can have this instead:
>>>>>> - KVM_CREATE_IRQCHIP_ARGS becomes KVM_SET_IRQCHIP_TYPE (and
>>>>>>   "none" can be a valid irqchip type).
>>>>>>
>>>>>> - KVM_CREATE_DEVICE becomes KVM_GET_IRQCHIP_DEVICE, and you
>>>>>>   pass it a device type and possibly a VCPU number.
>>>>>>
>>>>>> It's mostly about names, but one important property is that
>>>>>> KVM_GET_IRQCHIP_DEVICE can be called at any time and, in fact,
>>>>>> multiple times. Gleb, do you like this more?
>>>>>
>>>>> If you put it like this it sounds better (well you've just
>>>>> stashed the handle in kernel for QEMU convenience :)), but
>>>>> you've made the interface irqchips specific again and this is
>>>>> what we are trying to avoid.
>>>>
>>>> Yes, KVM_GET_IRQCHIP_DEVICE is specific to irqchips because
>>>> (following the model of x86) the irqchip type is chosen before
>>>> creating VCPUs. I don't see an alternative unless we stop having
>>>> irqchip as an all-or-nothing choice.
>>>>
>>>> I'm not saying KVM_CREATE_DEVICE is a bad interface, but I'm not
>>>> sure it is really what is needed in this case. KVM_CREATE_DEVICE
>>>> would be perfect as a replacement for KVM_CREATE_PIT2, for
>>>> example. But in this case creating a device is not what we're
>>>> really doing; the creation is done magically by the hypervisor
>>>> by virtue of the previous KVM_CREATE_IRQCHIP_ARGS.
>>>
>>> No, it's not and it shouldn't be. To speak in x86 terms:
>>>
>>> KVM_SET_IRQCHIP_TYPE spawns LAPICs (indirectly, they only get
>>>   spawned on vcpu creation)
>>> KVM_CREATE_DEVICE spawns IOAPICs.
>
> Ok, that makes sense.
>
>> Agree. Lumping up in-kernel LAPIC and IRQCHIPS under one in-kernel
>> irqchip umbrella was a mistake on x86. The one we should not force
>> on others.
>
> Alex, would the PPC patches let you run with in-kernel "LAPICs"
> and userspace "IOAPICs"? If so, the new model would not be a
> problem with QEMU at all.

The split on PPC isn't that clean. The MPIC doesn't split it at all
for example. There we only have an "IOAPIC" without a "LAPIC". So
setting the irqchip type to MPIC would be a nop.

For XICS, we would have something similar to a LAPIC. We would however
have to communicate with that piece to tell it that interrupts are
pending or not. I suppose this might be doable through the ONE_REG
interface that Paul implemented, but I'm not sure.

I don't really think doing such a split makes sense though :).
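
To make the ordering above a bit more concrete, here is a rough sketch
as a toy userspace model in C. It is not kernel code and not a proposed
ABI: every name in it (toy_vm, toy_set_irqchip_type, toy_create_vcpu,
toy_create_device, the IRQCHIP_* enum) is made up for illustration. It
only assumes what was said in this thread: the irqchip type is chosen
before vcpus exist, the per-vcpu "LAPIC"-like piece only appears on
vcpu creation, MPIC has no such piece, and the "IOAPIC"-like device is
created separately.

```c
/* Toy model only: none of these names are real KVM ioctls or structs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VCPUS   4
#define MAX_DEVICES 4

enum irqchip_type { IRQCHIP_NONE, IRQCHIP_APIC, IRQCHIP_MPIC, IRQCHIP_XICS };

struct toy_vm {
    enum irqchip_type type;              /* chosen before any vcpu exists */
    int nr_vcpus;
    bool vcpu_has_local_ctrl[MAX_VCPUS]; /* "LAPIC"-like piece per vcpu */
    int nr_devices;                      /* "IOAPIC"-like devices */
};

/* Assumption from the thread: APIC and XICS have a per-vcpu part,
 * MPIC does not, so selecting MPIC is effectively a nop here. */
static bool type_has_per_vcpu_part(enum irqchip_type t)
{
    return t == IRQCHIP_APIC || t == IRQCHIP_XICS;
}

/* Stand-in for the proposed KVM_SET_IRQCHIP_TYPE: it only records the
 * type and spawns nothing. Rejected once vcpus have been created. */
static int toy_set_irqchip_type(struct toy_vm *vm, enum irqchip_type t)
{
    assert(vm != NULL);
    if (vm->nr_vcpus > 0)
        return -1;
    vm->type = t;
    return 0;
}

/* Stand-in for KVM_CREATE_VCPU: this is where the per-vcpu piece is
 * (indirectly) spawned, if the chosen type has one. Returns the vcpu id. */
static int toy_create_vcpu(struct toy_vm *vm)
{
    assert(vm != NULL);
    if (vm->nr_vcpus >= MAX_VCPUS)
        return -1;
    int id = vm->nr_vcpus++;
    vm->vcpu_has_local_ctrl[id] = type_has_per_vcpu_part(vm->type);
    return id;
}

/* Stand-in for KVM_CREATE_DEVICE: spawns the shared "IOAPIC"-like
 * device. It does not depend on vcpu creation order. Returns a handle. */
static int toy_create_device(struct toy_vm *vm)
{
    assert(vm != NULL);
    if (vm->nr_devices >= MAX_DEVICES)
        return -1;
    return vm->nr_devices++;
}
```

With a zero-initialized struct toy_vm, selecting IRQCHIP_XICS and then
creating a vcpu leaves that vcpu with a local controller, while doing
the same with IRQCHIP_MPIC leaves it without one. In both cases
toy_create_device() can be called before or after the vcpus exist.
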

> The problem would only start if KVM_SET_IRQCHIP_TYPE (new name of
> KVM_CREATE_IRQCHIP_ARGS) forced you to later call KVM_CREATE_DEVICE.

Ah, I see. I don't see why it would. The fact that there is a "LAPIC"
doesn't mean that the per-vcpu SET_INTERRUPT ioctl stops working. So if
SET_IRQCHIP_TYPE(!none) breaks user-space interrupt controller
emulation I would consider that a bug.


Alex

>
> Paolo

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html