Re: [RFC PATCH 1/6] kvm: add device control API

Alexander Graf <agraf@xxxxxxx> · Mon, 25 Feb 2013 16:29:39 +0100

On 25.02.2013, at 14:09, Gleb Natapov wrote:

> On Mon, Feb 25, 2013 at 12:11:19PM +1100, Paul Mackerras wrote:
>> On Mon, Feb 18, 2013 at 02:21:59PM +0200, Gleb Natapov wrote:
>>> Copying Christoffer since ARM has in kernel irq chip too.
>>> 
>>> On Wed, Feb 13, 2013 at 11:49:15PM -0600, Scott Wood wrote:
>>>> Currently, devices that are emulated inside KVM are configured in a
>>>> hardcoded manner based on an assumption that any given architecture
>>>> only has one way to do it.  If there's any need to access device state,
>>>> it is done through inflexible one-purpose-only IOCTLs (e.g.
>>>> KVM_GET/SET_LAPIC).  Defining new IOCTLs for every little thing is
>>>> cumbersome and depletes a limited numberspace.
>>>> 
>>>> This API provides a mechanism to instantiate a device of a certain
>>>> type, returning an ID that can be used to set/get attributes of the
>>>> device.  Attributes may include configuration parameters (e.g.
>>>> register base address), device state, operational commands, etc.  It
>>>> is similar to the ONE_REG API, except that it acts on devices rather
>>>> than vcpus.
>>> You are not only provide different way to create in kernel irq chip you
>>> also use an alternate way to trigger interrupt lines. Before going into
>>> interface specifics lets think about whether it is really worth it? x86
>>> obviously support old way and will have to for some, very long, time.
>>> ARM vGIC code, that is ready to go upstream, uses old way too. So it will
>>> be 2 archs against one. Christoffer do you think the proposed way it
>>> better for your needs. Are you willing to make vGIC use it?
>> 
>> In fact there have been two distinct interrupt controller emulations
>> for PPC posted recently, Scott's and mine, with quite different
>> interfaces.  In contrast to Scott's device control API, where the
>> operations you would do for different interrupt controllers are quite
>> different, mine attempted to define a much more abstract interface for
>> interrupt controller emulations, with the idea that an interrupt
>> controller subsystem connects a set of interrupt sources, each with
>> some state, to a set of interrupt delivery mechanisms, one per vcpu,
>> also with some state.
>> 
> I agree with Scott that it is futile to try and come up with generic
> irqchip configuration interface and I doubt it is needed from QEMU
> or other userspace pov. I looked at proposed KVM_IRQCHIP_SET_SOURCES
> interface and while it is possible to pass some information about
> pic/ioapic using it there will be a lot of information that will not
> fit there. For one there is global irqchips related state and proposed
> interface only talk about interrupt sources. Another is that using
> generic interface will require us to have a code that translate irqchip
> representation into this generic one and back for no apparent gain.
> Currently pic/ioapic state is very similar to what HW specification
> defines and it is not going to change.
> 
> Looking at your implementation of KVM_IRQCHIP_SET_SOURCES I wounder how
> well it work for migration though. The interface suppose to transfer
> irqchips state as is, but I see things like that in your code:
> 
>                       mutex_lock(&ics->lock);
>                       irqp->server = val & KVM_IRQ_SERVER_MASK;
>                       irqp->priority = val >> KVM_IRQ_PRIORITY_SHIFT;
>                       irqp->resend = 0;
>                       irqp->masked_pending = 0;
>                       irqp->asserted = 0;
> 
> Why it is safe to initialize those values to default values during
> migration? Wouldn't it be simpler and more correct to just transfer
> the whole content of irqp from src to dst?
> 
>> Thus my interface had:
>> 
>> - a KVM_CREATE_IRQCHIP_ARGS ioctl, with an argument structure that
>>  indicates the overall architecture of the interrupt subsystem,
>> 
>> - KVM_IRQCHIP_SET_SOURCES and KVM_IRQCHIP_GET_SOURCES ioctls, that
>>  return or modify the state of some set of interrupt sources
>> 
>> - a KVM_REG_PPC_ICP_STATE identifier in the ONE_REG register
>>  identifier space, that is used to save and restore the per-vcpu
>>  interrupt presentation state.
>> 
>> Alternatively, I could modify my code to use the existing
>> KVM_CREATE_IRQCHIP, KVM_GET_IRQCHIP, and KVM_SET_IRQCHIP ioctls.  If I
>> were to do that I would want to rename the 'pad' field in struct
>> kvm_irqchip to 'nr' and use it with 'chip_id' to identify a range of
>> interrupt sources to be affected.  I'd also want to keep the ONE_REG
>> identifier for the per-vcpu state.
> It is preferable to the interface you propose since I do not think your
> interface fits other interrupt controllers well. You can put nr field
> into dummy[] payload, or replace pad with "union {pad, nr}".
> 
>> 
>> Or I could change over to using Scott's device control API, though I
>> have some objections to doing that.
>> 
>> What is your advice about which interface to use?
>> 
> The ideal situation would be if you and Scott agree on the details about
> the interface. If you don't like something about Scott's interface we
> can discuss it and shape it to something you agree with or even like.
> I already asked Scott to introduce command interface. Scott does not
> care about migration, you do, so you can make sure that interface works
> for that.

I agree. I really want to see one style used for both devices. If we come to the conclusion that we need to go old-style ioctl payload style, then fine. I obviously prefer fine-grained individual state accessors.

But really, even if the "generic interface" only generalizes MPIC and XICS it's already a win for me :).

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html