Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC

Alexander Graf <agraf@xxxxxxx> · Fri, 19 Apr 2013 03:09:23 +0200

On 19.04.2013, at 02:50, Scott Wood wrote:

> On 04/18/2013 07:15:46 PM, Alexander Graf wrote:
>> On 18.04.2013, at 23:39, Scott Wood wrote:
>> > On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
>> >> +static inline int irqchip_in_kernel(struct kvm *kvm)
>> >> +{
>> >> +	int ret = 0;
>> >> +
>> >> +#ifdef CONFIG_KVM_MPIC
>> >> +	ret = ret || (kvm->arch.mpic != NULL);
>> >> +#endif
>> >> +	smp_rmb();
>> >> +	return ret;
>> >> +}
>> >
>> > Couldn't we just set a non-irqchip-specific bool?  Though eventually this
>> > shouldn't be needed at all -- instead the check would be "does the
>> > requested irqchip fd exist and refer to something that exposes an irqchip
>> > interface"?
>> How would we get the irqchip id?
> 
> I was thinking it would come from whatever operation you're trying to do, though I see that MSI routing doesn't specify an irqchip. :-P
> 
> In that case I guess it would check for whether any MSI handler has been registered.

I think you're over engineering here :). These are internal interfaces that whoever wants to implement a secondary irqchip can worry about. I merely wanted to make sure we don't block our road for multiple irqchips in the kernel<->user interface from the beginning. Internal ones are a different story :).

> 
>> > The rmb needs documentation.  I don't see a corresponding wmb before
>> > writing to kvm->arch.mpic.
>> I actually just copied it from the x86 code, wondering what the rmb() is supposed to do here. Do we need this at all?
> 
> Well, we don't want to start using the irqchip before it's been fully initialized -- but if we want to use barriers to accomplish that rather than a lock, there needs to be a wmb on the writing side.
> 
>> > How would one create a route for IPIs, timers, etc (we have had customers
>> > wanting to assign those to KVM guests)?  What about error interrupts on
>> > MPIC 4.2?  How are you defining the "pin" field for MPIC?  Shouldn't
>> "pin" is basically what a "source line" is on the MPIC. That's what the equivalent of an IOAPIC interrupt line would be for the MPIC world.
>> IPIs, timer interrupts and error interrupts are special vectors. We could probably model them as different KVM_IRQ_ROUTING_ types if we ever need to support mapping those to irqfd.
> 
> Seems a bit heavyweight to add several new MPIC-specific routing types -- maybe just one new KVM_IRQ_ROUTING type that lets multiple words be used to describe the interrupt?

Well, we can add a single KVM_IRQ_ROUTING_MPIC type if that's better. But we don't have to do it now. I dislike code that we can't test, and I don't have a good test case for user space injected IPIs right now :).

> 
>> For simple injection we can always do an ioctl on the MPIC device.
> 
> I got complaints for that originally. :-)

There are 2 reasons why direct ioctls on the MPIC device could be bad

  1) irqfd
  2) code sharing

I'm not convinced yet we care about performance for MPIC IPI, TMR or ERR interrupts. So irqfd is out of the question there.
Code sharing only makes sense in areas where things are common. In case of the MPIC, this is for SRC interrupts.

So I don't think there should be complaints here :).

> 
>> However, I'd be inclined to say that it's rather unlikely you'd want to have a vfio device's interrupt line hooked up to the IPI interrupt of your guest...
> 
> Well, as I said, we've done it before for a customer (not with VFIO of course, but our old internal-tree device assignment mechanism), so not *that* unlikely.  The host was a two-core chip running Linux on only one of the cores, and a custom OS on the other core.  They wanted to communicate between the custom OS and a KVM guest.  Since host Linux was only running on one core, it didn't need the IPIs for itself (and beginning with e500mc, the host uses msgsnd instead, so there also the host will not need the IPIs for itself).

They could just as well use a guest SRC line for that, no? What the listener on the host connects to on the host's MPIC is pretty much orthogonal to what we inject into the guest.

> 
>> > there be an API documentation update for this?
>> Hrm. I can certainly add one.
> 
> OK.  We've had enough confusion from poor documentation in the device tree binding, due to Freescale documentation calling openpic irq 16 "internal interrupt 0", that we should be clear exactly what it means.
> 
>> > We also need to document whach irqchip means, if we want to reserve the
>> > ability to use it in the future -- otherwise userspace could fill in any
>> > old junk there and we'd need to retain compatibility.  It should probably
>> > be the fd of the MPIC.
>> It can't, because irqchip is an index into an array. We'd have to add another layer of indirection if we want to find the fd to a certain irqchip. That's why I simply restricted it to always be "0" now.
> 
> OK, didn't know how firmly that was baked into the code.  Maybe a device attribute for returning the irqchip number -- which would accommodate devices that expose multiple irqchips.

That would work, yes. We'd have to have 2 mappings available

  irqchip number -> fd
  fd -> irqchip number

The first is just an array inside of kvm->arch. The second is the device attribute. We could probably even live with only the first. That gives us exactly the new layer of indirection I was talking about above.

So we know that it's possible with the API as we have it. It only needs to be extended by an ioctl to declare which fd which irqchip number belongs to. By default, id 0 is mapped to the first created PIC. That way we stay backwards compatible, but allow for multiple irqchips.

> 
>> > It looks like you only support having one type of irqchip built into
>> > the kernel at a time?  That may be OK for now given the existing
>> > restrictions for building KVM, but it should be noted as a
>> > limitation/TODO.
>> I don't think it's really hard to add support for another irqchip type. But we certainly have harder issues to tackle first before we end up in a situation where in-kernel MPIC and in-kernel XICS would make sense in the same kernel.
> 
> Not necessarily hard or hugely important, just looked odd putting global non-MPIC-specific functions in the MPIC file.  I tend to prefer getting the component separation (at least resaonably) correct from the start, so the person who would end up needing to refactor to fit their device in doesn't need to mess with MPIC code to do so.  Such a person might be unfamiliar with MPIC, have no easy way to test, etc.  Similar to the current situation with the IRQ routing code, at least before you took it over. :-)

The only person interested in generalizing that code for now would be Paul. He knows the MPIC quite well, so I doubt he'll have issues with it :).

All it takes really are simple multiplexing functions in front of the existing callbacks that call either MPIC or XICS code depending on the target irqchip.

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html