Re: [PATCH 15/17] KVM: PPC: Support irq routing and irqfd for in-kernel MPIC

Scott Wood <scottwood@xxxxxxxxxxxxx> · Thu, 18 Apr 2013 20:37:17 -0500

On 04/18/2013 08:09:23 PM, Alexander Graf wrote:

On 19.04.2013, at 02:50, Scott Wood wrote:

> On 04/18/2013 07:15:46 PM, Alexander Graf wrote:
>> On 18.04.2013, at 23:39, Scott Wood wrote:
>> > On 04/18/2013 09:11:55 AM, Alexander Graf wrote:
>> >> +static inline int irqchip_in_kernel(struct kvm *kvm)
>> >> +{
>> >> +	int ret = 0;
>> >> +
>> >> +#ifdef CONFIG_KVM_MPIC
>> >> +	ret = ret || (kvm->arch.mpic != NULL);
>> >> +#endif
>> >> +	smp_rmb();
>> >> +	return ret;
>> >> +}
>> >
>> > Couldn't we just set a non-irqchip-specific bool?  Though  
eventually this
>> > shouldn't be needed at all -- instead the check would be "does  
the
>> > requested irqchip fd exist and refer to something that exposes  
an irqchip
>> > interface"?
>> How would we get the irqchip id?
>
> I was thinking it would come from whatever operation you're trying  
to do, though I see that MSI routing doesn't specify an irqchip. :-P
>
> In that case I guess it would check for whether any MSI handler has  
been registered.

I think you're over engineering here :). These are internal  
interfaces that whoever wants to implement a secondary irqchip can  
worry about. I merely wanted to make sure we don't block our road for  
multiple irqchips in the kernel<->user interface from the beginning.  
Internal ones are a different story :).

Well, I did say "eventually".  My more immediate reaction was to seeing  
"MPIC" in a place it doesn't need to be, hence the question about a  
simple bool.

> Seems a bit heavyweight to add several new MPIC-specific routing  
types -- maybe just one new KVM_IRQ_ROUTING type that lets multiple  
words be used to describe the interrupt?

Well, we can add a single KVM_IRQ_ROUTING_MPIC type if that's better.  
But we don't have to do it now. I dislike code that we can't test,  
and I don't have a good test case for user space injected IPIs right  
now :).

Sure, it doesn't need to be now, as long as it's clear we have the  
extensibility to do it if/when the need arises.  It likely will arise  
at least for error interrupts.

I'd rather see it not have MPIC in the name, though.  It's not  
MPIC-specific, just a new version of the struct with additional words  
for "pin".

>> For simple injection we can always do an ioctl on the MPIC device.
>
> I got complaints for that originally. :-)

There are 2 reasons why direct ioctls on the MPIC device could be bad

  1) irqfd
  2) code sharing

I'm not convinced yet we care about performance for MPIC IPI, TMR or  
ERR interrupts. So irqfd is out of the question there.

It's out of the question because you're not yet convinced? :-)

I think I'm done being surprised by what users try to do (though they  
may prove me wrong...).  A different user complained about our non-KVM  
hypervisor because we didn't give guests direct access to the MPIC  
timers and IPIs, even though we provided adequate hypervisor-emulated  
mechanisms -- their application wanted to use hundreds of thousands of  
timers per second.  They would not listen when we suggested that  
perhaps they could rearchitect things to not be as reliant on timers.   
We ended up implementing direct MPIC timer and IPI access there as well.

As for error interrupts, no, they're not performance-critical, but we  
still probably want to use irqfd if we're using it for everything  
else.  Even if the code for userspace reflection is already there, do  
you really want errors to be the one thing using a special IRQ path  
that's rarely tested?  Beyond the extent to which it's necessary  
because of the specialness of the error interrupts in the hardware  
design.

Code sharing only makes sense in areas where things are common. In  
case of the MPIC, this is for SRC interrupts.

So I don't think there should be complaints here :).

If you ignore irqfd (which was covered in #1), there isn't much that is  
really common even for "SRC" interrupts (and what is common is  
artificially so), and I made that point, but still got complaints.  I  
was the one to point out that irqfd was the real reason to need to use  
the routing interface (and thus KVM_IRQ_LINE); they were more hung up  
on "why are you being different?".

>> However, I'd be inclined to say that it's rather unlikely you'd  
want to have a vfio device's interrupt line hooked up to the IPI  
interrupt of your guest...
>
> Well, as I said, we've done it before for a customer (not with VFIO  
of course, but our old internal-tree device assignment mechanism), so  
not *that* unlikely.  The host was a two-core chip running Linux on  
only one of the cores, and a custom OS on the other core.  They  
wanted to communicate between the custom OS and a KVM guest.  Since  
host Linux was only running on one core, it didn't need the IPIs for  
itself (and beginning with e500mc, the host uses msgsnd instead, so  
there also the host will not need the IPIs for itself).

They could just as well use a guest SRC line for that, no? What the  
listener on the host connects to on the host's MPIC is pretty much  
orthogonal to what we inject into the guest.

What, and make more modifications to their custom OS code? :-)

-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html