Re: [PATCH v3 11/18] dmaengine: idxd: ims setup for the vdcm

Dave Jiang <dave.jiang@xxxxxxxxx> · Thu, 8 Oct 2020 17:27:49 -0700

On 10/8/2020 4:32 PM, Jason Gunthorpe wrote:
On Fri, Oct 09, 2020 at 01:17:38AM +0200, Thomas Gleixner wrote:
Dave,

On Thu, Oct 08 2020 at 09:51, Dave Jiang wrote:
On 10/8/2020 12:39 AM, Thomas Gleixner wrote:
On Wed, Oct 07 2020 at 14:54, Dave Jiang wrote:
On 9/30/2020 12:57 PM, Thomas Gleixner wrote:
Aside from that, this is fiddling in the IMS storage array behind the irq
chip's back without any comment here and a big fat comment about the
shared usage of ims_slot::ctrl in the irq chip driver.

This is to program the pasid fields in the IMS table entry. Was
thinking the pasid fields may be considered device specific so didn't
attempt to add the support to the core code.

Well, the problem is that this is not really irq chip functionality.

But the PASID programming needs to touch the IMS storage which is also
touched by the irq chip.

This might be correct as is, but without a big fat comment explaining
WHY it is safe to do so without any form of serialization this is just
voodoo and unreviewable.

Can you please explain when the PASID is programmed and what the state
of the interrupt is at that point? Is this a one off setup operation or
does this happen dynamically at random points during runtime?

I will put in comments for the function to explain why and when we modify the
pasid field for the IMS entry. Programming of the pasid is done right before we
request irq. And the clearing is done after we free the irq. We will not be
touching the IMS field at runtime. So the touching of the entry should be safe.

Thanks for clarifying that.

Thinking more about it, that very same thing will be needed for any
other IMS device and of course this is not going to end well because
some driver will fiddle with the PASID at the wrong time.

Why? This looks like some quirk of the IDXD HW where it just randomly
put PASID along with the IRQ mask register. Probably because PASID is
not the full 32 bits.

The hardware checks that the PASID in the descriptor matches the PASID in the 
IMS entry, to prevent user-mode software from arbitrarily choosing any interrupt 
vector it wants. User mode software has to request an IMS entry from the kernel 
driver and the driver fills in the PASID in the IMS so that only that process 
can use that IMS entry.
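
To illustrate the check described above, here is a rough user-space model. It
is only a sketch, not the driver or hardware code: the bit layout of the
control word and every name in it (ims_entry_model, ims_model_set_pasid and so
on) are made up for the example. It also simplifies by treating a cleared
PASID field as 0 rather than modelling a separate enable bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of a 32-bit IMS control word, for illustration only. */
#define IMS_CTRL_VECTOR_MASK  (1u << 0)   /* mask bit, owned by the irq chip */
#define IMS_CTRL_PASID_SHIFT  12
#define IMS_CTRL_PASID_MASK   (0xfffffu << IMS_CTRL_PASID_SHIFT) /* 20-bit PASID */

/* Stand-in for one IMS table entry; only the shared control word is modelled. */
struct ims_entry_model {
	uint32_t ctrl;
};

/* Driver side: fill in the PASID before the interrupt is requested. */
static inline void ims_model_set_pasid(struct ims_entry_model *e, uint32_t pasid)
{
	assert(pasid <= 0xfffffu);
	/* Read-modify-write that leaves the mask bit untouched. */
	e->ctrl = (e->ctrl & ~IMS_CTRL_PASID_MASK) |
		  (pasid << IMS_CTRL_PASID_SHIFT);
}

/* Driver side: clear the PASID after the interrupt has been freed. */
static inline void ims_model_clear_pasid(struct ims_entry_model *e)
{
	e->ctrl &= ~IMS_CTRL_PASID_MASK;
}

/* Device side: accept a descriptor only if its PASID matches the entry. */
static inline bool ims_model_pasid_ok(const struct ims_entry_model *e,
				      uint32_t desc_pasid)
{
	uint32_t entry_pasid =
		(e->ctrl & IMS_CTRL_PASID_MASK) >> IMS_CTRL_PASID_SHIFT;

	return entry_pasid == desc_pasid;
}
```

In this model a descriptor submitted with a different PASID fails
ims_model_pasid_ok(), which is the property that keeps one process from
picking another process's interrupt entry.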

AFAIK the PASID is not tagged on the MemWr TLP triggering the
interrupt, so it really is unrelated to the irq.

I think the ioread to get the PASID is rather ugly, it should pluck
the PASID out of some driver specific data structure with proper
locking, and thus use the sleepable version of the irqchip?

This is really not that different from what I was describing for queue
contexts - the queue context needs to be assigned to the irq # before
it can be used in the irq chip, otherwise there is no idea where to
write the msg to. Just like pasid here.

Jason