Re: [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI support for the idxd driver

"Dey, Megha" <megha.dey@xxxxxxxxx> · Wed, 22 Jul 2020 10:31:28 -0700

On 7/21/2020 11:00 AM, Dave Jiang wrote:

On 7/21/2020 9:45 AM, Jason Gunthorpe wrote:
On Tue, Jul 21, 2020 at 09:02:15AM -0700, Dave Jiang wrote:
v2:
IMS (now dev-msi):
With recommendations from Jason/Thomas/Dan on making IMS more generic:
Pass a generic, non-PCI device (struct device) for IMS management
instead of mdev
Remove all references to mdev and symbol_get/put
Remove all references to IMS in common code and replace with dev-msi
Remove dynamic allocation of platform-msi interrupts: no groups, no
new msi list or list helpers
Create a generic dev-msi domain with and without interrupt remapping 
enabled.
Introduce dev_msi_domain_alloc_irqs and dev_msi_domain_free_irqs APIs

I didn't dig into the details of irq handling to really check this,
but the big picture of this is much more in line with what I would
expect for this kind of ability.

Link to previous discussions with Jason:
https://lore.kernel.org/lkml/57296ad1-20fe-caf2-b83f-46d823ca0b5f@xxxxxxxxx/ 

The emulation part that can be moved to user space is very small because
the majority of the emulation consists of control bits that need to
reside in the kernel. We can revisit the necessity of moving the small
emulation part to userspace, and the required architectural changes, at
a later time.

The point here is that you already have a user space interface for
these queues that already has kernel support to twiddle the control
bits. Generally I'd expect extending that existing kernel code to do
the small bit more needed for mapping the queue through to PCI
emulation to be smaller than the 2kloc of new code here to put all the
emulation and support framework in the kernel, and to expose a smaller
attack surface of kernel code to the guest.

The kernel can specify the requirements for these callback functions
(e.g., the driver is not expected to block, or not expected to take
a lock in the callback function).

I didn't notice any of this in the patch series? What is the calling
context for the platform_msi_ops? I think I already mentioned that
ideally we'd need blocking/sleeping. The big selling point is that IMS
allows this data to move off-chip, which means accessing it is no
longer just an atomic write to some on-chip memory.

These details should be documented in the comment on top of
platform_msi_ops.

So the platform_msi_ops are called from the same context as the
existing msi_ops, for instance; we are not adding anything new. I think
the above comment is a little misleading; I will remove it next time around.

Also, I thought even the current write to on-chip memory is not atomic.
Could you let me know which piece of code you are referring to?
Since the driver gets to write to the off-chip memory, shouldn't it be
the driver's responsibility to call it from a sleeping/blocking context?

I'm actually a little confused how idxd_ims_irq_mask() manages this -
I thought IRQ masking should be synchronous, shouldn't there at least
be a flushing read to ensure that new MSIs are stopped and any in flight
are flushed to the APIC?

You are right Jason. It's missing a flushing read.
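
For illustration only, the mask-then-read-back pattern being discussed
could be sketched roughly as below. This is not the idxd code: the entry
layout, the mask bit position, and every name here (ims_entry,
IMS_CTRL_MASK_BIT, reg_write32, reg_read32, ims_entry_mask) are
hypothetical, and plain volatile accesses stand in for the MMIO
accessors a real driver would use (e.g. iowrite32()/ioread32()), so the
sketch compiles in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one IMS entry; the real device layout may differ. */
struct ims_entry {
	volatile uint32_t addr_lo;
	volatile uint32_t addr_hi;
	volatile uint32_t data;
	volatile uint32_t ctrl;
};

/* Assumed position of the per-entry mask bit. */
#define IMS_CTRL_MASK_BIT 0x1u

/* Stand-ins for MMIO accessors (a driver would use iowrite32/ioread32). */
static inline void reg_write32(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
}

static inline uint32_t reg_read32(volatile uint32_t *reg)
{
	return *reg;
}

/*
 * Set the mask bit, then read the register back. On a PCI device the
 * read cannot complete before the earlier posted write has reached the
 * device, so when this returns the mask is in effect there.
 * Returns the control value observed by the read-back.
 */
static uint32_t ims_entry_mask(struct ims_entry *e)
{
	uint32_t ctrl;

	assert(e != NULL);

	ctrl = reg_read32(&e->ctrl);
	reg_write32(&e->ctrl, ctrl | IMS_CTRL_MASK_BIT);

	/* Flushing read: forces the posted write out before we return. */
	return reg_read32(&e->ctrl);
}
```

In user space the read-back is of course a no-op ordering-wise; the
point is only the shape of the sequence (write the mask bit, then read
the same register before returning).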

Jason
