On 4/21/2020 4:54 PM, Jason Gunthorpe wrote:
On Tue, Apr 21, 2020 at 04:33:46PM -0700, Dave Jiang wrote:
The actual code is independent of the stage 2 driver code submission that adds
support for SVM, ENQCMD(S), PASID, and shared workqueues. This code series will
support a dedicated workqueue on a guest with no vIOMMU.
A new device type "mdev" is introduced for the idxd driver. This allows the work
queue (wq) to be dedicated to the usage of a VFIO mediated device (mdev). Once
the wq is enabled, a UUID generated by the user can be added to the wq
through the uuid sysfs attribute for the wq. After the association, an mdev can
be created using this UUID. The mdev driver code will associate the UUID and
set up the mdev on the driver side. When the create operation is successful, the
UUID can be passed to qemu. When the guest boots up, it should discover a DSA
device when doing PCI discovery.
I'm feeling really skeptical that adding all this PCI config space and
MMIO BAR emulation to the kernel just to cram this into a VFIO
interface is a good idea. That kind of stuff is much safer in
userspace.
Particularly since vfio is not really needed once a driver is using
the PASID stuff. We already have general code for drivers to use to
attach a PASID to a mm_struct - and using vfio while disabling all the
DMA/iommu config really seems like an abuse.
A /dev/idxd char dev that mmaps a BAR page and links it to a PASID
seems a lot simpler and saner kernel wise.
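
For illustration only, the userspace side of such a char dev could look
roughly like the sketch below. It assumes a device node that supports
mmap() of a single page at offset 0; the helper names map_portal() and
unmap_portal() are hypothetical, and the PASID binding itself would have
to happen inside the driver, which is not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a device node and map `len` bytes of it.
 * Returns the mapping, or MAP_FAILED on any error. On success the
 * open file descriptor is stored in *fd_out so the caller can close it.
 */
void *map_portal(const char *path, size_t len, int *fd_out)
{
    assert(path != NULL && fd_out != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;

    /* MAP_SHARED so that stores reach the device rather than a private copy. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return MAP_FAILED;
    }

    *fd_out = fd;
    return p;
}

/* Hypothetical helper: undo map_portal(). */
void unmap_portal(void *p, size_t len, int fd)
{
    munmap(p, len);
    close(fd);
}
```

A caller would do something like map_portal("/dev/idxd", 4096, &fd) and
then write descriptors through the returned pointer; the node name and
page size are assumptions for the sake of the example.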
The mdev utilizes Interrupt Message Store or IMS[3] instead of MSIX for
interrupts for the guest. This preserves MSIX for host usages and also allows a
significantly larger number of interrupt vectors for guest usage.
I never did get a reply to my earlier remarks on the IMS patches.
The concept of a device specific addr/data table format for MSI is not
Intel specific. This should be general code. We have a device that can
use this kind of kernel capability today.
<resending to the mailing list, I had incorrect email options set>
Hi Jason,
I am sorry if I did not address your comments earlier.
The present IMS code is quite generic; most of the code is in the
drivers/ folder. We basically introduce two APIs, to allocate and free IMS
interrupts, and an IMS IRQ domain to allocate these interrupts from. These
APIs are architecture agnostic.
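
To make the allocate/free shape concrete, here is a rough userspace toy
model of a device-specific addr/data table. It is not the code from the
patch series: the names ims_table, ims_alloc() and ims_free(), the entry
layout, and the table size of 64 are all assumptions made up for this
sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IMS_MAX_ENTRIES 64  /* hypothetical table size */

/* Hypothetical device-specific entry: an address/data pair plus a mask flag. */
struct ims_entry {
    uint64_t addr;
    uint32_t data;
    uint32_t masked;
};

struct ims_table {
    struct ims_entry entries[IMS_MAX_ENTRIES];
    uint64_t in_use;  /* bit i set means entry i is allocated */
};

void ims_table_init(struct ims_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* Allocate the lowest free entry; returns its index, or -1 if the table is full. */
int ims_alloc(struct ims_table *t, uint64_t addr, uint32_t data)
{
    assert(t != NULL);
    for (int i = 0; i < IMS_MAX_ENTRIES; i++) {
        uint64_t bit = (uint64_t)1 << i;
        if (!(t->in_use & bit)) {
            t->in_use |= bit;
            t->entries[i].addr = addr;
            t->entries[i].data = data;
            t->entries[i].masked = 1;  /* starts masked until the caller enables it */
            return i;
        }
    }
    return -1;
}

/* Free an entry; returns 0 on success, -1 if idx is out of range or not allocated. */
int ims_free(struct ims_table *t, int idx)
{
    assert(t != NULL);
    if (idx < 0 || idx >= IMS_MAX_ENTRIES)
        return -1;
    uint64_t bit = (uint64_t)1 << idx;
    if (!(t->in_use & bit))
        return -1;
    t->in_use &= ~bit;
    memset(&t->entries[idx], 0, sizeof(t->entries[idx]));
    return 0;
}
```

Nothing in this model depends on the architecture; in the real code the
architecture-specific part is the IRQ domain that decides what goes into
the addr/data pair.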
We also introduce a new IMS IRQ domain which is architecture specific.
This is because IMS generates interrupts only in the remappable format,
hence interrupt remapping should be enabled for IMS. Currently, the
interrupt remapping code is only available for Intel and AMD and I don’t
see anything for ARM.
If a new architecture wants to use IMS, it simply needs to introduce
a new IMS IRQ domain. I am not sure if there is any other way around
this. If you have any ideas, please let me know.
Also, could you give more details on the device that could use IMS? Do
you have some driver code already? We could then see if and how the
current IMS code could be made more generic.
Jason