On Tue, Dec 06, 2022 at 07:26:04AM +0100, Christoph Hellwig wrote:

> all here). In Linux the equivalent would be to implement a mdev driver
> that allows passing through the I/O queues to a guest, but it might

Definitely not - "mdev" drivers should be avoided as much as possible.

In this case Intel has a real PCI SR-IOV VF to expose to the guest, with
a full VF RID. The proper VFIO abstraction is the variant PCI driver, as
this series does.

We want to use variant PCI drivers because they properly encapsulate all
the PCI behaviors (MSI, config space, regions, reset, etc.) without
requiring re-implementation of them in mdev drivers.

mdev drivers should only be considered if a real PCI VF is not
available - e.g. because the device is doing "SIOV" or something
similar.

We now have several migration drivers in VFIO following this general
pattern, and from what I can see they have done it broadly properly from
a VFIO perspective.

> be a better idea to handle the device model emulation entirely in
> Qemu (or other userspace device models) and just find a way to expose
> enough of the I/O queues to userspace.

This is much closer to the VDPA model, which is basically some kernel
support to access the I/O queues plus a lot of software in qemu to
generate the PCI device in the VM.

Each approach has positives and negatives. We have done both in mlx5
devices and we have a preference for the VFIO model. VDPA specifically
is very big and complicated compared to the VFIO approach.

Overall, having fully functional PCI SR-IOV VFs available lets more use
cases work than just "qemu to create a VM". qemu can always build a
VDPA-like thing by using VFIO and VFIO live migration to shift control
of the device between qemu and HW.

I don't think we know enough about this space at the moment to fix a
specification to one path or the other, so I hope the TPAR will settle
on something that can support both models in SW and people can try
things out.

Jason
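
P.S. For anyone less familiar with the pattern under discussion: a
variant PCI driver is a thin layer over the shared vfio-pci core code.
Below is a rough sketch based on the vfio_pci_core helper API that the
existing mlx5 driver uses. The "myvf" names and the PCI IDs are invented
for illustration, and a real migration driver would additionally
register its vfio_migration_ops; this is a shape, not a working driver.

/*
 * Hypothetical skeleton of a VFIO variant PCI driver, loosely modeled
 * on drivers/vfio/pci/mlx5/main.c. All "myvf" names and the 0x1234/
 * 0x5678 IDs are made up for illustration.
 */
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/vfio_pci_core.h>

static int myvf_pci_open_device(struct vfio_device *core_vdev)
{
        struct vfio_pci_core_device *vdev = container_of(
                core_vdev, struct vfio_pci_core_device, vdev);
        int ret;

        ret = vfio_pci_core_enable(vdev);
        if (ret)
                return ret;

        /* Device-specific setup (e.g. migration state) would go here */

        vfio_pci_core_finish_enable(vdev);
        return 0;
}

/*
 * Everything generic - config space, regions, MSI, reset, mmap - is
 * delegated to the shared vfio-pci core, which is the point being made
 * above about encapsulating the PCI behaviors.
 */
static const struct vfio_device_ops myvf_pci_ops = {
        .name = "myvf-vfio-pci",
        .init = vfio_pci_core_init_dev,
        .release = vfio_pci_core_release_dev,
        .open_device = myvf_pci_open_device,
        .close_device = vfio_pci_core_close_device,
        .ioctl = vfio_pci_core_ioctl,
        .device_feature = vfio_pci_core_ioctl_feature,
        .read = vfio_pci_core_read,
        .write = vfio_pci_core_write,
        .mmap = vfio_pci_core_mmap,
        .request = vfio_pci_core_request,
        .match = vfio_pci_core_match,
};

static int myvf_pci_probe(struct pci_dev *pdev,
                          const struct pci_device_id *id)
{
        struct vfio_pci_core_device *vdev;
        int ret;

        vdev = vfio_alloc_device(vfio_pci_core_device, vdev, &pdev->dev,
                                 &myvf_pci_ops);
        if (IS_ERR(vdev))
                return PTR_ERR(vdev);

        dev_set_drvdata(&pdev->dev, vdev);
        ret = vfio_pci_core_register_device(vdev);
        if (ret) {
                vfio_put_device(&vdev->vdev);
                return ret;
        }
        return 0;
}

static void myvf_pci_remove(struct pci_dev *pdev)
{
        struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);

        vfio_pci_core_unregister_device(vdev);
        vfio_put_device(&vdev->vdev);
}

static const struct pci_device_id myvf_pci_table[] = {
        /* Hypothetical VF IDs; binds only when userspace overrides */
        { PCI_DRIVER_OVERRIDE_DEVICE_VFIO(0x1234, 0x5678) },
        {}
};
MODULE_DEVICE_TABLE(pci, myvf_pci_table);

static struct pci_driver myvf_pci_driver = {
        .name = "myvf-vfio-pci",
        .id_table = myvf_pci_table,
        .probe = myvf_pci_probe,
        .remove = myvf_pci_remove,
        .driver_managed_dma = true,
};
module_pci_driver(myvf_pci_driver);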