On Wed, Apr 21, 2021 at 10:54:51AM -0600, Alex Williamson wrote:

> That's essentially replacing vfio-core, where I think we're more

I am only talking about /dev/vfio here which is basically the IOMMU
interface part.

I still expect that VFIO_GROUP_SET_CONTAINER will be used to connect
/dev/{ioasid,vfio} to the VFIO group and all the group and device
logic stays inside VFIO.

The appeal of unifying /dev/{ioasid,vfio} to a single fops is that it
cleans up vfio a lot - we don't have to have two different code paths
where one handles a vfio_container and the other an ioasid_container
and all the related different iommu ops and so on.

Everything can be switched to ioasid_container all down the line. If
it wasn't for PPC this looks fairly simple.

Since getting rid of PPC looks a bit hard, we'd be stuck with
accepting a /dev/ioasid and then immediately wrapping it in a
vfio_container and shimming it through a vfio_iommu_ops. It is not
ideal at all, but in my look around I don't see a major problem if
the type1 implementation is moved to live under /dev/ioasid.

For concreteness if we look at the set container flow with ioasid I'd
say something like:

  vfio_group_fops_unl_ioctl()
    VFIO_GROUP_SET_CONTAINER
      vfio_group_set_container()
        if (f.file->f_op == &vfio_fops) {
            // Use a real vfio_container and vfio_iommu_driver
            driver->ops->attach_group()
              tce_iommu_attach_group()
        }
        if (ioasid_container = ioasid_get_from_fd(container_fd)) {
            // create a dummy vfio_container and use the ioasid driver
            container = kzalloc()
            container->iommu_driver = ioasid_shim
            driver->ops->attach_group()
              ioasid_shim_attach_group(ioasid_container, ...)
                ioasid_attach_group()
                  // What used to be vfio_iommu_attach_group()
        }

Broadly all the ops vfio needs go through the ioasid_shim which relays
them to the generic ioasid API.

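To make the shim relay concrete, here is a small self-contained
userspace model in C. It is only a sketch under loud assumptions: the
*_model structs, ioasid_shim_ops, group_set_container() and the
simplified signatures are invented for illustration and do not match
any kernel definition. It shows nothing more than a dummy container
whose ops table forwards attach/detach to ioasid_* entry points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed ioasid container. */
struct ioasid_container {
    int attached_groups;
};

/* Models the shape of a vfio iommu driver ops table (simplified). */
struct iommu_ops_model {
    int (*attach_group)(void *iommu_data, int group_id);
    void (*detach_group)(void *iommu_data, int group_id);
};

/* Models the dummy vfio_container that wraps an ioasid_container. */
struct vfio_container_model {
    const struct iommu_ops_model *ops;
    void *iommu_data;
};

/* Generic ioasid API: stands in for what used to be the type1 logic. */
static int ioasid_attach_group(struct ioasid_container *c, int group_id)
{
    (void)group_id;
    c->attached_groups++;
    return 0;
}

static void ioasid_detach_group(struct ioasid_container *c, int group_id)
{
    (void)group_id;
    c->attached_groups--;
}

/* Shim: adapts the ops signatures and relays to the ioasid API. */
static int ioasid_shim_attach_group(void *iommu_data, int group_id)
{
    return ioasid_attach_group((struct ioasid_container *)iommu_data,
                               group_id);
}

static void ioasid_shim_detach_group(void *iommu_data, int group_id)
{
    ioasid_detach_group((struct ioasid_container *)iommu_data, group_id);
}

static const struct iommu_ops_model ioasid_shim_ops = {
    .attach_group = ioasid_shim_attach_group,
    .detach_group = ioasid_shim_detach_group,
};

/* Models the ioasid branch of the set container flow: fill in the
 * dummy container, then attach the group through the shim ops. */
static int group_set_container(struct vfio_container_model *container,
                               struct ioasid_container *ioasid,
                               int group_id)
{
    container->ops = &ioasid_shim_ops;
    container->iommu_data = ioasid;
    return container->ops->attach_group(container->iommu_data, group_id);
}

/* Usage: attach one group through the shim, then detach it again.
 * Returns the number of groups still attached. */
static int example(void)
{
    struct ioasid_container ioasid = { 0 };
    struct vfio_container_model container = { NULL, NULL };
    int ret = group_set_container(&container, &ioasid, 7);

    assert(ret == 0 && ioasid.attached_groups == 1);
    container.ops->detach_group(container.iommu_data, 7);
    return ioasid.attached_groups;
}
```

The point of the indirection is that the group code only ever sees the
ops table, so the PPC path can keep a real driver behind the same
interface while everything else lands in the ioasid code.
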
We end up with an ioasid.h that basically has the vfio_iommu_type1 code
lightly recast into some 'struct iommu_container' and a set of
ioasid_* function entry points that follow
vfio_iommu_driver_ops_type1:

  ioasid_attach_group
  ioasid_detach_group
  ioasid_<something about user pages>
  ioasid_read/ioasid_write

If we have this, and /dev/ioasid implements the legacy IOCTLs, then
/dev/vfio == /dev/ioasid and we can compile out vfio_fops and related
from vfio.c and tell ioasid.c to create /dev/vfio instead using the
ops it owns.

This is a very long-winded way of saying ideally we'd do
approximately:

  git mv drivers/vfio/vfio_iommu_type1.c drivers/ioasid/ioasid.c

as the first step. Essentially we declare that what is type1 is really
the user interface to the internal kernel IOMMU kAPI, which has been
steadily evolving since type1 was created 10 years ago.

> The interface of making that selection might change to accept an
> external /dev/ioasid file descriptor, of course.  Maybe you can
> elaborate on how the vfio device and group uAPI live (or not) in
> this new scheme where /dev/ioasid is the primary interface.  Thanks,

They stay in vfio. You'd still open a group and you'd still pass in
either /dev/vfio or /dev/ioasid to define the container.

Though, completely as an unrelated aside, I admit to not entirely
understanding why the group is the central element of the uAPI.

It is weird that the vfio "drivers" all work on the struct vfio_device
(at least after my series), and it has a file_operations presence via
vfio_device_fops, but instead of struct vfio_device directly having a
'struct device' and cdev to access the FD we get it through a group FD
and a group chardev via VFIO_GROUP_GET_DEVICE_FD.

If we were to revise this, and I don't see a huge reason to do so, I
would put a struct device and cdev in struct vfio_device, attach the
vfio_device directly to the ioasid and then forget about the group, at
least as uapi, completely.
Or at least I don't see where that gets into trouble, but I'm not too
familiar with the multi-vfio in a process scenario.

Jason