> From: Jason Gunthorpe <jgg@xxxxxxxxxx> > Sent: Wednesday, May 12, 2021 1:37 AM > > On Tue, May 11, 2021 at 02:56:05PM +0800, Lu Baolu wrote: > > > > After my next series the mdev drivers will have direct access to > > > the vfio_device. So an alternative to using the struct device, or > > > adding 'if mdev' is to add an API to the vfio_device world to > > > inject what iommu configuration is needed from that direction > > > instead of trying to discover it from a struct device. > > > > Just want to make sure that I understand you correctly. > > > > We should use the existing IOMMU in-kernel APIs to connect mdev with the > > iommu subsystem, so that the upper lays don't need to use something > > like (if dev_is_mdev) to handle mdev differently. Do I get you > > correctly? > > After going through all the /dev/ioasid stuff I'm pretty convinced > that none of the PASID use cases for mdev should need any iommu > connection from the mdev_device - this is an artifact of trying to > cram the vfio container and group model into the mdev world and is not > good design. > > The PASID interfaces for /dev/ioasid should use the 'struct > pci_device' for everything and never pass in a mdev_device to the > iommu layer. 'struct pci_device' -> 'struct device' since /dev/ioasid also needs to support non-pci devices? > > /dev/ioasid should be designed to support this operation and is why I > strongly want to see the actual vfio_device implementation handle the > connection to the iommu layer and not keep trying to hack through > building what is actually a vfio_device specific connection through > the type1 container code. > I assume the so-called connection here implies using iommu_attach_device to attach a device to an iommu domain. Did you suggest this connection must be done by the mdev driver which implements vfio_device and then passing iommu domain to /dev/ioasid when attaching the device to an IOASID? sort of like: ioctl(device_fd, VFIO_ATTACH_IOASID, ioasid, domain); If yes, this conflicts with one design in /dev/ioasid proposal that we're working on. In earlier discussion we agreed that each ioasid is associated to a singleton iommu domain and all devices that are attached to this ioasid with compatible iommu capabilities just share this domain. It implies that iommu domain is allocated at ATTACH_IOASID phase (e.g. when the 1st device is attached to an ioasid). Pre-allocating domain by vfio_device driver means that every device (SR-IOV or mdev) has its own domain thus cannot share ioasid then. Did I misunderstand your intention? Baolu and I discussed below draft proposal to avoid passing mdev_device to the iommu layer. Please check whether it makes sense: // for every device attached to an ioasid // mdev is represented by pasid (allocated by mdev driver) // pf/vf has INVALID_IOASID in pasid struct dev_info { struct list_head next; struct device *device; u32 pasid; } // for every allocated ioasid struct ioasid_info { // the handle to convey iommu operations struct iommu_domain *domain; // metadata for map/unmap struct rb_node dma_list; // the list of attached device struct dev_info *dev_list; ... } // called by VFIO/VDPA int ioasid_attach_device(struct *device, u32 ioasid, u32 pasid) { // allocate a new dev_info, filled with device/pasid // allocate iommu domain if it's the 1st attached device // check iommu compatibility if an domain already exists // attach the device to the iommu domain if (pasid == INVALID_IOASID) iommu_attach_device(domain, device); else iommu_aux_attach_device(domain, device, pasid); // add dev_info to the dev_list of ioasid_info } // when attaching PF/VF to an ioasid ioctl(device_fd, VFIO_ATTACH_IOASID, ioasid); -> get vfio_device of device_fd -> ioasid_attach_device(vfio_device->dev, ioasid, INVALID_IOASID); // when attaching a mdev to an ioasid ioctl(device_fd, VFIO_ATTACH_IOASID, ioasid); -> get vfio_device of device_fd -> find mdev_parent of vfio_device -> find pasid allocated to this mdev -> ioasid_attach_device(parent->dev, ioasid, pasid); starting from this point the vfio device has been connected to the iommu layer. /dev/ioasid can accept pgtable cmd on this ioasid and associated domain. Thanks Kevin