Hi Jason,

> From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> Sent: Thursday, April 1, 2021 12:39 PM
>
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Wednesday, March 31, 2021 8:41 PM
> >
> > On Wed, Mar 31, 2021 at 07:38:36AM +0000, Liu, Yi L wrote:
> >
> > > The reason is /dev/ioasid FD is per-VM since the ioasid allocated to
> > > the VM should be able to be shared by all assigned device for the VM.
> > > But the SVA operations (bind/unbind page table, cache_invalidate)
> > > should be per-device.
> >
> > It is not *per-device* it is *per-ioasid*
> >
> > And as /dev/ioasid is an interface for controlling multiple ioasid's
> > there is no issue to also multiplex the page table manipulation for
> > multiple ioasids as well.
> >
> > What you should do next is sketch out in some RFC the exact ioctls
> > each FD would have and show how the parts I outlined would work and
> > point out any remaining gaps.
> >
> > The device FD is something like the vfio_device FD from VFIO, it has
> > *nothing* to do with PASID beyond having a single ioctl to authorize
> > the device to use the PASID. All control of the PASID is in
> > /dev/ioasid.
>
> good to see this reply. Your idea is much clearer to me now. If I'm getting
> you correctly. I think the skeleton is something like below:
>
> 1) userspace opens a /dev/ioasid, meanwhile there will be an ioasid
>    allocated and a per-ioasid context which can be used to do bind page
>    table and cache invalidate, an ioasid FD returned to userspace.
> 2) userspace passes the ioasid FD to VFIO, let it associated with a device
>    FD (like vfio_device FD).
> 3) userspace binds page table on the ioasid FD with the page table info.
> 4) userspace unbinds the page table on the ioasid FD
> 5) userspace de-associates the ioasid FD and device FD
>
> Does above suit your outline?
>
> If yes, I still have below concern and wish to see your opinion.
> - the ioasid FD and device association will happen at runtime instead of
>   just happen in the setup phase.
> - how about AMD and ARM's vSVA support? Their PASID allocation and page
>   table happens within guest. They only need to bind the guest PASID table
>   to host. Above model seems unable to fit them. (Jean, Eric, Jacob please
>   feel free to correct me)
> - this per-ioasid SVA operations is not aligned with the native SVA usage
>   model. Native SVA bind is per-device.

After reading your reply in
https://lore.kernel.org/linux-iommu/20210331123801.GD1463678@xxxxxxxxxx/#t
I understand that you mean the /dev/ioasid FD is per-VM rather than
per-ioasid, so the skeleton above doesn't match your idea. I have drafted
the skeleton below to check that we are on the same page. I still believe
there is an open question on how to fit ARM's and AMD's vSVA support into
this per-ioasid SVA operation model. Thoughts?

+-----------------------------+-----------------------------------------------+
| userspace                   | kernel space                                  |
+-----------------------------+-----------------------------------------------+
| ioasid_fd =                 | /dev/ioasid does below:                       |
| open("/dev/ioasid", O_RDWR);| struct ioasid_fd_ctx {                        |
|                             |     struct list_head ioasid_list;             |
|                             |     ...                                       |
|                             | } ifd_ctx; // ifd_ctx is per ioasid_fd        |
+-----------------------------+-----------------------------------------------+
| ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
|      ALLOC, &ioasid);       | struct ioasid_data {                          |
|                             |     ioasid_t ioasid;                          |
|                             |     struct list_head device_list;             |
|                             |     struct list_head next;                    |
|                             |     ...                                       |
|                             | } id_data; // id_data is per ioasid           |
|                             |                                               |
|                             | list_add(&id_data.next,                       |
|                             |          &ifd_ctx.ioasid_list);               |
+-----------------------------+-----------------------------------------------+
| ioctl(device_fd,            | VFIO does below:                              |
|      DEVICE_ALLOW_IOASID,   | 1) get ioasid_fd, check if ioasid_fd is valid |
|      ioasid_fd,             | 2) check if ioasid is allocated from ioasid_fd|
|      ioasid);               | 3) register device/domain info to /dev/ioasid |
|                             |    tracked in id_data.device_list             |
|                             | 4) record the ioasid in VFIO's per-device     |
|                             |    ioasid list for future security check      |
+-----------------------------+-----------------------------------------------+
| ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
|      BIND_PGTBL,            | 1) find ioasid's id_data                      |
|      pgtbl_data,            | 2) loop the id_data.device_list and tell iommu|
|      ioasid);               |    give ioasid access to the devices          |
+-----------------------------+-----------------------------------------------+
| ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
|      UNBIND_PGTBL,          | 1) find ioasid's id_data                      |
|      ioasid);               | 2) loop the id_data.device_list and tell iommu|
|                             |    clear ioasid access to the devices         |
+-----------------------------+-----------------------------------------------+
| ioctl(device_fd,            | VFIO does below:                              |
|      DEVICE_DISALLOW_IOASID,| 1) check if ioasid is associated in VFIO's    |
|      ioasid_fd,             |    device ioasid list.                        |
|      ioasid);               | 2) unregister device/domain info from         |
|                             |    /dev/ioasid, clear in id_data.device_list  |
+-----------------------------+-----------------------------------------------+
| ioctl(ioasid_fd,            | /dev/ioasid does below:                       |
|      FREE, ioasid);         | list_del(&id_data.next);                      |
+-----------------------------+-----------------------------------------------+

Regards,
Yi Liu
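
For illustration only, the bookkeeping in the kernel-space column above can
be modelled in plain userspace C. This is a rough sketch, not kernel code and
not a proposed uAPI: the model_* function names, the integer device ids, the
"bound" flag and the rule that FREE is refused while an ioasid is still bound
or still has devices attached are all assumptions made for the example. Only
the struct names and the list relationships come from the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef unsigned int ioasid_t;

struct dev_node {                   /* one entry of id_data.device_list */
    int device_id;                  /* hypothetical stand-in for device/domain info */
    struct dev_node *next;
};

struct ioasid_data {                /* per-ioasid state */
    ioasid_t ioasid;
    bool bound;                     /* assumption: tracks BIND_PGTBL state */
    struct dev_node *device_list;
    struct ioasid_data *next;
};

struct ioasid_fd_ctx {              /* per-ioasid_fd state */
    ioasid_t last_id;
    struct ioasid_data *ioasid_list;
};

struct ioasid_data *model_find(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    for (struct ioasid_data *d = ctx->ioasid_list; d; d = d->next)
        if (d->ioasid == id)
            return d;
    return NULL;
}

/* ALLOC: create id_data and link it into the per-fd list. */
int model_alloc(struct ioasid_fd_ctx *ctx, ioasid_t *out)
{
    struct ioasid_data *d = calloc(1, sizeof(*d));
    if (!d)
        return -1;
    d->ioasid = ++ctx->last_id;
    d->next = ctx->ioasid_list;
    ctx->ioasid_list = d;
    *out = d->ioasid;
    return 0;
}

/* DEVICE_ALLOW_IOASID: fails unless the ioasid came from this fd context. */
int model_allow(struct ioasid_fd_ctx *ctx, int device_id, ioasid_t id)
{
    struct ioasid_data *d = model_find(ctx, id);
    struct dev_node *n = d ? malloc(sizeof(*n)) : NULL;
    if (!n)
        return -1;
    n->device_id = device_id;
    n->next = d->device_list;
    d->device_list = n;
    return 0;
}

/* BIND_PGTBL: walk device_list; returns how many devices would get access. */
int model_bind(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    struct ioasid_data *d = model_find(ctx, id);
    int count = 0;
    if (!d || d->bound)
        return -1;
    for (struct dev_node *n = d->device_list; n; n = n->next)
        count++;                    /* the real code would call into the iommu here */
    d->bound = true;
    return count;
}

/* UNBIND_PGTBL: clear the bound state. */
int model_unbind(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    struct ioasid_data *d = model_find(ctx, id);
    if (!d || !d->bound)
        return -1;
    d->bound = false;
    return 0;
}

/* DEVICE_DISALLOW_IOASID: drop the device from id_data.device_list. */
int model_disallow(struct ioasid_fd_ctx *ctx, int device_id, ioasid_t id)
{
    struct ioasid_data *d = model_find(ctx, id);
    if (!d)
        return -1;
    for (struct dev_node **pp = &d->device_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->device_id == device_id) {
            struct dev_node *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 0;
        }
    }
    return -1;
}

/* FREE: unlink id_data; refused while bound or devices remain (assumption). */
int model_free(struct ioasid_fd_ctx *ctx, ioasid_t id)
{
    for (struct ioasid_data **pp = &ctx->ioasid_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->ioasid == id) {
            struct ioasid_data *d = *pp;
            if (d->bound || d->device_list)
                return -1;
            *pp = d->next;
            free(d);
            return 0;
        }
    }
    return -1;
}
```

Calling the functions in the order of the table rows (alloc, allow, bind,
unbind, disallow, free) on a zero-initialised struct ioasid_fd_ctx walks the
same lifecycle; allowing a device on an ioasid that was not allocated from
that context fails, which corresponds to check 2) in the DEVICE_ALLOW_IOASID
row.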