On Wed, Feb 15, 2023 at 01:37:19AM +0000, Tian, Kevin wrote: > > From: Nicolin Chen <nicolinc@xxxxxxxxxx> > > Sent: Tuesday, February 14, 2023 6:54 PM > > > > On Mon, Feb 13, 2023 at 10:49:34AM -0400, Jason Gunthorpe wrote: > > > On Sun, Feb 12, 2023 at 11:48:30PM -0800, Nicolin Chen wrote: > > > > > > > What about point 1? If dev2 and dev3 are already replaced when > > > > doing iommu_group_replace_domain() on dev1, their idev objects > > > > still have the old hwpt/iopt until user space does another two > > > > IOCTLs on them, right? > > > > > > We have a complicated model for multi-device groups... > > > > > > The first device in the group to change domains must move all the > > > devices in the group > > > > > > But userspace is still expected to run through and change all the > > > other devices > > > > > > So replace should be a NOP if the group is already linked to the right > > > domain. > > > > > > This patch needs to make sure that incosistency in the view betwen the > > > iommu_group and the iommufd model doesn't cause a functional > > > problem. > > > > Yea, I was thinking that we'd need to block any access to the > > idev->hwpt of a pending device's, before the kernel finishes > > the "NOP" IOCTL from userspace, maybe with a helper: > > (iommu_get_domain_for_dev(idev->dev) != idev->hwpt->domain) > > > > I didn't see what would be broken w/o such blocking measure. > > Can you elaborate? iommu_group { idev1, idev2 } (1) Attach all devices to domain1 { group->domain = domain1; idev1->hwpt = hwpt1; // domain1 idev2->hwpt = hwpt1; // domain1 } (2) Attach (replace) idev1 only to domain2 { group->domain = domain2 idev1->hwpt = hwpt2; // domain2 idev2->hwpt == domain1 // != iommu_get_domain_for_dev() } Then if user space isn't aware of these and continues to do IOMMU_IOAS_MAP for idev2. Though IOVA mappings may be added onto the domain2 correctly, yet domain2's iopt itree won't reflect that, until idev2->hwpt is updated too, right? And the tricky thing is that, though we advocate a device- centric uAPI, we'd still have to ask user space to align the devices in the same iommu_group, which is also not present in QEMU's RFC v3 series. The traditional detach+attach flow doesn't seem to have this issue, since there's no re-entry so the work flow is always that detaching all devices first before attaching to another domain. Thanks Nic