> From: Nicolin Chen <nicolinc@xxxxxxxxxx> > Sent: Wednesday, February 15, 2023 9:59 AM > > On Wed, Feb 15, 2023 at 01:37:19AM +0000, Tian, Kevin wrote: > > > > From: Nicolin Chen <nicolinc@xxxxxxxxxx> > > > Sent: Tuesday, February 14, 2023 6:54 PM > > > > > > On Mon, Feb 13, 2023 at 10:49:34AM -0400, Jason Gunthorpe wrote: > > > > On Sun, Feb 12, 2023 at 11:48:30PM -0800, Nicolin Chen wrote: > > > > > > > > > What about point 1? If dev2 and dev3 are already replaced when > > > > > doing iommu_group_replace_domain() on dev1, their idev objects > > > > > still have the old hwpt/iopt until user space does another two > > > > > IOCTLs on them, right? > > > > > > > > We have a complicated model for multi-device groups... > > > > > > > > The first device in the group to change domains must move all the > > > > devices in the group > > > > > > > > But userspace is still expected to run through and change all the > > > > other devices > > > > > > > > So replace should be a NOP if the group is already linked to the right > > > > domain. > > > > > > > > This patch needs to make sure that incosistency in the view betwen the > > > > iommu_group and the iommufd model doesn't cause a functional > > > > problem. > > > > > > Yea, I was thinking that we'd need to block any access to the > > > idev->hwpt of a pending device's, before the kernel finishes > > > the "NOP" IOCTL from userspace, maybe with a helper: > > > (iommu_get_domain_for_dev(idev->dev) != idev->hwpt->domain) > > > > > > > I didn't see what would be broken w/o such blocking measure. > > > > Can you elaborate? > > iommu_group { idev1, idev2 } > > (1) Attach all devices to domain1 { > group->domain = domain1; > idev1->hwpt = hwpt1; // domain1 > idev2->hwpt = hwpt1; // domain1 > } > > (2) Attach (replace) idev1 only to domain2 { > group->domain = domain2 > idev1->hwpt = hwpt2; // domain2 > idev2->hwpt == domain1 // != iommu_get_domain_for_dev() > } > > Then if user space isn't aware of these and continues to do > IOMMU_IOAS_MAP for idev2. Though IOVA mappings may be added > onto the domain2 correctly, yet domain2's iopt itree won't > reflect that, until idev2->hwpt is updated too, right? IOMMU_IOAS_MAP is for ioas instead of for device. even w/o any device attached we still allow ioas map. also note in case of domain1 still actively attached to idev3 in another group, banning IOAS_MAP due to idev2 in transition is also incorrect for idev3. > > And the tricky thing is that, though we advocate a device- > centric uAPI, we'd still have to ask user space to align the > devices in the same iommu_group, which is also not present > in QEMU's RFC v3 series. IMHO this requirement has been stated clearly from the start when designing this device centric interface. QEMU should be improved to take care of it.