On Mon, Feb 06, 2023 at 06:57:35AM +0000, Tian, Kevin wrote: > > From: Jason Gunthorpe <jgg@xxxxxxxxxx> > > Sent: Friday, February 3, 2023 11:03 PM > > > > On Fri, Feb 03, 2023 at 08:26:44AM +0000, Tian, Kevin wrote: > > > > From: Nicolin Chen <nicolinc@xxxxxxxxxx> > > > > Sent: Thursday, February 2, 2023 3:05 PM > > > > > > > > All drivers are already required to support changing between active > > > > UNMANAGED domains when using their attach_dev ops. > > > > > > All drivers which don't have *broken* UNMANAGED domain? > > > > No, all drivers.. It has always been used by VFIO. > > existing iommu_attach_group() doesn't support changing between > two UNMANAGED domains. only from default->unmanaged or > blocking->unmanaged. Yes, but before we added the blocking domains VFIO was changing between unmanaged domains. Blocking domains are so new that no driver could have suddenly started to depend on this. > > > Can you elaborate the error handling here? Ideally if > > > __iommu_group_set_domain() fails then group->domain shouldn't > > > be changed. > > > > That isn't what it implements though. The internal helper leaves > > things in a mess, it is for the caller to fix it, and it depends on > > the caller what that means. > > I didn't see any warning of the mess and the caller's responsibility > in __iommu_group_set_domain(). Can it be documented clearly > so if someone wants to add a new caller on it he can clearly know > what to do? That would be nice.. > and why doesn't iommu_attach_group() need to do anything > when an attach attempt fails? In the end it calls the same > iommu_group_do_attach_device() as __iommu_group_set_domain() > does. That's a bug for sure. > btw looking at the code __iommu_group_set_domain(): > > * Note that this is called in error unwind paths, attaching to a > * domain that has already been attached cannot fail. > */ > ret = __iommu_group_for_each_dev(group, new_domain, > iommu_group_do_attach_device); > > with that we don't need fall back to core domain in above error > unwinding per this comment. That does make some sense. I tried to make a patch to consolidate all this error handling once, that would be the better way to approach this. > > In this case the API cannot retain a hidden reference to the new > > domain, so it must be purged, one way or another. > > Could you elaborate where the hidden reference is retained? Inside the driver, it can keep track of the domain pointer if attach_dev succeeds Jason