RE: [PATCH v3 03/17] iommufd: Replace the hwpt->devices list with iommufd_group

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Friday, April 14, 2023 9:32 PM
> 
> On Thu, Apr 13, 2023 at 02:52:54AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Wednesday, April 12, 2023 7:18 PM
> > >
> > > On Wed, Apr 12, 2023 at 08:27:36AM +0000, Tian, Kevin wrote:
> > > > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > > > Sent: Tuesday, April 11, 2023 10:31 PM
> > > > >
> > > > > On Thu, Mar 23, 2023 at 07:21:42AM +0000, Tian, Kevin wrote:
> > > > >
> > > > > > If no oversight then we can directly put the lock in
> > > > > > iommufd_hw_pagetable_attach/detach() which can also simplify a
> bit
> > > on
> > > > > > its callers in device.c.
> > > > >
> > > > > So, I did this, and syzkaller explains why this can't be done:
> > > > >
> > > > >
> https://lore.kernel.org/r/0000000000006e66d605f83e09bc@xxxxxxxxxx
> > > > >
> > > > > We can't allow the hwpt to be discovered by a parallel
> > > > > iommufd_hw_pagetable_attach() until it is done being setup,
> otherwise
> > > > > if we fail to set it up we can't destroy the hwpt.
> > > > >
> > > > > 	if (immediate_attach) {
> > > > > 		rc = iommufd_hw_pagetable_attach(hwpt, idev);
> > > > > 		if (rc)
> > > > > 			goto out_abort;
> > > > > 	}
> > > > >
> > > > > 	rc = iopt_table_add_domain(&hwpt->ioas->iopt, hwpt->domain);
> > > > > 	if (rc)
> > > > > 		goto out_detach;
> > > > > 	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);
> > > > > 	return hwpt;
> > > > >
> > > > > out_detach:
> > > > > 	if (immediate_attach)
> > > > > 		iommufd_hw_pagetable_detach(idev);
> > > > > out_abort:
> > > > > 	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
> > > > >
> > > > > As some other idev could be pointing at it too now.
> > > >
> > > > How could this happen before this object is finalized? iirc you pointed to
> > > > me this fact in previous discussion.
> > >
> > > It only is unavailable through the xarray, but we've added it to at
> > > least one internal list on the group already, it is kind of sketchy to
> > > work like this, it should all be atomic..
> > >
> >
> > which internal list? group has a list for attached devices but regarding
> > to hwpt it's stored in a single field igroup->hwpt.
> 
> It is added to
> 
> 	list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list);

this is called under ioas->mutex.

> 
> Which can be observed from
> 
> 	mutex_lock(&ioas->mutex);
> 	list_for_each_entry(hwpt, &ioas->hwpt_list, hwpt_item) {
> 		if (!hwpt->auto_domain)
> 			continue;
> 
> 		if (!iommufd_lock_obj(&hwpt->obj))
> 			continue;

this is called also under ioas->mutex. So no race. 😊

> 
> If iommufd_lock_obj() has happened then
> iommufd_object_abort_and_destroy() is in trouble.
> 
> Thus we need to hold the ioas->mutex right up until we know we can't
> call iommufd_object_abort_and_destroy(), or lift out the hwpt list_add
> 
> This could maybe also be fixed by holding the destroy_rw_sem right up
> until finalize. Though, I think I looked at this once and decided
> against it for some reason..
> 
> > btw removing this lock in this file also makes it easier to support siov
> > device which doesn't have group. We can have internal group attach
> > and pasid attach wrappers within device.c and leave igroup->lock held
> > in the group attach path.
> 
> Yeah, I expect this will need more work when we get to PASID support
> 
> Most likely the resolution will be something like PASID domains can't
> be used as PF/VF domains because they don't have the right reserved
> regions, so they shouldn't be in the hwpt_list at all, so we can use a
> more relaxed locking.
> 

Yes with pasid there is no reserved region conceptually. Currently in
our internal implementation we still added the reserved regions of
the device to IOAS upon pasid attach which looks unnecessary.

But I didn't get why a domain cannot be shared by PASID and PF/VF.
Upon attach we add the reserved regions (if any) of the device to
IOAS. I don't think there is a requirement that reserved regions
must exist for a successful attachment?

A similar example is on Jacob's DMA API PASID work. The DMA
domain is first attached to RID and then to a PASID.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux