Re: Possible lock inversion in ttm_bo_vm_access

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Christian,

On 10/13/2018 07:36 PM, Christian König wrote:
Hi Thomas,

bo_reserve()
copy_to_user() / copy_from_user()
bo_unreserve()

That pattern is illegal for a number of reasons and the mmap_sem is only one of it.

So the locking order must always be mmap_sem->bo_reservation. See the userptr implementation in amdgpu as well.

Christian.

I'm not arguing against that, and since vmwgfx doesn't use that pattern, the locking order doesn't really matter to me since it's even possible to make the TTM fault() handler more well-behaved if we were to fix the locking order to mmap_sem->bo_reserve.

My concern is, since the _opposite_ locking order is (admittedly vaguely) documented in the fault handler, that the access() handler took a shortcut when the new locking order was  established possibly without auditing of the other TTM drivers for locking inversion: For example it looks from a quick glance like nouveau_gem_pushbuf_reloc_apply() calls copy_from_user() with bo's reserved (which IIRC was the typical use-case at the time this was last lifted). And lockdep won't trip unless the access() callback is actually called.

My point is if AMD wants to enforce this locking order, then IMHO the other drivers need to be audited and corrected if they are assuming the locking order documented in fault(). A good way to catch such drivers would be to add that might_lock().

Thanks,
Thomas



Am 12.10.2018 um 16:52 schrieb Thomas Hellstrom:
Hi, Felix,

It looks like there is a locking inversion in ttm_bo_vm_access() where we take a sleeping bo_reserve() while holding mmap_sem().

Previously we've been assuming the other way around or at least undefined allowing for drivers to do

bo_reserve()
copy_to_user() / copy_from_user()
bo_unreserve()

I'm not sure the latter pattern is used in any drivers, though, and I guess there are ways around it. So it might make sense to fix the locking order at this point. In that case, perhaps one should add a

might_lock(&bo->resv->lock.base);

at the start of the TTM fault handler to trip lockdep on locking order violations in situations where the access() callback isn't commonly used...

/Thomas




_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel


_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux