That patch won't work correctly like this.

When the lock is dropped it is possible that the BO is removed from the delete list and ttm_bo_cleanup_refs() starts to wait for the wrong reservation object.

I think we can remove the wait for bo->resv now and always wait for bo->ttm_resv, but I'm not 100% sure. Need to double check the code as well.

Christian.

Am 23.01.2018 um 20:25 schrieb Tom St Denis:
> On 22/01/18 01:42 AM, Chunming Zhou wrote:
>>
>>
>> On 2018-01-20 02:23, Tom St Denis wrote:
>>> On 19/01/18 01:14 PM, Tom St Denis wrote:
>>>> Hi all,
>>>>
>>>> In the function ttm_bo_cleanup_refs() it seems possible to reach
>>>> line 551 without entering the block at line 516, which means you'll
>>>> be unlocking a mutex that wasn't locked.
>>>>
>>>> Now it might be that in the course of the API this pattern cannot
>>>> be expressed, but it's not clear from the function alone that that
>>>> is the case.
>>>
>>>
>>> Looking further, it seems the behaviour depends on locking in the
>>> parent callers. That's kind of a no-no, right? Shouldn't the lock
>>> ideally be taken and released in the same function?
>> Same feeling here.
>>
>> Regards,
>> David Zhou
>
> Attached is a patch that addresses this.
>
> I can't see any obvious race in functions that call
> ttm_bo_cleanup_refs() between the time they let go of the lock and the
> time it's taken again in the call.
>
> Running it on my system doesn't produce anything notable, though the
> KASAN with DRI_PRIME=1 issue is still there (this patch neither causes
> that nor fixes it).
>
> Tom
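
[Editor's note] The unlock-without-lock shape the thread is debating can be sketched as follows. This is a hypothetical pthreads illustration, not the actual TTM code (which uses a spinlock, and different function names); `cleanup_unbalanced`, `cleanup_balanced`, `lru_lock`, and `need_wait` are made-up names for this sketch only:

```c
/* Hypothetical sketch of the locking pattern under discussion -- not the
 * actual ttm_bo_cleanup_refs() code.  cleanup_unbalanced() mirrors the
 * shape Tom describes: the lock is only taken on one path (the "block at
 * line 516") while the unlock (at "line 551") runs unconditionally, so
 * correctness silently depends on the caller already holding the lock on
 * the other path.  cleanup_balanced() shows the alternative David and Tom
 * argue for: lock and unlock paired in the same function. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lock taken conditionally, released unconditionally: if need_wait is
 * false, this unlocks a mutex it never locked unless the caller did. */
static void cleanup_unbalanced(bool need_wait)
{
	if (need_wait) {
		pthread_mutex_lock(&lru_lock);
		/* ... wait for the BO's fences to signal ... */
	}
	/* ... remove the BO from the delete list ... */
	pthread_mutex_unlock(&lru_lock);
}

/* Lock and unlock paired in one function: no hidden caller contract,
 * and the function is correct regardless of who calls it. */
static void cleanup_balanced(void)
{
	pthread_mutex_lock(&lru_lock);
	/* ... wait for fences, then remove the BO from the delete list ... */
	pthread_mutex_unlock(&lru_lock);
}
```

The unbalanced shape is not automatically a bug, but it turns the lock state into an undocumented part of the function's contract, which is exactly why it is hard to verify "from the function alone".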