Am 2021-04-15 um 3:41 a.m. schrieb Christian König: > > > Am 14.04.21 um 17:15 schrieb Felix Kuehling: >> Am 2021-04-14 um 3:33 a.m. schrieb Christian König: >>> Am 14.04.21 um 08:46 schrieb Felix Kuehling: >>>> amdgpu_ttm_tt_unpopulate can be called during bo_destroy. The >>>> dmabuf->resv >>>> must not be held by the caller or dma_buf_detach will deadlock. >>>> This is >>>> probably not the right fix. I get a recursive lock warning with the >>>> reservation held in ttm_bo_release. Should unmap_attachment move to >>>> backend_unbind instead? >>> Yes probably, but I'm really wondering if we should call unpopulate >>> without holding the reservation lock. >> There is an error handling code path in ttm_tt_populate that calls >> unpopulate. > > That should be harmless. For populating the page array we need the > same lock as for unpopulating it. > >> I believe that has to be holding the reservation lock. > > Correct, yes. > >> The other cases (destroy and swapout) do not hold the lock, AIUI. > > That's not correct. See ttm_bo_release() for example: > > ... > if (!dma_resv_test_signaled_rcu(bo->base.resv, true) || > !dma_resv_trylock(bo->base.resv)) { > ... > > We intentionally lock the reservation object here or put it on the > delayed delete list because dropping the tt object without holding the > lock is illegal for multiple reasons. I think this is because I manually individualized the reservation in patch 4. Without that I was running into different problems (probably need to dig a bit more to understand what's happening there). So the lock held by release is not the same as the lock of the original dmabuf. Regards, Felix > > If you run into an unpopulate which doesn't hold the lock then I > really need that backtrace because we are running into a much larger > bug here. > > Thanks, > Christian. > > >> >> Regards, >> Felix >> >> >>> Christian. >>> >>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 13 +++++++++++++ >>>> 1 file changed, 13 insertions(+) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> index 936b3cfdde55..257750921eed 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c >>>> @@ -1216,9 +1216,22 @@ static void amdgpu_ttm_tt_unpopulate(struct >>>> ttm_device *bdev, >>>> if (ttm->sg && gtt->gobj->import_attach) { >>>> struct dma_buf_attachment *attach; >>>> + bool locked; >>>> attach = gtt->gobj->import_attach; >>>> + /* FIXME: unpopulate can be called during bo_destroy. >>>> + * The dmabuf->resv must not be held by the caller or >>>> + * dma_buf_detach will deadlock. This is probably not >>>> + * the right fix. I get a recursive lock warning with the >>>> + * reservation held in ttm_bo_releas.. Should >>>> + * unmap_attachment move to backend_unbind instead? >>>> + */ >>>> + locked = dma_resv_is_locked(attach->dmabuf->resv); >>>> + if (!locked) >>>> + dma_resv_lock(attach->dmabuf->resv, NULL); >>>> dma_buf_unmap_attachment(attach, ttm->sg, >>>> DMA_BIDIRECTIONAL); >>>> + if (!locked) >>>> + dma_resv_unlock(attach->dmabuf->resv); >>>> ttm->sg = NULL; >>>> return; >>>> } > _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx