On 13.06.2017 at 08:29, axie wrote:
> On 2017-06-12 07:00 PM, Christian König wrote:
>> On 12.06.2017 at 22:31, Alex Xie wrote:
>>> This patch moves a loop that unrefs BOs and
>>> several memory free function calls out of
>>> the critical sections.
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index a664987..02c138f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
>>>  		/* Another user may have a reference to this list still */
>>>  		mutex_lock(&list->lock);
>>>  		mutex_unlock(&list->lock);
>>> +		mutex_unlock(&fpriv->bo_list_lock);
>>>  		amdgpu_bo_list_free(list);
>>>  	}
>>> -	mutex_unlock(&fpriv->bo_list_lock);
>>> +	else {
>>> +		mutex_unlock(&fpriv->bo_list_lock);
>>> +	}
>>
>> You could move the unlock of bo_list_lock even before the if.
>>
>> But since you pointed it out, there is quite a bug in this function:
>>>  		/* Another user may have a reference to this list still */
>>>  		mutex_lock(&list->lock);
>>>  		mutex_unlock(&list->lock);
>> Not sure if that is still up to date, but that use case used to be
>> illegal with mutexes.
> As I understand this piece of code, this mutex_lock/mutex_unlock pair
> is used to make sure that all other tasks have finished accessing the
> bo list. The other side of this story is in function
> amdgpu_bo_list_get. These two pieces of code together make sure we
> can safely destroy the bo list.

Yeah, the idea behind the code is correct. But using mutexes that way
is illegal, see https://lwn.net/Articles/575460/.

I'm not sure if that is still up to date, but in ancient times you
needed to avoid patterns like this:

	mutex_unlock(&obj->lock);
	kfree(obj);

Anyway, I suggest we just replace the whole bo_list handling with an
RCU and refcount based implementation (a rough sketch of this idea is
appended below). That should avoid taking any lock at all on the
read-only code path.

Regards,
Christian.

>
> Otherwise we could easily simplify this locking.
> Let me give an example here.
>
> In function amdgpu_bo_list_get, assume we change the code like this:
>
>     ...
>     down_read(&fpriv->bo_list_lock);
>     result = idr_find(&fpriv->bo_list_handles, id);
>     up_read(&fpriv->bo_list_lock);
>     /* Line 1: task A is scheduled away from the CPU here */
>     if (result)
>         mutex_lock(&result->lock);
>     ...
>
> In function amdgpu_bo_list_destroy, assume we change the code like
> this:
>
>     ...
>     down_write(&fpriv->bo_list_lock);
>     list = idr_remove(&fpriv->bo_list_handles, id);
>     up_write(&fpriv->bo_list_lock);
>     if (list) {
>         /* Another user may have a reference to this list still */
>         mutex_lock(&list->lock);
>         mutex_unlock(&list->lock);
>         amdgpu_bo_list_free(list);
>     }
> }
>
> While task A is running in function amdgpu_bo_list_get at line 1, the
> CPU scheduler takes the CPU away from task A.
> Then task B runs function amdgpu_bo_list_destroy. Task B can run all
> the way through, destroying and freeing the mutex.
> Later, task A runs again. The mutex result->lock has by now been
> destroyed and freed by task B, so task A tries to lock a mutex that
> no longer exists.
>
>
>>
>> Christian.
>>
>>>  	}
>>>
>>>  static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>>
>>
>
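
---

For illustration, here is a minimal sketch of the RCU and refcount
based scheme Christian suggests above. This is an assumed sketch, not
the actual patch: the refcount/rhead fields and the
amdgpu_bo_list_release()/amdgpu_bo_list_put() helpers are hypothetical
names, while bo_list_lock and bo_list_handles are the existing fields
of struct amdgpu_fpriv that the thread already uses. The key points
are that the read path takes no mutex at all (only rcu_read_lock()
plus kref_get_unless_zero()), and that the final free is deferred with
kfree_rcu(), so a reader racing with destroy never dereferences freed
memory and the unlock-then-free problem disappears.

#include <linux/idr.h>
#include <linux/kref.h>
#include <linux/mutex.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct amdgpu_bo_list {
	struct kref refcount;
	struct rcu_head rhead;
	/* ... array of BO entries, etc. ... */
};

static void amdgpu_bo_list_release(struct kref *ref)
{
	struct amdgpu_bo_list *list =
		container_of(ref, struct amdgpu_bo_list, refcount);

	/* Defer the free until all current RCU readers are done. */
	kfree_rcu(list, rhead);
}

static void amdgpu_bo_list_put(struct amdgpu_bo_list *list)
{
	kref_put(&list->refcount, amdgpu_bo_list_release);
}

/* Read path: no mutex, just RCU plus a conditional reference. */
static struct amdgpu_bo_list *
amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *list;

	rcu_read_lock();
	list = idr_find(&fpriv->bo_list_handles, id);
	/*
	 * The list may be concurrently destroyed; only use it if its
	 * refcount has not already dropped to zero.
	 */
	if (list && !kref_get_unless_zero(&list->refcount))
		list = NULL;
	rcu_read_unlock();

	return list;
}

static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *list;

	mutex_lock(&fpriv->bo_list_lock);
	list = idr_remove(&fpriv->bo_list_handles, id);
	mutex_unlock(&fpriv->bo_list_lock);
	if (list)
		amdgpu_bo_list_put(list); /* drop the IDR's reference */
}

With this scheme the race Alex describes cannot happen: even if task B
runs amdgpu_bo_list_destroy while task A sits between idr_find() and
taking its reference, the structure is either still refcounted (task A
gets a valid reference) or already at zero (kref_get_unless_zero()
fails and task A sees NULL), and the memory itself outlives the RCU
read-side critical section.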