amdgpu_fence_process should not put the signaled hw fence and set the
RCU pointer to NULL. A more reasonable sequence is to leave the RCU
pointer untouched, and either:

1) driver_fini() puts the fence held in the RCU slot, or
2) emitting a new hw fence takes over the RCU slot and puts the old
   hw fence.

The mapping between get & put on the hw fence is:

Get                                        Put
fence_emit:fence_init                 ---> free_job:put job->fence
                                      ---> run_job:put job->fence //GPU RESET case
fence_emit:rcu_assign_pointer         ---> fence_driver_fini:put ring->fence_drv.fences[j]
                                      ---> fence_emit:put(old) //after fence_wait(old)
run_job:job->fence=fence_get()        -->  sched_main:put(fence)
                                      -->  job_recovery:put(fence) //for GPU RESET case
sched_main:parent=fence_get()         -->  sched_fence_free:put(fence->parent)
sched_job_recovery:parent=fence_get() -->  sched_hw_job_reset:put(s_job->s_fence->parent) // for GPU RESET case

Change-Id: I623167c1e3143233f4e1be7a400a9b698b4a1355
Signed-off-by: Monk Liu <Monk.Liu at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index f43319c..d7374cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -161,6 +161,8 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f)
 		dma_fence_wait(old, false);
 	}
 
+	dma_fence_put(old);
+
 	rcu_assign_pointer(*ptr, dma_fence_get(&fence->base));
 
 	*f = &fence->base;
@@ -246,7 +248,6 @@ void amdgpu_fence_process(struct amdgpu_ring *ring)
 
 		/* There is always exactly one thread signaling this fence slot */
 		fence = rcu_dereference_protected(*ptr, 1);
-		RCU_INIT_POINTER(*ptr, NULL);
 		if (!fence)
 			continue;
 
@@ -257,7 +258,6 @@ void amdgpu_fence_process(struct amdgpu_ring *ring)
 		else
 			BUG();
 
-		dma_fence_put(fence);
 	} while (last_seq != seq);
 }
 
-- 
2.7.4
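
The slot ownership described above (emit takes a reference for the slot
and puts the old occupant, process only signals, fini puts whatever is
left) can be sketched as a small userspace model. This is only an
illustration under simplifying assumptions, not driver code: plain ints
stand in for kref and RCU, and every name (model_fence, model_get,
model_put, model_emit, model_process, model_fini, NUM_SLOTS,
live_fences) is hypothetical.

```c
/* Userspace model only; all names are hypothetical, not the amdgpu API. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_SLOTS 4

struct model_fence {
	int refcount;
	int signaled;
};

/* Number of fences allocated and not yet freed. */
static int live_fences;

/* Stands in for ring->fence_drv.fences[]. */
static struct model_fence *slots[NUM_SLOTS];

static struct model_fence *model_get(struct model_fence *f)
{
	if (f)
		f->refcount++;
	return f;
}

static void model_put(struct model_fence *f)
{
	if (!f)
		return;
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		free(f);
		live_fences--;
	}
}

/*
 * Emit: the new fence starts with one reference owned by the caller.
 * The slot drops its reference on the old occupant and takes one on
 * the new fence. (The real driver waits on the old fence first if it
 * has not signaled yet; the model skips that wait.)
 */
static struct model_fence *model_emit(unsigned int seq)
{
	struct model_fence **ptr = &slots[seq % NUM_SLOTS];
	struct model_fence *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;
	live_fences++;

	model_put(*ptr);
	*ptr = model_get(f);
	return f;
}

/* Process: only signal; the slot and its reference stay untouched. */
static void model_process(unsigned int seq)
{
	struct model_fence *f = slots[seq % NUM_SLOTS];

	if (f)
		f->signaled = 1;
}

/* Fini: drop the reference each slot still holds. */
static void model_fini(void)
{
	size_t j;

	for (j = 0; j < NUM_SLOTS; j++) {
		model_put(slots[j]);
		slots[j] = NULL;
	}
}
```

In this model every slot reference taken in model_emit is dropped
exactly once, either by a later model_emit that reuses the slot or by
model_fini, which matches the second row of the get/put mapping.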