Hi Christian, Yes, it has already one rcu_barrier, but it has called twice call_rcu, so the one rcu_barrier just could barrier one call_rcu some time. After I added another rcu_barrier, the kernel issue will disappear. Best Wishes, Emily Deng > -----Original Message----- > From: Christian König [mailto:ckoenig.leichtzumerken at gmail.com] > Sent: Thursday, May 17, 2018 7:08 PM > To: Deng, Emily <Emily.Deng at amd.com>; amd-gfx at lists.freedesktop.org > Subject: Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini > > Am 17.05.2018 um 12:03 schrieb Emily Deng: > > To free the fence from the amdgpu_fence_slab, need twice call_rcu, to > > avoid the amdgpu_fence_slab_fini call > > kmem_cache_destroy(amdgpu_fence_slab) before > kmem_cache_free(amdgpu_fence_slab, fence), add rcu_barrier after > drm_sched_entity_fini. > > > > The kmem_cache_free(amdgpu_fence_slab, fence)'s call trace as below: > > 1.drm_sched_entity_fini -> > > drm_sched_entity_cleanup -> > > dma_fence_put(entity->last_scheduled) -> > > drm_sched_fence_release_finished -> > drm_sched_fence_release_scheduled > > -> call_rcu(&fence->finished.rcu, drm_sched_fence_free) > > > > 2.drm_sched_fence_free -> > > dma_fence_put(fence->parent) -> > > amdgpu_fence_release -> > > call_rcu(&f->rcu, amdgpu_fence_free) -> > > kmem_cache_free(amdgpu_fence_slab, fence); > > > > v2:put the barrier before the kmem_cache_destroy > > > > Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df > > Signed-off-by: Emily Deng <Emily.Deng at amd.com> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > index 39ec6b8..42be65b 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > > @@ -69,6 +69,7 @@ int amdgpu_fence_slab_init(void) > > void amdgpu_fence_slab_fini(void) > > { > > rcu_barrier(); > > + rcu_barrier(); > > Well, you should have noted that there is already an rcu_barrier here and > adding another one shouldn't have any additional effect. So your explanation > and the proposed solution doesn't make to much sense. > > I think the problem you run into is rather that the fence is reference counted > and might live longer than the module who created it. > > Complicated issue, one possible solution would be to release > fence->parent earlier in the scheduler fence but that doesn't sound like > a general purpose solution. > > Christian. > > > kmem_cache_destroy(amdgpu_fence_slab); > > } > > /*