Re: [PATCH 19/25] drm/amdgpu: s/GFP_KERNEL/GFP_ATOMIC in scheduler code

Daniel Vetter <daniel@xxxxxxxx> · Wed, 15 Jul 2020 13:53:16 +0200

On Wed, Jul 15, 2020 at 11:17 AM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Am 14.07.20 um 16:31 schrieb Daniel Vetter:
> > On Tue, Jul 14, 2020 at 01:40:11PM +0200, Christian König wrote:
> >> Am 14.07.20 um 12:49 schrieb Daniel Vetter:
> >>> On Tue, Jul 07, 2020 at 10:12:23PM +0200, Daniel Vetter wrote:
> >>>> My dma-fence lockdep annotations caught an inversion because we
> >>>> allocate memory where we really shouldn't:
> >>>>
> >>>>    kmem_cache_alloc+0x2b/0x6d0
> >>>>    amdgpu_fence_emit+0x30/0x330 [amdgpu]
> >>>>    amdgpu_ib_schedule+0x306/0x550 [amdgpu]
> >>>>    amdgpu_job_run+0x10f/0x260 [amdgpu]
> >>>>    drm_sched_main+0x1b9/0x490 [gpu_sched]
> >>>>    kthread+0x12e/0x150
> >>>>
> >>>> Trouble right now is that lockdep only validates against GFP_FS, which
> >>>> would be good enough for shrinkers. But for mmu_notifiers we actually
> >>>> need !GFP_ATOMIC, since they can be called from any page laundering,
> >>>> even if GFP_NOFS or GFP_NOIO are set.
> >>>>
> >>>> I guess we should improve the lockdep annotations for
> >>>> fs_reclaim_acquire/release.
> >>>>
> >>>> Ofc real fix is to properly preallocate this fence and stuff it into
> >>>> the amdgpu job structure. But GFP_ATOMIC gets the lockdep splat out of
> >>>> the way.
> >>>>
> >>>> v2: Two more allocations in scheduler paths.
> >>>>
> >>>> Frist one:
> >>>>
> >>>>    __kmalloc+0x58/0x720
> >>>>    amdgpu_vmid_grab+0x100/0xca0 [amdgpu]
> >>>>    amdgpu_job_dependency+0xf9/0x120 [amdgpu]
> >>>>    drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched]
> >>>>    drm_sched_main+0xf9/0x490 [gpu_sched]
> >>>>
> >>>> Second one:
> >>>>
> >>>>    kmem_cache_alloc+0x2b/0x6d0
> >>>>    amdgpu_sync_fence+0x7e/0x110 [amdgpu]
> >>>>    amdgpu_vmid_grab+0x86b/0xca0 [amdgpu]
> >>>>    amdgpu_job_dependency+0xf9/0x120 [amdgpu]
> >>>>    drm_sched_entity_pop_job+0x3f/0x440 [gpu_sched]
> >>>>    drm_sched_main+0xf9/0x490 [gpu_sched]
> >>>>
> >>>> Cc: linux-media@xxxxxxxxxxxxxxx
> >>>> Cc: linaro-mm-sig@xxxxxxxxxxxxxxxx
> >>>> Cc: linux-rdma@xxxxxxxxxxxxxxx
> >>>> Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>>> Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >>>> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >>>> Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
> >>>> Cc: Christian König <christian.koenig@xxxxxxx>
> >>>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> >>> Has anyone from amd side started looking into how to fix this properly?
> >> Yeah I checked both and neither are any real problem.
> > I'm confused ... do you mean "no real problem fixing them" or "not
> > actually a real problem"?
>
> Both, at least the VMID stuff is trivial to avoid.
>
> And the fence allocation is extremely unlikely. E.g. when we allocate a
> new one we previously most likely just freed one already.

Yeah I think debugging we can avoid, just stop debugging if things get
hung up like that. So mempool for the hw fences should be perfectly
fine.

The vmid stuff I don't really understand enough, but the hw fence
stuff I think I grok, plus other scheduler users need that too from a
quick look. I might be tackling that one (maybe put the mempool
outright into drm_scheduler code as a helper), except if you have
patches already in the works. vmid I'll leave to you guys :-)

-Daniel

>
> >
> >>> I looked a bit into fixing this with mempool, and the big guarantee we
> >>> need is that
> >>> - there's a hard upper limit on how many allocations we minimally need to
> >>>     guarantee forward progress. And the entire vmid allocation and
> >>>     amdgpu_sync_fence stuff kinda makes me question that's a valid
> >>>     assumption.
> >> We do have hard upper limits for those.
> >>
> >> The VMID allocation could as well just return the fence instead of putting
> >> it into the sync object IIRC. So that just needs some cleanup and can avoid
> >> the allocation entirely.
> > Yeah embedding should be simplest solution of all.
> >
> >> The hardware fence is limited by the number of submissions we can have
> >> concurrently on the ring buffers, so also not a problem at all.
> > Ok that sounds good. Wrt releasing the memory again, is that also done
> > without any of the allocation-side locks held? I've seen some vmid manager
> > somewhere ...
>
> Well that's the issue. We can't guarantee that for the hardware fence
> memory since it could be that we hold another reference during debugging
> IIRC.
>
> Still looking if and how we could fix this. But as I said this problem
> is so extremely unlikely.
>
> Christian.
>
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >>> - mempool_free must be called without any locks in the way which are held
> >>>     while we call mempool_alloc. Otherwise we again have a nice deadlock
> >>>     with no forward progress. I tried auditing that, but got lost in amdgpu
> >>>     and scheduler code. Some lockdep annotations for mempool.c might help,
> >>>     but they're not going to catch everything. Plus it would be again manual
> >>>     annotations because this is yet another cross-release issue. So not sure
> >>>     that helps at all.
> >>>
> >>> iow, not sure what to do here. Ideas?
> >>>
> >>> Cheers, Daniel
> >>>
> >>>> ---
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +-
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c   | 2 +-
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c  | 2 +-
> >>>>    3 files changed, 3 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >>>> index 8d84975885cd..a089a827fdfe 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> >>>> @@ -143,7 +143,7 @@ int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f,
> >>>>            uint32_t seq;
> >>>>            int r;
> >>>> -  fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_KERNEL);
> >>>> +  fence = kmem_cache_alloc(amdgpu_fence_slab, GFP_ATOMIC);
> >>>>            if (fence == NULL)
> >>>>                    return -ENOMEM;
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> >>>> index 267fa45ddb66..a333ca2d4ddd 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
> >>>> @@ -208,7 +208,7 @@ static int amdgpu_vmid_grab_idle(struct amdgpu_vm *vm,
> >>>>            if (ring->vmid_wait && !dma_fence_is_signaled(ring->vmid_wait))
> >>>>                    return amdgpu_sync_fence(sync, ring->vmid_wait);
> >>>> -  fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_KERNEL);
> >>>> +  fences = kmalloc_array(sizeof(void *), id_mgr->num_ids, GFP_ATOMIC);
> >>>>            if (!fences)
> >>>>                    return -ENOMEM;
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> >>>> index 8ea6c49529e7..af22b526cec9 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> >>>> @@ -160,7 +160,7 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f)
> >>>>            if (amdgpu_sync_add_later(sync, f))
> >>>>                    return 0;
> >>>> -  e = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL);
> >>>> +  e = kmem_cache_alloc(amdgpu_sync_slab, GFP_ATOMIC);
> >>>>            if (!e)
> >>>>                    return -ENOMEM;
> >>>> --
> >>>> 2.27.0
> >>>>
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch