Re: [PATCH 2/4] drm/shmem: Use mutex_trylock in drm_gem_shmem_purge

Daniel Vetter <daniel@xxxxxxxx> · Wed, 21 Aug 2019 10:23:43 +0200

On Tue, Aug 20, 2019 at 07:35:47AM -0500, Rob Herring wrote:
> On Tue, Aug 20, 2019 at 4:05 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
> >
> > On Mon, Aug 19, 2019 at 11:12:02AM -0500, Rob Herring wrote:
> > > Lockdep reports a circular locking dependency with pages_lock taken in
> > > the shrinker callback. The deadlock can't actually happen with current
> > > users at least as a BO will never be purgeable when pages_lock is held.
> > > To be safe, let's use mutex_trylock() instead and bail if a BO is locked
> > > already.
> > >
> > > WARNING: possible circular locking dependency detected
> > > 5.3.0-rc1+ #100 Tainted: G             L
> > > ------------------------------------------------------
> > > kswapd0/171 is trying to acquire lock:
> > > 000000009b9823fd (&shmem->pages_lock){+.+.}, at: drm_gem_shmem_purge+0x20/0x40
> > >
> > > but task is already holding lock:
> > > 00000000f82369b6 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x40
> > >
> > > which lock already depends on the new lock.
> > >
> > > the existing dependency chain (in reverse order) is:
> > >
> > > -> #1 (fs_reclaim){+.+.}:
> > >        fs_reclaim_acquire.part.18+0x34/0x40
> > >        fs_reclaim_acquire+0x20/0x28
> > >        __kmalloc_node+0x6c/0x4c0
> > >        kvmalloc_node+0x38/0xa8
> > >        drm_gem_get_pages+0x80/0x1d0
> > >        drm_gem_shmem_get_pages+0x58/0xa0
> > >        drm_gem_shmem_get_pages_sgt+0x48/0xd0
> > >        panfrost_mmu_map+0x38/0xf8 [panfrost]
> > >        panfrost_gem_open+0xc0/0xe8 [panfrost]
> > >        drm_gem_handle_create_tail+0xe8/0x198
> > >        drm_gem_handle_create+0x3c/0x50
> > >        panfrost_gem_create_with_handle+0x70/0xa0 [panfrost]
> > >        panfrost_ioctl_create_bo+0x48/0x80 [panfrost]
> > >        drm_ioctl_kernel+0xb8/0x110
> > >        drm_ioctl+0x244/0x3f0
> > >        do_vfs_ioctl+0xbc/0x910
> > >        ksys_ioctl+0x78/0xa8
> > >        __arm64_sys_ioctl+0x1c/0x28
> > >        el0_svc_common.constprop.0+0x90/0x168
> > >        el0_svc_handler+0x28/0x78
> > >        el0_svc+0x8/0xc
> > >
> > > -> #0 (&shmem->pages_lock){+.+.}:
> > >        __lock_acquire+0xa2c/0x1d70
> > >        lock_acquire+0xdc/0x228
> > >        __mutex_lock+0x8c/0x800
> > >        mutex_lock_nested+0x1c/0x28
> > >        drm_gem_shmem_purge+0x20/0x40
> > >        panfrost_gem_shrinker_scan+0xc0/0x180 [panfrost]
> > >        do_shrink_slab+0x208/0x500
> > >        shrink_slab+0x10c/0x2c0
> > >        shrink_node+0x28c/0x4d8
> > >        balance_pgdat+0x2c8/0x570
> > >        kswapd+0x22c/0x638
> > >        kthread+0x128/0x130
> > >        ret_from_fork+0x10/0x18
> > >
> > > other info that might help us debug this:
> > >
> > >  Possible unsafe locking scenario:
> > >
> > >        CPU0                    CPU1
> > >        ----                    ----
> > >   lock(fs_reclaim);
> > >                                lock(&shmem->pages_lock);
> > >                                lock(fs_reclaim);
> > >   lock(&shmem->pages_lock);
> > >
> > >  *** DEADLOCK ***
> > >
> > > 3 locks held by kswapd0/171:
> > >  #0: 00000000f82369b6 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x40
> > >  #1: 00000000ceb37808 (shrinker_rwsem){++++}, at: shrink_slab+0xbc/0x2c0
> > >  #2: 00000000f31efa81 (&pfdev->shrinker_lock){+.+.}, at: panfrost_gem_shrinker_scan+0x34/0x180 [panfrost]
> > >
> > > Fixes: 17acb9f35ed7 ("drm/shmem: Add madvise state and purge helpers")
> > > Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
> > > Cc: Maxime Ripard <maxime.ripard@xxxxxxxxxxx>
> > > Cc: Sean Paul <sean@xxxxxxxxxx>
> > > Cc: David Airlie <airlied@xxxxxxxx>
> > > Cc: Daniel Vetter <daniel@xxxxxxxx>
> > > Signed-off-by: Rob Herring <robh@xxxxxxxxxx>
> > > ---
> > >  drivers/gpu/drm/drm_gem_shmem_helper.c | 7 +++++--
> > >  include/drm/drm_gem_shmem_helper.h     | 2 +-
> > >  2 files changed, 6 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > > index 5423ec56b535..f5918707672f 100644
> > > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > > @@ -415,13 +415,16 @@ void drm_gem_shmem_purge_locked(struct drm_gem_object *obj)
> > >  }
> > >  EXPORT_SYMBOL(drm_gem_shmem_purge_locked);
> > >
> > > -void drm_gem_shmem_purge(struct drm_gem_object *obj)
> > > +bool drm_gem_shmem_purge(struct drm_gem_object *obj)
> > >  {
> > >       struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
> > >
> > > -     mutex_lock(&shmem->pages_lock);
> > > +     if (!mutex_trylock(&shmem->pages_lock))
> >
> > Did you see my ping about cutting all the locking over to dma_resv?
> 
> Yes, but you didn't reply to Rob C. about it. I guess I'll have to go
> figure out how reservation objects work...

msm was the last driver that still used struct_mutex. It's a long-term
dead-end, and I think with all the effort recently to create helpers for
rendering drivers (shmem, vram, ttm refactoring) we should make a solid
attempt to get aligned. Or did you mean that Rob Clark had some
reply/questions that I didn't respond to because it fell through the cracks?

> > Would
> > align shmem helpers with ttm a lot more, for that bright glorious future
> > taste. Should we capture that in some todo.rst entry?
> 
> Sure.
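
For readers following the thread, the trylock-and-bail pattern from the
quoted patch can be sketched in userspace C. This is only an illustration
under stated assumptions: pthread_mutex_trylock() stands in for the kernel's
mutex_trylock(), and struct fake_bo and fake_bo_purge() are hypothetical
names, not the real drm_gem_shmem_object or drm_gem_shmem_purge().

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shmem-backed BO; not the real kernel struct. */
struct fake_bo {
	pthread_mutex_t pages_lock;
	bool purged;
};

/*
 * Try to purge without ever blocking on pages_lock.
 * Returns true if the BO was purged, false if the lock was already held
 * and the caller should skip this BO.
 *
 * Note the differing conventions: pthread_mutex_trylock() returns 0 on
 * success, while the kernel's mutex_trylock() returns 1 on success.
 */
static bool fake_bo_purge(struct fake_bo *bo)
{
	if (pthread_mutex_trylock(&bo->pages_lock) != 0)
		return false;	/* contended: bail instead of waiting */

	bo->purged = true;	/* placeholder for the real purge work */
	pthread_mutex_unlock(&bo->pages_lock);
	return true;
}
```

A shrinker-style caller would count only the BOs for which this returns true
and move on to the next one otherwise, so the reclaim path never sleeps on
pages_lock.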

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch