Re: [PATCH] xfs: Fix circular locking during xfs inode reclamation

On Tue, Oct 1, 2024 at 6:53 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Mon, Sep 30, 2024 at 11:44:06AM +0800, Yafang Shao wrote:
> > I encountered the following error messages on our test servers:
> >
> > [ 2553.303035] ======================================================
> > [ 2553.303692] WARNING: possible circular locking dependency detected
> > [ 2553.304363] 6.11.0+ #27 Not tainted
> > [ 2553.304732] ------------------------------------------------------
> > [ 2553.305398] python/129251 is trying to acquire lock:
> > [ 2553.305940] ffff89b18582e318 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0x70/0x190 [xfs]
> > [ 2553.307066]
> > but task is already holding lock:
> > [ 2553.307682] ffffffffb4324de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x368/0xb10
> > [ 2553.308670]
> > which lock already depends on the new lock.
>
> .....
>
> > [ 2553.342664]  Possible unsafe locking scenario:
> >
> > [ 2553.343621]        CPU0                    CPU1
> > [ 2553.344300]        ----                    ----
> > [ 2553.344957]   lock(fs_reclaim);
> > [ 2553.345510]                                lock(&xfs_nondir_ilock_class);
> > [ 2553.346326]                                lock(fs_reclaim);
> > [ 2553.347015]   rlock(&xfs_nondir_ilock_class);
> > [ 2553.347639]
> >  *** DEADLOCK ***
> >
> > The deadlock is as follows,
> >
> >     CPU0                                  CPU1
> >    ------                                ------
> >
> >   alloc_anon_folio()
> >     vma_alloc_folio(__GFP_FS)
> >      fs_reclaim_acquire(__GFP_FS);
> >        __fs_reclaim_acquire();
> >
> >                                     xfs_attr_list()
> >                                       xfs_ilock()
> >                                       kmalloc(__GFP_FS);
> >                                         __fs_reclaim_acquire();
> >
> >        xfs_ilock
>
> Yet another lockdep false positive. listxattr() is not in a
> transaction context on a referenced inode, so GFP_KERNEL is correct.
> The problem is lockdep has no clue that fs_reclaim context can only
> lock unreferenced inodes, so we can actually run GFP_KERNEL context
> memory allocation with a locked, referenced inode safely.

Thanks for your detailed explanation.
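
To picture why lockdep flags this, here is a minimal userspace sketch of
class-based lock-order tracking. It is a toy model, not the kernel's
lockdep: the names dep_graph, dep_init, dep_reachable and dep_acquire are
hypothetical. The point it shows is that dependencies are recorded per
lock class, so an ilock held on a referenced inode and an ilock taken on
an unreferenced inode in reclaim are indistinguishable to the checker.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy lock classes; every inode ilock shares one class, as in the report. */
enum lock_class { FS_RECLAIM, XFS_NONDIR_ILOCK, NR_CLASSES };

/* edge[a][b] is true once class b has been taken while class a was held. */
struct dep_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void dep_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool dep_reachable(const struct dep_graph *g, enum lock_class from,
			  enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int n = 0; n < NR_CLASSES; n++) {
		if (g->edge[from][n] && !seen[n] &&
		    dep_reachable(g, (enum lock_class)n, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "next taken while held" and return true if the new edge closes a
 * cycle, i.e. if 'held' was already reachable from 'next'.
 */
static bool dep_acquire(struct dep_graph *g, enum lock_class held,
			enum lock_class next)
{
	bool seen[NR_CLASSES] = { false };
	bool cycle;

	assert(held < NR_CLASSES && next < NR_CLASSES);
	cycle = dep_reachable(g, next, held, seen);
	g->edge[held][next] = true;
	return cycle;
}
```

Recording XFS_NONDIR_ILOCK -> FS_RECLAIM (the listxattr allocation) and
then FS_RECLAIM -> XFS_NONDIR_ILOCK (inode reclaim) makes the second
dep_acquire() call report a cycle, even though the two ilocks can never
belong to the same inode.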

>
> We typically use __GFP_NOLOCKDEP on these sorts of allocations, but
> the long term fix is to address the lockdep annotations to take
> reclaim context into account. We can't do that until the realtime
> inode subclasses are removed, which will give us the spare lockdep
> subclasses to add a reclaim context subclass. That is buried in the
> middle of a much larger rework:
>
> https://lore.kernel.org/linux-xfs/172437087542.59588.13853236455832390956.stgit@frogsfrogsfrogs/
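
The two approaches described above can be sketched in the same toy
userspace style. Everything here is an assumption made for illustration:
TOY_GFP_FS and TOY_GFP_NOLOCKDEP are hypothetical stand-ins rather than
the kernel's gfp_t bits, ILOCK_RECLAIM stands in for the proposed reclaim
context subclass, and the checker only looks for a direct two-class
inversion instead of walking a full dependency graph.

```c
#include <stdbool.h>
#include <string.h>

/* ILOCK_RECLAIM models a separate subclass for inodes locked in reclaim. */
enum lock_class { FS_RECLAIM, ILOCK_REFERENCED, ILOCK_RECLAIM, NR_CLASSES };

/* Hypothetical flag bits for this sketch only. */
#define TOY_GFP_FS        0x1u
#define TOY_GFP_NOLOCKDEP 0x2u

/* edge[a][b] is true once class b has been taken while class a was held. */
struct dep_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void dep_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record held -> next; report only a direct inversion (simplification). */
static bool dep_acquire(struct dep_graph *g, enum lock_class held,
			enum lock_class next)
{
	bool inversion = g->edge[next][held];

	g->edge[held][next] = true;
	return inversion;
}

/*
 * Model an allocation made while 'held' is locked. Only allocations that
 * may recurse into the filesystem and are not marked NOLOCKDEP record a
 * dependency on the reclaim pseudo-lock.
 */
static bool toy_alloc(struct dep_graph *g, enum lock_class held,
		      unsigned int flags)
{
	if (!(flags & TOY_GFP_FS) || (flags & TOY_GFP_NOLOCKDEP))
		return false;
	return dep_acquire(g, held, FS_RECLAIM);
}
```

With TOY_GFP_NOLOCKDEP set, no ilock -> reclaim edge is recorded, so the
later reclaim-side ilock does not trip the check. With a distinct
ILOCK_RECLAIM class the edge is recorded, but reclaim takes a different
class and no inversion is seen. Reusing ILOCK_REFERENCED on the reclaim
side reproduces the report.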

Thank you for the reference link. While I’m not able to review the
patchset in detail, I’ll read through it to gain more understanding.

-- 
Regards
Yafang




