On Tue, Oct 1, 2024 at 6:53 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Mon, Sep 30, 2024 at 11:44:06AM +0800, Yafang Shao wrote:
> > I encountered the following error messages on our test servers:
> >
> > [ 2553.303035] ======================================================
> > [ 2553.303692] WARNING: possible circular locking dependency detected
> > [ 2553.304363] 6.11.0+ #27 Not tainted
> > [ 2553.304732] ------------------------------------------------------
> > [ 2553.305398] python/129251 is trying to acquire lock:
> > [ 2553.305940] ffff89b18582e318 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0x70/0x190 [xfs]
> > [ 2553.307066]
> >                but task is already holding lock:
> > [ 2553.307682] ffffffffb4324de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x368/0xb10
> > [ 2553.308670]
> >                which lock already depends on the new lock.
>
> .....
>
> > [ 2553.342664]  Possible unsafe locking scenario:
> >
> > [ 2553.343621]        CPU0                    CPU1
> > [ 2553.344300]        ----                    ----
> > [ 2553.344957]   lock(fs_reclaim);
> > [ 2553.345510]                                lock(&xfs_nondir_ilock_class);
> > [ 2553.346326]                                lock(fs_reclaim);
> > [ 2553.347015]   rlock(&xfs_nondir_ilock_class);
> > [ 2553.347639]
> >                 *** DEADLOCK ***
> >
> > The deadlock is as follows,
> >
> >     CPU0                                  CPU1
> >    ------                                ------
> >
> >   alloc_anon_folio()
> >     vma_alloc_folio(__GFP_FS)
> >      fs_reclaim_acquire(__GFP_FS);
> >        __fs_reclaim_acquire();
> >
> >                                     xfs_attr_list()
> >                                       xfs_ilock()
> >                                       kmalloc(__GFP_FS);
> >                                         __fs_reclaim_acquire();
> >
> >        xfs_ilock
>
> Yet another lockdep false positive. listxattr() is not in a
> transaction context on a referenced inode, so GFP_KERNEL is correct.
> The problem is lockdep has no clue that fs_reclaim context can only
> lock unreferenced inodes, so we can actually run GFP_KERNEL context
> memory allocation with a locked, referenced inode safely.

Thanks for your detailed explanation.
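
To make the false positive concrete, here is a toy Python model of the reasoning above. It is only a sketch and not how lockdep is implemented: the `has_cycle` helper, the tuple encoding of locks, and the "referenced"/"unreferenced" tags are all assumptions made for illustration. It shows that tracking dependencies per lock class yields a cycle, while also tracking whether the inode is referenced does not.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the (held, acquired) pairs form a directed cycle."""
    graph = defaultdict(set)
    for held, acquired in edges:
        graph[held].add(acquired)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE (unvisited)

    def visit(node):
        colour[node] = GREY  # node is on the current DFS path
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: the path loops onto itself
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK  # fully explored, no cycle through here
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


# Hypothetical encoding: (lock class, inode state). The inode state is the
# information this toy model keeps and, per the explanation above, lockdep lacks.
ILOCK_REFERENCED = ("xfs_nondir_ilock_class", "referenced")
ILOCK_UNREFERENCED = ("xfs_nondir_ilock_class", "unreferenced")
FS_RECLAIM = ("fs_reclaim", None)

events = [
    (ILOCK_REFERENCED, FS_RECLAIM),    # listxattr: ILOCK held, GFP_KERNEL allocation
    (FS_RECLAIM, ILOCK_UNREFERENCED),  # reclaim: can only lock unreferenced inodes
]

# Class-only view: drop the inode state, as a per-class tracker would.
by_class = [(held[0], acquired[0]) for held, acquired in events]

class_only_cycle = has_cycle(by_class)
context_aware_cycle = has_cycle(events)
print("class-only view reports a cycle:", class_only_cycle)
print("context-aware view reports a cycle:", context_aware_cycle)
```

In the class-only view the two xfs_ilock() calls collapse into one node, so the graph loops back on itself; once the referenced and unreferenced inodes are kept apart, the same two events form a simple chain.
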
>
> We typically use __GFP_NOLOCKDEP on these sorts of allocations, but
> the long-term fix is to address the lockdep annotations to take
> reclaim context into account. We can't do that until the realtime
> inode subclasses are removed, which will give us the spare lockdep
> subclasses to add a reclaim context subclass. That is buried in the
> middle of a much larger rework:
>
> https://lore.kernel.org/linux-xfs/172437087542.59588.13853236455832390956.stgit@frogsfrogsfrogs/

Thank you for the reference link. While I’m not able to review the
patchset in detail, I’ll read through it to gain more understanding.

--
Regards
Yafang