Re: xfs : WARNING: possible circular locking dependency detected

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 18 Apr 2024 15:04:24 +1000

On Thu, Apr 18, 2024 at 11:39:25AM +0800, Xiubo Li wrote:
> Hi all
> 
> BTW, is this a known issue and has it been fixed already ? I can reproduce
> this always with my VMs:
> 
> 
> <4>[ 9009.171195]
> <4>[ 9009.171205] ======================================================
> <4>[ 9009.171208] WARNING: possible circular locking dependency detected
> <4>[ 9009.171211] 6.9.0-rc3+ #49 Not tainted
> <4>[ 9009.171214] ------------------------------------------------------
> <4>[ 9009.171216] kswapd0/149 is trying to acquire lock:
> <4>[ 9009.171219] ffff88811346a920 (&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_reclaim_inode+0x3ac/0x590 [xfs]
> <4>[ 9009.171580]
> <4>[ 9009.171580] but task is already holding lock:
> <4>[ 9009.171583] ffffffff8bb33100 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x5d9/0xad0
> <4>[ 9009.171593]
> <4>[ 9009.171593] which lock already depends on the new lock.
> <4>[ 9009.171593]
> <4>[ 9009.171595]
> <4>[ 9009.171595] the existing dependency chain (in reverse order) is:
> <4>[ 9009.171597]
> <4>[ 9009.171597] -> #1 (fs_reclaim){+.+.}-{0:0}:
> <4>[ 9009.171603]        __lock_acquire+0x7da/0x1030
> <4>[ 9009.171610]        lock_acquire+0x15d/0x400
> <4>[ 9009.171614]        fs_reclaim_acquire+0xb5/0x100
> <4>[ 9009.171618]        prepare_alloc_pages.constprop.0+0xc5/0x230
> <4>[ 9009.171622]        __alloc_pages+0x12a/0x3f0
> <4>[ 9009.171625]        alloc_pages_mpol+0x175/0x340
> <4>[ 9009.171630]        stack_depot_save_flags+0x4c5/0x510
> <4>[ 9009.171635]        kasan_save_stack+0x30/0x40
> <4>[ 9009.171640]        kasan_save_track+0x10/0x30
> <4>[ 9009.171643]        __kasan_slab_alloc+0x83/0x90
> <4>[ 9009.171646]        kmem_cache_alloc+0x15e/0x4a0
> <4>[ 9009.171652]        __alloc_object+0x35/0x370
> <4>[ 9009.171659]        __create_object+0x22/0x90
> <4>[ 9009.171665]        __kmalloc_node_track_caller+0x477/0x5b0
> <4>[ 9009.171672]        krealloc+0x5f/0x110
> <4>[ 9009.171679]        xfs_iext_insert_raw+0x4b2/0x6e0 [xfs]
> <4>[ 9009.172172]        xfs_iext_insert+0x2e/0x130 [xfs]

The only krealloc() in this path is:

	new = krealloc(ifp->if_data, new_size,
			GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);

And it explicitly uses __GFP_NOLOCKDEP to tell lockdep not to warn
about this allocation because of this false positive situation.

Oh. I've seen this before. This is a KASAN bug, and I'm pretty sure
I posted a patch to fix it a fair while back that nobody seemed to
care about enough to review or merge.

That is: kasan_save_stack() is doing a fixed GFP_KERNEL allocation
in a context where GFP_KERNEL allocations are known to generate
lockdep false positives.  This occurs despite the XFS and general
memory allocation code doing exactly the right thing to avoid the
lockdep false positives (i.e. using and obeying __GFP_NOLOCKDEP).

The kasan code ends up in stack_depot_save_flags(), which does a
GFP_KERNEL allocation but filters out __GFP_NOLOCKDEP and does not
add it back. Hence kasan generates the false positive lockdep
warnings, not the code doing the original allocation.
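
To illustrate the flag handling described above, here is a rough
userspace sketch. It is not the kernel code: the DEMO_* bit values,
the mask and both helper names are made up for illustration, and the
"preserving" variant is only one possible shape for a fix, not a
merged patch.

```c
/* Hypothetical flag bits for illustration only; NOT the kernel's values. */
typedef unsigned int demo_gfp_t;

#define DEMO_GFP_KERNEL		0x01u
#define DEMO_GFP_NOFAIL		0x02u
#define DEMO_GFP_NOLOCKDEP	0x04u

/* Hypothetical allow-list of caller flags that a nested allocation keeps. */
#define DEMO_NESTED_MASK	(DEMO_GFP_KERNEL)

/*
 * Models the problem: the nested allocation filters the caller's flags
 * through a mask, so the caller's NOLOCKDEP bit is silently dropped.
 */
static demo_gfp_t demo_nested_flags_lossy(demo_gfp_t caller)
{
	return (caller & DEMO_NESTED_MASK) | DEMO_GFP_KERNEL;
}

/*
 * One possible fix (an assumption): apply the same filter, then add the
 * caller's NOLOCKDEP bit back so the nested allocation inherits it.
 */
static demo_gfp_t demo_nested_flags_preserving(demo_gfp_t caller)
{
	demo_gfp_t flags = (caller & DEMO_NESTED_MASK) | DEMO_GFP_KERNEL;

	flags |= caller & DEMO_GFP_NOLOCKDEP;
	return flags;
}
```

With caller flags of DEMO_GFP_KERNEL | DEMO_GFP_NOLOCKDEP |
DEMO_GFP_NOFAIL, the lossy helper returns only DEMO_GFP_KERNEL, which
is the case that lets lockdep warn on the nested allocation. The
preserving helper returns DEMO_GFP_KERNEL | DEMO_GFP_NOLOCKDEP.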

kasan and/or stack_depot_save_flags() needs fixing here.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx