Re: [syzbot] [ext4?] possible deadlock in evict (3)

[obvious one for the ext4 people]

On Tue, Feb 28, 2023 at 09:25:55AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    ae3419fbac84 vc_screen: don't clobber return value in vcs_..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1136fe18c80000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=ff98a3b3c1aed3ab
> dashboard link: https://syzkaller.appspot.com/bug?extid=dd426ae4af71f1e74729
> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+dd426ae4af71f1e74729@xxxxxxxxxxxxxxxxxxxxxxxxx
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.2.0-syzkaller-12913-gae3419fbac84 #0 Not tainted
> ------------------------------------------------------
> kswapd0/100 is trying to acquire lock:
> ffff888047aea650 (sb_internal){.+.+}-{0:0}, at: evict+0x2ed/0x6b0 fs/inode.c:665
> 
> but task is already holding lock:
> ffffffff8c8e29e0 (fs_reclaim){+.+.}-{0:0}, at: set_task_reclaim_state mm/vmscan.c:200 [inline]
> ffffffff8c8e29e0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x170/0x1ac0 mm/vmscan.c:7338
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #3 (fs_reclaim){+.+.}-{0:0}:
>        __fs_reclaim_acquire mm/page_alloc.c:4716 [inline]
>        fs_reclaim_acquire+0x11d/0x160 mm/page_alloc.c:4730
>        might_alloc include/linux/sched/mm.h:271 [inline]
>        prepare_alloc_pages+0x159/0x570 mm/page_alloc.c:5362
>        __alloc_pages+0x149/0x5c0 mm/page_alloc.c:5580
>        alloc_pages+0x1aa/0x270 mm/mempolicy.c:2283
>        __get_free_pages+0xc/0x40 mm/page_alloc.c:5641
>        kasan_populate_vmalloc_pte mm/kasan/shadow.c:309 [inline]
>        kasan_populate_vmalloc_pte+0x27/0x150 mm/kasan/shadow.c:300
>        apply_to_pte_range mm/memory.c:2578 [inline]
>        apply_to_pmd_range mm/memory.c:2622 [inline]
>        apply_to_pud_range mm/memory.c:2658 [inline]
>        apply_to_p4d_range mm/memory.c:2694 [inline]
>        __apply_to_page_range+0x68c/0x1030 mm/memory.c:2728
>        alloc_vmap_area+0x536/0x1f20 mm/vmalloc.c:1638
>        __get_vm_area_node+0x145/0x3f0 mm/vmalloc.c:2495
>        __vmalloc_node_range+0x250/0x1300 mm/vmalloc.c:3141
>        kvmalloc_node+0x156/0x1a0 mm/util.c:628
>        kvmalloc include/linux/slab.h:737 [inline]
>        ext4_xattr_move_to_block fs/ext4/xattr.c:2570 [inline]

	buffer = kvmalloc(value_size, GFP_NOFS);

Yeah, this doesn't work the way the code implies it should. The gfp
mask is not passed down to the page table population code, which
hard-codes GFP_KERNEL allocations, so you have to do:

	unsigned int nofs_flags;

	nofs_flags = memalloc_nofs_save();
	buffer = kvmalloc(value_size, GFP_KERNEL);
	memalloc_nofs_restore(nofs_flags);

to apply GFP_NOFS to allocations in the pte population code to avoid
memory reclaim recursion in kvmalloc.
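
For context, a rough sketch of what the kvmalloc() call site in
ext4_xattr_move_to_block() could look like with the scoped API (the
buffer/value_size names come from the line quoted above; the error
handling around it is assumed, not taken from the actual source):

	#include <linux/sched/mm.h>	/* memalloc_nofs_save()/restore() */
	#include <linux/slab.h>		/* kvmalloc() */

	unsigned int nofs_flags;

	/*
	 * Enter a scoped NOFS section so the GFP_KERNEL allocations made
	 * internally by the vmalloc page table population code are treated
	 * as GFP_NOFS and cannot recurse into filesystem reclaim.
	 */
	nofs_flags = memalloc_nofs_save();
	buffer = kvmalloc(value_size, GFP_KERNEL);
	memalloc_nofs_restore(nofs_flags);
	if (!buffer)
		return -ENOMEM;	/* assumed error handling, not from the source */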

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


