Re: [syzbot] [mm?] [bcachefs?] WARNING in lock_list_lru_of_memcg

Alan Huang <mmpgouride@xxxxxxxxx> · Tue, 18 Feb 2025 02:09:21 +0800

On Feb 18, 2025, at 01:12, Kairui Song <ryncsn@xxxxxxxxx> wrote:
> 
> On Mon, Feb 17, 2025 at 12:13 AM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>> 
>> On Sat, Feb 15, 2025 at 7:24 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>>> 
>>> On Fri, 14 Feb 2025 10:11:19 -0800 syzbot <syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> 
>>>> syzbot has found a reproducer for the following issue on:
>>> 
>>> Thanks.  I doubt if bcachefs is implicated in this?
>>> 
>>>> HEAD commit:    128c8f96eb86 Merge tag 'drm-fixes-2025-02-14' of https://g..
>>>> git tree:       upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=148019a4580000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=c776e555cfbdb82d
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=38a0cbd267eff2d286ff
>>>> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12328bf8580000
>>>> 
>>>> Downloadable assets:
>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-128c8f96.raw.xz
>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/a97f78ac821e/vmlinux-128c8f96.xz
>>>> kernel image: https://storage.googleapis.com/syzbot-assets/f451cf16fc9f/bzImage-128c8f96.xz
>>>> mounted in repro: https://storage.googleapis.com/syzbot-assets/a7da783f97cf/mount_3.gz
>>>> 
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>> 
>>>> ------------[ cut here ]------------
>>>> WARNING: CPU: 0 PID: 5459 at mm/list_lru.c:96 lock_list_lru_of_memcg+0x39e/0x4d0 mm/list_lru.c:96
>>> 
>>>        VM_WARN_ON(!css_is_dying(&memcg->css));
>> 
>> I'm checking this. The last time this was triggered, it was caused by
>> a list_lru user that did not initialize the memcg list_lru properly
>> before list_lru reclaim started, and it was fixed by:
>> https://lore.kernel.org/all/20241222122936.67501-1-ryncsn@xxxxxxxxx/T/
>> 
>> This shouldn't be a big issue. Maybe there are leaks that will be
>> fixed upon reparenting, and this newly added sanity check might be
>> too lenient; I'm not 100% sure though.
>> 
>> Unfortunately, I haven't been able to reproduce the issue locally
>> with the reproducer yet. I will keep the test running and see if it
>> can hit this WARN_ON.
> 
> So far I am still unable to trigger this VM_WARN_ON using the
> reproducer, and I'm seeing many other random crashes.
> 
> But after I changed the .config a bit, adding more debug configs
> (SLAB_FREELIST_HARDENED, DEBUG_PAGEALLOC), the following crash showed
> up, and it triggers immediately after I start the test:
> 
> [ T1242] BUG: unable to handle page fault for address: ffff888054c60000
> [ T1242] #PF: supervisor read access in kernel mode
> [ T1242] #PF: error_code(0x0000) - not-present page
> [ T1242] PGD 19e01067 P4D 19e01067 PUD 19e04067 PMD 7fc5c067 PTE 800fffffab39f060
> [ T1242] Oops: Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
> [ T1242] CPU: 1 UID: 0 PID: 1242 Comm: kworker/1:1H Not tainted 6.14.0-rc2-00185-g128c8f96eb86 #2
> [ T1242] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-4.module+el8.8.0+664+0a3d6c83 04/01/2014
> [ T1242] Workqueue: bcachefs_btree_read_complete btree_node_read_work
> [ T1242] RIP: 0010:validate_bset_keys+0xae3/0x14f0
> [ T6058] bcachefs (loop2): empty btree root xattrs
> [ T1242] Code: 49 39 df 0f 87 fc 09 00 00 e8 79 54 a8 fd 41 0f b7 c6 48 8b 4c 24 68 48 8d 04 c1 4c 29 f8 48 c1 e8 03 89 c1 48 89 de 4c 89 ff <f3> 48 a5 48 8b bc 24 c8 00 00 08
> [ T1242] RSP: 0018:ffffc900070a72c0 EFLAGS: 00010206
> [ T1242] RAX: 000000000000ec0f RBX: ffff888054c20110 RCX: 0000000000006c31
> [ T1242] RDX: 0000000000000000 RSI: ffff888054c60000 RDI: ffff888054c5ff90
> [ T1242] RBP: ffffc900070a7570 R08: ffff888065e001af R09: 1ffff1100cbc0035
> [ T1242] R10: dffffc0000000000 R11: ffffed100cbc0036 R12: ffff888054c2009e
> [ T1242] R13: dffffc0000000000 R14: 000000000000ec0f R15: ffff888054c200a0
> [ T1242] FS:  0000000000000000(0000) GS:ffff88807ea00000(0000) knlGS:0000000000000000
> [ T1242] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ T1242] CR2: ffff888054c60000 CR3: 000000006cea6000 CR4: 00000000000006f0
> [ T1242] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ T1242] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ T1242] Call Trace:
> [ T1242]  <TASK>
> [ T1242]  bch2_btree_node_read_done+0x1d20/0x53a0
> [ T1242]  btree_node_read_work+0x54d/0xdc0
> [ T1242]  process_scheduled_works+0xaf8/0x17f0
> [ T1242]  worker_thread+0x89d/0xd60
> [ T1242]  kthread+0x722/0x890
> [ T1242]  ret_from_fork+0x4e/0x80
> [ T1242]  ret_from_fork_asm+0x1a/0x30
> [ T1242]  </TASK>
> [ T1242] Modules linked in:
> [ T1242] ---[ end trace 0000000000000000 ]---
> [ T1242] RIP: 0010:validate_bset_keys+0xae3/0x14f0
> [ T1242] Code: 49 39 df 0f 87 fc 09 00 00 e8 79 54 a8 fd 41 0f b7 c6 48 8b 4c 24 68 48 8d 04 c1 4c 29 f8 48 c1 e8 03 89 c1 48 89 de 4c 89 ff <f3> 48 a5 48 8b bc 24 c8 00 00 08
> [ T1242] RSP: 0018:ffffc900070a72c0 EFLAGS: 00010206
> [ T1242] RAX: 000000000000ec0f RBX: ffff888054c20110 RCX: 0000000000006c31
> [ T1242] RDX: 0000000000000000 RSI: ffff888054c60000 RDI: ffff888054c5ff90
> [ T1242] RBP: ffffc900070a7570 R08: ffff888065e001af R09: 1ffff1100cbc0035
> [ T1242] R10: dffffc0000000000 R11: ffffed100cbc0036 R12: ffff888054c2009e
> [ T1242] R13: dffffc0000000000 R14: 000000000000ec0f R15: ffff888054c200a0
> [ T1242] FS:  0000000000000000(0000) GS:ffff88807ea00000(0000) knlGS:0000000000000000
> [ T1242] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ T1242] CR2: ffff888054c60000 CR3: 000000006cea6000 CR4: 00000000000006f0
> [ T1242] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ T1242] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ T1242] Kernel panic - not syncing: Fatal exception
> [ T1242] Kernel Offset: disabled
> [ T1242] Rebooting in 86400 seconds..
> 
> It's caused by the memmove_u64s_down() call in validate_bset_keys()
> in fs/bcachefs/btree_io.c:
> -> memmove_u64s_down(k, bkey_p_next(k), (u64 *) vstruct_end(i) - (u64 *) k);


Might need this.

diff --git a/fs/bcachefs/btree_io.c b/fs/bcachefs/btree_io.c
index e71b278672b6..fb53174cb735 100644
--- a/fs/bcachefs/btree_io.c
+++ b/fs/bcachefs/btree_io.c
@@ -997,7 +997,7 @@ static int validate_bset_keys(struct bch_fs *c, struct btree *b,
                }
 got_good_key:
                le16_add_cpu(&i->u64s, -next_good_key);
-               memmove_u64s_down(k, bkey_p_next(k), (u64 *) vstruct_end(i) - (u64 *) k);
+               memmove_u64s_down(k, bkey_p_next(k), (u64 *) vstruct_end(i) - (u64 *) bkey_p_next(k));
                set_btree_node_need_rewrite(b);
        }
 fsck_err:

> 
> The bkey_p_next(k) is RSI: ffff888054c60000 and it's causing an
> out-of-bounds access.
> (u64 *) vstruct_end(i) - (u64 *) k is RCX: 0000000000006c31; if added
> to RDI, this should cause an out-of-bounds write as well.
> 
> This seems to indicate there is an out-of-bounds memory modification?
> And maybe it corrupted other subsystems? The slight change to .config
> changed the layout, so it's causing a fault; maybe previously this
> just went on silently.
> I don't know much about bcachefs, so I would be grateful if the
> bcachefs people could help have a look.
>