On Tue, Feb 18, 2025 at 2:09 AM Alan Huang <mmpgouride@xxxxxxxxx> wrote:
>
> On Feb 18, 2025, at 01:12, Kairui Song <ryncsn@xxxxxxxxx> wrote:
> >
> > On Mon, Feb 17, 2025 at 12:13 AM Kairui Song <ryncsn@xxxxxxxxx> wrote:
> >>
> >> On Sat, Feb 15, 2025 at 7:24 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>> On Fri, 14 Feb 2025 10:11:19 -0800 syzbot <syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>>> syzbot has found a reproducer for the following issue on:
> >>>
> >>> Thanks. I doubt if bcachefs is implicated in this?
> >>>
> >>>> HEAD commit: 128c8f96eb86 Merge tag 'drm-fixes-2025-02-14' of https://g..
> >>>> git tree: upstream
> >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=148019a4580000
> >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=c776e555cfbdb82d
> >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=38a0cbd267eff2d286ff
> >>>> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12328bf8580000
> >>>>
> >>>> Downloadable assets:
> >>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-128c8f96.raw.xz
> >>>> vmlinux: https://storage.googleapis.com/syzbot-assets/a97f78ac821e/vmlinux-128c8f96.xz
> >>>> kernel image: https://storage.googleapis.com/syzbot-assets/f451cf16fc9f/bzImage-128c8f96.xz
> >>>> mounted in repro: https://storage.googleapis.com/syzbot-assets/a7da783f97cf/mount_3.gz
> >>>>
> >>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>> Reported-by: syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx
> >>>>
> >>>> ------------[ cut here ]------------
> >>>> WARNING: CPU: 0 PID: 5459 at mm/list_lru.c:96 lock_list_lru_of_memcg+0x39e/0x4d0 mm/list_lru.c:96
> >>>
> >>> VM_WARN_ON(!css_is_dying(&memcg->css));
> >>
> >> I'm checking this. The last time this was triggered, it was caused by
> >> a list_lru user that did not initialize the memcg list_lru properly before
> >> list_lru reclaim started, and it was fixed by:
> >> https://lore.kernel.org/all/20241222122936.67501-1-ryncsn@xxxxxxxxx/T/
> >>
> >> This shouldn't be a big issue; maybe there are leaks that will be
> >> fixed upon reparenting, and this newly added sanity check might be too
> >> aggressive, though I'm not 100% sure.
> >>
> >> Unfortunately I couldn't reproduce the issue locally with the
> >> reproducer yet. I will keep the test running and see if it can hit this
> >> WARN_ON.
> >
> > So far I am still unable to trigger this VM_WARN_ON using the
> > reproducer, and I'm seeing many other random crashes.
> >
> > But after I changed the .config a bit, adding more debug configs
> > (SLAB_FREELIST_HARDENED, DEBUG_PAGEALLOC), the following crash showed up
> > and is triggered immediately after I start the test:
> >
> > [ T1242] BUG: unable to handle page fault for address: ffff888054c60000
> > [ T1242] #PF: supervisor read access in kernel mode
> > [ T1242] #PF: error_code(0x0000) - not-present page
> > [ T1242] PGD 19e01067 P4D 19e01067 PUD 19e04067 PMD 7fc5c067 PTE
> > 800fffffab39f060
> > [ T1242] Oops: Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
> > [ T1242] CPU: 1 UID: 0 PID: 1242 Comm: kworker/1:1H Not tainted
> > 6.14.0-rc2-00185-g128c8f96eb86 #2
> > [ T1242] Hardware name: Red Hat KVM/RHEL-AV, BIOS
> > 1.16.0-4.module+el8.8.0+664+0a3d6c83 04/01/2014
> > [ T1242] Workqueue: bcachefs_btree_read_complete btree_node_read_work
> > [ T1242] RIP: 0010:validate_bset_keys+0xae3/0x14f0
> > [ T6058] bcachefs (loop2): empty btree root xattrs
> > [ T1242] Code: 49 39 df 0f 87 fc 09 00 00 e8 79 54 a8 fd 41 0f b7 c6
> > 48 8b 4c 24 68 48 8d 04 c1 4c 29 f8 48 c1 e8 03 89 c1 48 89 de 4c 89
> > ff <f3> 48 a5 48 8b bc 24 c8 00 00 08
> > [ T1242] RSP: 0018:ffffc900070a72c0 EFLAGS: 00010206
> > [ T1242] RAX: 000000000000ec0f RBX: ffff888054c20110 RCX: 0000000000006c31
> > [ T1242] RDX: 0000000000000000 RSI: ffff888054c60000 RDI: ffff888054c5ff90
> > [ T1242] RBP: ffffc900070a7570 R08: ffff888065e001af R09: 1ffff1100cbc0035
> > [ T1242] R10: dffffc0000000000 R11: ffffed100cbc0036 R12: ffff888054c2009e
> > [ T1242] R13: dffffc0000000000 R14: 000000000000ec0f R15: ffff888054c200a0
> > [ T1242] FS: 0000000000000000(0000) GS:ffff88807ea00000(0000)
> > knlGS:0000000000000000
> > [ T1242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ T1242] CR2: ffff888054c60000 CR3: 000000006cea6000 CR4: 00000000000006f0
> > [ T1242] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ T1242] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ T1242] Call Trace:
> > [ T1242] <TASK>
> > [ T1242] bch2_btree_node_read_done+0x1d20/0x53a0
> > [ T1242] btree_node_read_work+0x54d/0xdc0
> > [ T1242] process_scheduled_works+0xaf8/0x17f0
> > [ T1242] worker_thread+0x89d/0xd60
> > [ T1242] kthread+0x722/0x890
> > [ T1242] ret_from_fork+0x4e/0x80
> > [ T1242] ret_from_fork_asm+0x1a/0x30
> > [ T1242] </TASK>
> > [ T1242] Modules linked in:
> > [ T1242] ---[ end trace 0000000000000000 ]---
> > [ T1242] RIP: 0010:validate_bset_keys+0xae3/0x14f0
> > [ T1242] Code: 49 39 df 0f 87 fc 09 00 00 e8 79 54 a8 fd 41 0f b7 c6
> > 48 8b 4c 24 68 48 8d 04 c1 4c 29 f8 48 c1 e8 03 89 c1 48 89 de 4c 89
> > ff <f3> 48 a5 48 8b bc 24 c8 00 00 08
> > [ T1242] RSP: 0018:ffffc900070a72c0 EFLAGS: 00010206
> > [ T1242] RAX: 000000000000ec0f RBX: ffff888054c20110 RCX: 0000000000006c31
> > [ T1242] RDX: 0000000000000000 RSI: ffff888054c60000 RDI: ffff888054c5ff90
> > [ T1242] RBP: ffffc900070a7570 R08: ffff888065e001af R09: 1ffff1100cbc0035
> > [ T1242] R10: dffffc0000000000 R11: ffffed100cbc0036 R12: ffff888054c2009e
> > [ T1242] R13: dffffc0000000000 R14: 000000000000ec0f R15: ffff888054c200a0
> > [ T1242] FS: 0000000000000000(0000) GS:ffff88807ea00000(0000)
> > knlGS:0000000000000000
> > [ T1242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ T1242] CR2: ffff888054c60000 CR3: 000000006cea6000 CR4: 00000000000006f0
> > [ T1242] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ T1242] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ T1242] Kernel panic - not syncing: Fatal exception
> > [ T1242] Kernel Offset: disabled
> > [ T1242] Rebooting in 86400 seconds..
> >
> > It's caused by the memmove_u64s_down in validate_bset_keys of
> > fs/bcachefs/btree_io.c:
> > -> memmove_u64s_down(k, bkey_p_next(k), (u64 *) vstruct_end(i) - (u64 *) k);
>
> Might need this.
>
> diff --git a/fs/bcachefs/btree_io.c b/fs/bcachefs/btree_io.c
> index e71b278672b6..fb53174cb735 100644
> --- a/fs/bcachefs/btree_io.c
> +++ b/fs/bcachefs/btree_io.c
> @@ -997,7 +997,7 @@ static int validate_bset_keys(struct bch_fs *c, struct btree *b,
> }
> got_good_key:
> le16_add_cpu(&i->u64s, -next_good_key);
> - memmove_u64s_down(k, bkey_p_next(k), (u64 *) vstruct_end(i) - (u64 *) k);
> + memmove_u64s_down(k, bkey_p_next(k), (u64 *) vstruct_end(i) - (u64 *) bkey_p_next(k));
> set_btree_node_need_rewrite(b);
> }
> fsck_err:
>

Thanks, but this didn't fix everything. I think the problem is more
complex: syzbot seems to be mounting a damaged bcachefs image (on
purpose, I think), so vstruct_end(i) is already returning an offset that
is out of bounds. I retriggered it and printed some more debug info:
i->_data is ffff88806d5c00a0, i->u64s is 60928, and the faulting address
is ffff88806d600000.
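
For reference, here is a quick standalone check of those numbers
(ordinary userspace C, not kernel code; it only assumes that
vstruct_end(i) is i->_data plus i->u64s 64-bit words):

/*
 * Back-of-the-envelope check of the values reported above: how far does
 * _data + u64s * sizeof(u64) reach past the faulting address?
 */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t data  = 0xffff88806d5c00a0ULL;	/* i->_data */
	uint64_t u64s  = 60928;			/* i->u64s */
	uint64_t fault = 0xffff88806d600000ULL;	/* faulting address */
	uint64_t end   = data + u64s * 8;	/* claimed vstruct_end(i) */

	printf("claimed end of keys: 0x%llx\n", (unsigned long long)end);
	printf("fault is %llu bytes past _data\n",
	       (unsigned long long)(fault - data));
	printf("claimed end is %llu bytes past the fault\n",
	       (unsigned long long)(end - fault));
	return 0;
}

So i->u64s claims roughly 476 KiB of keys, the copy faults about 256 KiB
past _data (presumably the end of the mapped node buffer), and the
claimed end is still about 220 KiB beyond the faulting page, which looks
like a u64s value corrupted by the damaged image rather than a small
off-by-one.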
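
That said, the length-argument change in the patch above still looks
right on its own. Below is a minimal sketch of the over-read it removes,
assuming memmove_u64s_down(dst, src, n) copies n 64-bit words from src
down to dst; the names k, end, k_u64s and area are illustrative, not the
real bcachefs structures:

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for memmove_u64s_down(): reads src[0] .. src[n - 1]. */
static void move_u64s_down(uint64_t *dst, const uint64_t *src, size_t n)
{
	memmove(dst, src, n * sizeof(uint64_t));
}

/* Drop a bad key 'k' of 'k_u64s' words from a packed key area ending at 'end'. */
static void drop_bad_key(uint64_t *k, uint64_t *end, size_t k_u64s)
{
	uint64_t *next = k + k_u64s;	/* plays the role of bkey_p_next(k) */

	/*
	 * With length (end - k) the source is 'next', so the copy reads
	 * k_u64s words past 'end' -- the over-read in the original call:
	 *
	 *	move_u64s_down(k, next, end - k);
	 *
	 * With length (end - next) it copies exactly the keys that remain:
	 */
	move_u64s_down(k, next, end - next);
}

int main(void)
{
	uint64_t area[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

	/* Drop a 2-word "key" starting at area[2]; the area ends at area + 8. */
	drop_bad_key(area + 2, area + 8, 2);
	return 0;
}

With the original (end - k) length, the same call would read
area[4]..area[9], i.e. two words past the end of the array. That
over-read is bounded by the size of the dropped key, though, which is
why fixing it alone doesn't help once a corrupted u64s pushes
vstruct_end(i) itself far past the buffer.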