Thanks! Looking.

Sasha Levin <sashal@xxxxxxxxxx> wrote on Tue, Dec 17, 2024 at 02:39:
>
> On Sun, Dec 15, 2024 at 07:45:38PM -0700, Yu Zhao wrote:
> >Hi Kairui,
> >
> >On Sun, Dec 15, 2024 at 10:45 AM Kairui Song <ryncsn@xxxxxxxxx> wrote:
> >>
> >> On Sun, Dec 15, 2024 at 3:43 AM Kairui Song <ryncsn@xxxxxxxxx> wrote:
> >> >
> >> > On Sat, Dec 14, 2024 at 2:06 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >> > >
> >> > > On Fri, Dec 13, 2024 at 8:56 PM syzbot
> >> > > <syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >> > > >
> >> > > > Hello,
> >> > > >
> >> > > > syzbot found the following issue on:
> >> > > >
> >> > > > HEAD commit:    7cb1b4663150 Merge tag 'locking_urgent_for_v6.13_rc3' of g..
> >> > > > git tree:       upstream
> >> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=16e96b30580000
> >> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=fee25f93665c89ac
> >> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=38a0cbd267eff2d286ff
> >> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >> > > >
> >> > > > Unfortunately, I don't have any reproducer for this issue yet.
> >> > > >
> >> > > > Downloadable assets:
> >> > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-7cb1b466.raw.xz
> >> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/13e083329dab/vmlinux-7cb1b466.xz
> >> > > > kernel image: https://storage.googleapis.com/syzbot-assets/fe3847d08513/bzImage-7cb1b466.xz
> >> > > >
> >> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> > > > Reported-by: syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx
> >> > > >
> >> > > > ------------[ cut here ]------------
> >> > > > WARNING: CPU: 0 PID: 80 at mm/list_lru.c:97 lock_list_lru_of_memcg+0x395/0x4e0 mm/list_lru.c:97
> >> > > > Modules linked in:
> >> > > > CPU: 0 UID: 0 PID: 80 Comm: kswapd0 Not tainted 6.13.0-rc2-syzkaller-00018-g7cb1b4663150 #0
> >> > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> >> > > > RIP: 0010:lock_list_lru_of_memcg+0x395/0x4e0 mm/list_lru.c:97
> >> > > > Code: e9 22 fe ff ff e8 9b cc b6 ff 4c 8b 7c 24 10 45 84 f6 0f 84 40 ff ff ff e9 37 01 00 00 e8 83 cc b6 ff eb 05 e8 7c cc b6 ff 90 <0f> 0b 90 eb 97 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 7a fd ff ff 48
> >> > > > RSP: 0018:ffffc9000105e798 EFLAGS: 00010093
> >> > > > RAX: ffffffff81e891c4 RBX: 0000000000000000 RCX: ffff88801f53a440
> >> > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >> > > > RBP: ffff888042e70054 R08: ffffffff81e89156 R09: 1ffffffff2032cae
> >> > > > R10: dffffc0000000000 R11: fffffbfff2032caf R12: ffffffff81e88e5e
> >> > > > R13: ffffffff9a3feb20 R14: 0000000000000000 R15: ffff888042e70000
> >> > > > FS:  0000000000000000(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> >> > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> > > > CR2: 0000000020161000 CR3: 0000000032d12000 CR4: 0000000000352ef0
> >> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> > > > Call Trace:
> >> > > >  <TASK>
> >> > > >  list_lru_add+0x59/0x270 mm/list_lru.c:164
> >> > > >  list_lru_add_obj+0x17b/0x250 mm/list_lru.c:187
> >> > > >  workingset_update_node+0x1af/0x230 mm/workingset.c:634
> >> > > >  xas_update lib/xarray.c:355 [inline]
> >> > > >  update_node lib/xarray.c:758 [inline]
> >> > > >  xas_store+0xb8f/0x1890 lib/xarray.c:845
> >> > > >  page_cache_delete mm/filemap.c:149 [inline]
> >> > > >  __filemap_remove_folio+0x4e9/0x670 mm/filemap.c:232
> >> > > >  __remove_mapping+0x86f/0xad0 mm/vmscan.c:791
> >> > > >  shrink_folio_list+0x30a6/0x5ca0 mm/vmscan.c:1467
> >> > > >  evict_folios+0x3c86/0x5800 mm/vmscan.c:4593
> >> > > >  try_to_shrink_lruvec+0x9a6/0xc70 mm/vmscan.c:4789
> >> > > >  shrink_one+0x3b9/0x850 mm/vmscan.c:4834
> >> > > >  shrink_many mm/vmscan.c:4897 [inline]
> >> > > >  lru_gen_shrink_node mm/vmscan.c:4975 [inline]
> >> > > >  shrink_node+0x37c5/0x3e50 mm/vmscan.c:5956
> >> > > >  kswapd_shrink_node mm/vmscan.c:6785 [inline]
> >> > > >  balance_pgdat mm/vmscan.c:6977 [inline]
> >> > > >  kswapd+0x1ca9/0x36f0 mm/vmscan.c:7246
> >> > > >  kthread+0x2f0/0x390 kernel/kthread.c:389
> >> > > >  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> >> > > >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> >> > > >  </TASK>
> >> > >
> >> > > This one seems to be related to "mm/list_lru: split the lock to
> >> > > per-cgroup scope".
> >> > >
> >> > > Kairui, can you please take a look? Thanks.
> >> >
> >> > Thanks for pinging; yes, that's a new sanity check added by me.
> >> >
> >> > It is supposed to mean that a list_lru is being reparented while the
> >> > memcg it belongs to isn't dying.
> >> >
> >> > More concretely, a list_lru is marked dead by memcg_offline_kmem ->
> >> > memcg_reparent_list_lrus. If that happens for a memcg that is not
> >> > dying, this WARN triggers. I'm not sure how this is caused.
> >> > One possibility is if alloc_shrinker_info() in
> >> > mem_cgroup_css_online failed, then memcg_offline_kmem is called
> >> > early? That doesn't seem to fit this case though. Or maybe it is
> >> > just a synchronization issue with the memcg dying flag, so the
> >> > user saw the list_lru dying before seeing the memcg dying? The
> >> > object might be leaked to the parent cgroup, which doesn't seem
> >> > too terrible though.
> >> >
> >> > I'm not sure how to reproduce this. I will keep looking.
> >>
> >> Managed to boot the image using the kernel config provided by the
> >> bot; so far local tests didn't trigger any issue. Is there any way I
> >> can reproduce what the bot actually did?
> >
> >If syzbot doesn't have a repro, it might not be productive for you to
> >try to find one. Personally, I would analyze stacktraces and double
> >check the code, and move on if I can't find something obviously wrong.
> >
> >> Or provide some patch for the bot
> >> to test?
> >
> >syzbot can only try patches after it finds a repro. So in this case,
> >no, it can't try your patches.
> >
> >Hope the above clarifies things for you.
>
> Chiming in here as LKFT seems to be able to hit a nearby warning on
> boot.
>
> The link below contains the full log as well as additional information
> on the run.
>
> https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.13-rc2-232-g4800575d8c0b/testrun/26323524/suite/log-parser-test/test/exception-warning-cpu-pid-at-mmlist_lruc-list_lru_del/details/

Thanks for the info, I'm trying to reproduce it and checking the code.

There were similar WARN_ONs some years ago; they were removed by commit
2788cf0c401c, allowing nr_items to become a wrong value, but as that
commit message mentioned, that should not be a problem.

I added them back because the new lock_list_lru_of_memcg should ensure a
stable list_lru, so they might help catch wrong usage. There could be
some corner cases or synchronization issues that these sanity checks
don't account for; I'm looking into it.
A bold fix would be to simply remove these WARN_ONs, since such wrong
values might not be harmful. I'll do more checks and tests locally and
report back.

> --
> Thanks,
> Sasha