On Thu, Jun 27, 2024 at 9:31 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Thu, Jun 27, 2024 at 07:03:21AM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    7c16f0a4ed1c Merge tag 'i2c-for-6.10-rc5' of git://git.ker..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1511528e980000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=12f98862a3c0c799
> > dashboard link: https://syzkaller.appspot.com/bug?extid=b7f13b2d0cc156edf61a
> > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/50560e9024e5/disk-7c16f0a4.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/080c27daee72/vmlinux-7c16f0a4.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/c528e0da4544/bzImage-7c16f0a4.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+b7f13b2d0cc156edf61a@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > BUG: sleeping function called from invalid context at kernel/cgroup/rstat.c:351
> > in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 17332, name: syz-executor.4
> > preempt_count: 0, expected: 0
> > RCU nest depth: 1, expected: 0
> > 1 lock held by syz-executor.4/17332:
> >  #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
> >  #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline]
> >  #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: filemap_cachestat mm/filemap.c:4251 [inline]
> >  #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: __do_sys_cachestat mm/filemap.c:4407 [inline]
> >  #0: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: __se_sys_cachestat+0x3ee/0xbb0 mm/filemap.c:4372
> > CPU: 1 PID: 17332 Comm: syz-executor.4 Not tainted 6.10.0-rc4-syzkaller-00330-g7c16f0a4ed1c #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
> > Call Trace:
> >  <TASK>
> >  __dump_stack lib/dump_stack.c:88 [inline]
> >  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> >  __might_resched+0x5d4/0x780 kernel/sched/core.c:10196
> >  cgroup_rstat_flush+0x1e/0x50 kernel/cgroup/rstat.c:351
> >  workingset_test_recent+0x48a/0xa90 mm/workingset.c:473
> >  filemap_cachestat mm/filemap.c:4314 [inline]
> >  __do_sys_cachestat mm/filemap.c:4407 [inline]
> >  __se_sys_cachestat+0x795/0xbb0 mm/filemap.c:4372
> >  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> >  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Ok yeah, cachestat() holds the rcu read lock, so
> workingset_test_recent() can't do a sleepable rstat flush.
>
> I think the easiest fix would be to flush rstat from the root down
> (NULL) in filemap_cachestat(), before the rcu section, and add a flag
> to workingset_test_recent() to forego it. Nhat?

You're right. I think it's been broken since this commit:

b00684722262 mm: workingset: move the stats flush into workingset_test_recent()

which moves the stats flushing from the refault step (before rcu read
lock section) to inside workingset_test_recent().

I believe that's 6.8, 6.9, and 6.10 we need to fix?

The fix sounds reasonable to me :) Let me whip up something real quick.
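
For reference, the shape of the proposed fix can be modelled in plain
userspace C. This is only a rough sketch and not the actual patch: every
model_* function and the 'flush' parameter are hypothetical stand-ins
(assumptions, not kernel APIs), the RCU read section is reduced to a
counter, and might_sleep() is reduced to a check of that counter. It only
illustrates the ordering: flush once from the root (NULL) before entering
the RCU section, then tell the per-entry check to skip its own flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for kernel state; all names here are hypothetical. */
int rcu_depth;        /* models "RCU nest depth" from the report */
int sleep_violations; /* counts would-be "sleeping function called from invalid context" */
int flush_count;      /* number of (modelled) rstat flushes performed */

void model_rcu_read_lock(void)   { rcu_depth++; }
void model_rcu_read_unlock(void) { rcu_depth--; }

/* Models might_sleep(): sleeping inside an RCU read section is a bug. */
void model_might_sleep(void)
{
	if (rcu_depth > 0)
		sleep_violations++;
}

/* Models the sleepable rstat flush; NULL stands for "from the root down". */
void model_rstat_flush(const void *cgrp)
{
	(void)cgrp;
	model_might_sleep();
	flush_count++;
}

/* Models workingset_test_recent() with an assumed 'flush' flag. */
bool model_workingset_test_recent(bool flush)
{
	if (flush)
		model_rstat_flush(NULL);
	return true; /* the real recency test is omitted in this model */
}

/* Models filemap_cachestat(): flush first, then walk entries under RCU. */
int model_cachestat(int nr_entries)
{
	int recent = 0;

	model_rstat_flush(NULL); /* sleepable, so done before the RCU section */

	model_rcu_read_lock();
	for (int i = 0; i < nr_entries; i++) {
		/* flush == false: must not sleep while holding rcu_read_lock */
		if (model_workingset_test_recent(false))
			recent++;
	}
	model_rcu_read_unlock();

	return recent;
}
```

In this model, model_cachestat() flushes exactly once and records no
violation, whereas calling model_workingset_test_recent(true) between
model_rcu_read_lock() and model_rcu_read_unlock() bumps sleep_violations,
which corresponds to the splat syzbot reported.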