On 2024/2/4 09:28, Nhat Pham wrote:
> On Sat, Feb 3, 2024 at 12:37 PM syzbot
> <syzbot+17a611d10af7d18a7092@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=174537bbe80000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b168fa511db3ca08
>> dashboard link: https://syzkaller.appspot.com/bug?extid=17a611d10af7d18a7092
>> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>> userspace arch: i386
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-861c0981.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/b2b204c7b4a0/vmlinux-861c0981.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/170ec316e557/bzImage-861c0981.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+17a611d10af7d18a7092@xxxxxxxxxxxxxxxxxxxxxxxxx
>>
>>  kcov_ioctl+0x4f/0x720 kernel/kcov.c:704
>>  __do_compat_sys_ioctl+0x2bf/0x330 fs/ioctl.c:971
>>  do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
>>  __do_fast_syscall_32+0x79/0x110 arch/x86/entry/common.c:321
>> page has been migrated, last migrate reason: compaction
>> ------------[ cut here ]------------
>> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 folio_lruvec include/linux/memcontrol.h:775 [inline]
>> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
>> Modules linked in:
>> CPU: 2 PID: 5104 Comm: syz-fuzzer Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
>> RIP: 0010:folio_lruvec include/linux/memcontrol.h:775 [inline]
>
> Hmm looks like it's this line:
> VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled(), folio);
>
> Looks like memcg was cleared from the folio. Haven't looked too
> closely yet, but this (and the "page has been migrated" line above)
> suggests maybe there is some migration business going on -
> mem_cgroup_migrate() clears the old folio's memcg_data (via
> old->memcg_data = 0).

Yeah, I think it's this case.

>
> Here's my theory (which could be wrong - someone please fact-check
> me): swap_read_folio(), which precedes zswap_folio_swapin(), unlocks

There is another case: with !page_allocated, the returned folio is
unlocked as well, right?

> the folio. Could this be sufficient to allow for migration? If this is

IMHO, holding the folio lock is sufficient to avoid concurrent memcg
migration.

> the case, all we need to do is move this to above swap_read_folio(),
> while the folio is still locked. __read_swap_cache_async() already
> charges the folio to an memcg, so no need to wait till after
> swap_read_page() anyway.

Should we call zswap_folio_swapin() in the !page_allocated case at all?

Thanks.