On Fri, Sep 06, 2024 at 02:52:38PM +0800, Lai, Yi wrote:

Hi Yi,

>
> I used Syzkaller and found that there is task hang in soft_offline_page
> in Linux-next tree - next-20240902.

I don't know if it is related, but we had a fix for this commit, for an
LTP failure caused by locking issues, that is in next-20240905 but not
in next-20240902.

Fix: https://lore.kernel.org/linux-next/20240902124931.506061-2-kernel@xxxxxxxxxxxxxxxx/

Is this also reproducible on next-20240905?

>
> After bisection and the first bad commit is:
> "
> fd031210c9ce mm: split a folio in minimum folio order chunks
> "
>
> All detailed info can be found at:
> https://github.com/laifryiee/syzkaller_logs/tree/main/240904_155526_soft_offline_page
> Syzkaller repro code:
> https://github.com/laifryiee/syzkaller_logs/tree/main/240904_155526_soft_offline_page/repro.c
> Syzkaller repro syscall steps:
> https://github.com/laifryiee/syzkaller_logs/tree/main/240904_155526_soft_offline_page/repro.prog
> Syzkaller report:
> https://github.com/laifryiee/syzkaller_logs/tree/main/240904_155526_soft_offline_page/repro.report
> Kconfig(make olddefconfig):
> https://github.com/laifryiee/syzkaller_logs/tree/main/240904_155526_soft_offline_page/kconfig_origin
> Bisect info:
> https://github.com/laifryiee/syzkaller_logs/tree/main/240904_155526_soft_offline_page/bisect_info.log
> bzImage:
> https://github.com/laifryiee/syzkaller_logs/raw/f633dcbc3a8e4ca5f52f0110bc75ff17d9885db4/240904_155526_soft_offline_page/bzImage_ecc768a84f0b8e631986f9ade3118fa37852fef0
> Issue dmesg:
> https://github.com/laifryiee/syzkaller_logs/blob/main/240904_155526_soft_offline_page/ecc768a84f0b8e631986f9ade3118fa37852fef0_dmesg.log
>
> "
> [ 447.976688] ? __pfx_soft_offline_page.part.0+0x10/0x10
> [ 447.977255] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
> [ 447.977858] soft_offline_page+0x97/0xc0
> [ 447.978281] do_madvise.part.0+0x1a45/0x2a30
> [ 447.978742] ? __pfx___lock_acquire+0x10/0x10
> [ 447.979227] ? __pfx_do_madvise.part.0+0x10/0x10
> [ 447.979716] ? __this_cpu_preempt_check+0x21/0x30
> [ 447.980225] ? __this_cpu_preempt_check+0x21/0x30
> [ 447.980729] ? lock_release+0x441/0x870
> [ 447.981160] ? __this_cpu_preempt_check+0x21/0x30
> [ 447.981656] ? seqcount_lockdep_reader_access.constprop.0+0xb4/0xd0
> [ 447.982321] ? lockdep_hardirqs_on+0x89/0x110
> [ 447.982771] ? trace_hardirqs_on+0x51/0x60
> [ 447.983191] ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0
> [ 447.983819] ? __sanitizer_cov_trace_cmp4+0x1a/0x20
> [ 447.984282] ? ktime_get_coarse_real_ts64+0xbf/0xf0
> [ 447.984673] __x64_sys_madvise+0x139/0x180
> [ 447.984997] x64_sys_call+0x19a5/0x2140
> [ 447.985307] do_syscall_64+0x6d/0x140
> [ 447.985600] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 447.986011] RIP: 0033:0x7f782623ee5d
> [ 447.986248] RSP: 002b:00007fff9ddaffb8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
> [ 447.986709] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f782623ee5d
> [ 447.987147] RDX: 0000000000000065 RSI: 0000000000003000 RDI: 0000000020d51000
> [ 447.987584] RBP: 00007fff9ddaffc0 R08: 00007fff9ddafff0 R09: 00007fff9ddafff0
> [ 447.988022] R10: 00007fff9ddafff0 R11: 0000000000000217 R12: 00007fff9ddb0118
> [ 447.988428] R13: 0000000000401716 R14: 0000000000403e08 R15: 00007f782645d000
> [ 447.988799] </TASK>
> [ 447.988921]
> [ 447.988921] Showing all locks held in the system:
> [ 447.989237] 1 lock held by khungtaskd/33:
> [ 447.989447] #0: ffffffff8705c500 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x73/0x3c0
> [ 447.989947] 1 lock held by repro/628:
> [ 447.990144] #0: ffffffff87258a28 (mf_mutex){+.+.}-{3:3}, at: soft_offline_page.part.0+0xda/0xf40
> [ 447.990611]
> [ 447.990701] =============================================
>
> "
>
> I hope you find it useful.
>
> Regards,
> Yi Lai
>