-----Original Message-----
From: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Sent: Friday, October 21, 2022 3:32 PM
To: Pulavarty, Badari <badari.pulavarty@xxxxxxxxx>
Cc: david@xxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; bfoster@xxxxxxxxxx; huangzhaoyang@xxxxxxxxx; ke.wang@xxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx; inux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; zhaoyang.huang@xxxxxxxxxx; Shutemov, Kirill <kirill.shutemov@xxxxxxxxx>; Tang, Feng <feng.tang@xxxxxxxxx>; Huang, Ying <ying.huang@xxxxxxxxx>; Yin, Fengwei <fengwei.yin@xxxxxxxxx>; Hansen, Dave <dave.hansen@xxxxxxxxx>; Zanussi, Tom <tom.zanussi@xxxxxxxxx>
Subject: Re: [RFC PATCH] mm: move xa forward when run across zombie page

On Fri, Oct 21, 2022 at 09:37:36PM +0000, Pulavarty, Badari wrote:
> I have been tracking similar issue(s) with soft lockup or panics on my system consistently with my workload.
> Tried multiple kernel versions. Issue seem to happen consistently on
> 6.1-rc1 (while it seem to happen on 5.17, 5.19, 6.0.X)
>
> PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
>
>     RIP: 0000000000000001  RSP: ff3d8e7f0d9978ea  RFLAGS: ff3d8e7f0d9978e8
>     RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
>     RDX: 000000006b9c66f1  RSI: ff506ca15ff33c20  RDI: 0000000000000000
>     RBP: ffffffff84bc64cc   R8: ff3d8e412cabdff0   R9: ffffffff84c00e8b
>     R10: ff506ca15ff33b69  R11: 0000000000000000  R12: ff506ca15ff33b58
>     R13: ffffffff84bc79a3  R14: ff506ca15ff33b38  R15: 0000000000000000
>     ORIG_RAX: ff506ca15ff33a80  CS: ff506ca15ff33c78  SS: 0000
>  #9 [ff506ca15ff33c18] xas_load at ffffffff84b49a7f
> #10 [ff506ca15ff33c28] __filemap_get_folio at ffffffff840985da
> #11 [ff506ca15ff33ce8] swap_cache_get_folio at ffffffff841119db

Oh, this is interesting. It's the swapper address_space. I bet that
0xffffffff85044560 (the value of a_ops) is the address of swap_ops in
your kernel? I don't know if it will help, but it's an interesting
data point.

> Looking at the crash dump, mapping->host became NULL. Not sure what exactly is happening.

That's always true for the swapper_spaces, AIUI.

> a_ops = 0xffffffff85044560,

Correct. It's swap_ops. (I am using zswap).

In my scenario - I run the workload in a container and use DAMON or PSI to squeeze the cold pages out to zswap.

Thanks,
Badari