RE: [RFC PATCH] mm: move xa forward when run across zombie page

-----Original Message-----
From: Matthew Wilcox <willy@xxxxxxxxxxxxx> 
Sent: Friday, October 21, 2022 3:32 PM
To: Pulavarty, Badari <badari.pulavarty@xxxxxxxxx>
Cc: david@xxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; bfoster@xxxxxxxxxx; huangzhaoyang@xxxxxxxxx; ke.wang@xxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx; inux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; zhaoyang.huang@xxxxxxxxxx; Shutemov, Kirill <kirill.shutemov@xxxxxxxxx>; Tang, Feng <feng.tang@xxxxxxxxx>; Huang, Ying <ying.huang@xxxxxxxxx>; Yin, Fengwei <fengwei.yin@xxxxxxxxx>; Hansen, Dave <dave.hansen@xxxxxxxxx>; Zanussi, Tom <tom.zanussi@xxxxxxxxx>
Subject: Re: [RFC PATCH] mm: move xa forward when run across zombie page

On Fri, Oct 21, 2022 at 09:37:36PM +0000, Pulavarty, Badari wrote:
> I have been tracking similar issue(s) with soft lockups or panics on my system consistently with my workload.
> Tried multiple kernel versions. The issue seems to happen consistently on
> 6.1-rc1 (while it seems to happen on 5.17, 5.19, 6.0.X)
> 
> PANIC: "Kernel panic - not syncing: softlockup: hung tasks"
> 
>     RIP: 0000000000000001  RSP: ff3d8e7f0d9978ea  RFLAGS: ff3d8e7f0d9978e8
>     RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
>     RDX: 000000006b9c66f1  RSI: ff506ca15ff33c20  RDI: 0000000000000000
>     RBP: ffffffff84bc64cc   R8: ff3d8e412cabdff0   R9: ffffffff84c00e8b
>     R10: ff506ca15ff33b69  R11: 0000000000000000  R12: ff506ca15ff33b58
>     R13: ffffffff84bc79a3  R14: ff506ca15ff33b38  R15: 0000000000000000
>     ORIG_RAX: ff506ca15ff33a80  CS: ff506ca15ff33c78  SS: 0000
> #9 [ff506ca15ff33c18] xas_load at ffffffff84b49a7f
> #10 [ff506ca15ff33c28] __filemap_get_folio at ffffffff840985da
> #11 [ff506ca15ff33ce8] swap_cache_get_folio at ffffffff841119db

Oh, this is interesting.  It's the swapper address_space.
I bet that 0xffffffff85044560 (the value of a_ops) is the address of swap_ops in your kernel?

I don't know if it will help, but it's an interesting data point.

> Looking at the crash dump, mapping->host became NULL. Not sure what exactly is happening.

That's always true for the swapper_spaces, AIUI.

>   a_ops = 0xffffffff85044560,

Correct. It's swap_ops. (I am using zswap.)

In my scenario, I run the workload in a container and use DAMON or PSI to squeeze the cold pages out to zswap.

Thanks,
Badari





