Re: [PATCH 2/2] mm/vmscan: don't try to reclaim hwpoison folio

Miaohe Lin <linmiaohe@xxxxxxxxxx> · Thu, 20 Mar 2025 10:50:56 +0800

On 2025/3/18 16:39, Jinjiang Tu wrote:
> Syzkaller reports a bug as follows:

Thanks for your fix.

> 
> Injecting memory failure for pfn 0x18b00e at process virtual address 0x20ffd000
> Memory failure: 0x18b00e: dirty swapcache page still referenced by 2 users
> Memory failure: 0x18b00e: recovery action for dirty swapcache page: Failed
> page: refcount:2 mapcount:0 mapping:0000000000000000 index:0x20ffd pfn:0x18b00e
> memcg:ffff0000dd6d9000
> anon flags: 0x5ffffe00482011(locked|dirty|arch_1|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0xfffff)
> raw: 005ffffe00482011 dead000000000100 dead000000000122 ffff0000e232a7c9
> raw: 0000000000020ffd 0000000000000000 00000002ffffffff ffff0000dd6d9000
> page dumped because: VM_BUG_ON_FOLIO(!folio_test_uptodate(folio))
> ------------[ cut here ]------------
> kernel BUG at mm/swap_state.c:184!
> Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 60 Comm: kswapd0 Not tainted 6.6.0-gcb097e7de84e #3
> Hardware name: linux,dummy-virt (DT)
> pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : add_to_swap+0xbc/0x158
> lr : add_to_swap+0xbc/0x158
> sp : ffff800087f37340
> x29: ffff800087f37340 x28: fffffc00052c0380 x27: ffff800087f37780
> x26: ffff800087f37490 x25: ffff800087f37c78 x24: ffff800087f377a0
> x23: ffff800087f37c50 x22: 0000000000000000 x21: fffffc00052c03b4
> x20: 0000000000000000 x19: fffffc00052c0380 x18: 0000000000000000
> x17: 296f696c6f662865 x16: 7461646f7470755f x15: 747365745f6f696c
> x14: 6f6621284f494c4f x13: 0000000000000001 x12: ffff600036d8b97b
> x11: 1fffe00036d8b97a x10: ffff600036d8b97a x9 : dfff800000000000
> x8 : 00009fffc9274686 x7 : ffff0001b6c5cbd3 x6 : 0000000000000001
> x5 : ffff0000c25896c0 x4 : 0000000000000000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : ffff0000c25896c0 x0 : 0000000000000000
> Call trace:
>  add_to_swap+0xbc/0x158
>  shrink_folio_list+0x12ac/0x2648
>  shrink_inactive_list+0x318/0x948
>  shrink_lruvec+0x450/0x720
>  shrink_node_memcgs+0x280/0x4a8
>  shrink_node+0x128/0x978
>  balance_pgdat+0x4f0/0xb20
>  kswapd+0x228/0x438
>  kthread+0x214/0x230
>  ret_from_fork+0x10/0x20
> 

There are too many races in memory_failure to handle...

> I can reproduce this issue with the following steps:
> 1) When a dirty swapcache page is isolated by the reclaim process and the
> page isn't locked, inject memory failure for the page. me_swapcache_dirty()
> clears the uptodate flag and tries to delete the page from the lru, but
> fails. The reclaim process will put the hwpoisoned page back to the lru.

Is the hwpoisoned page put back to the lru list because memory_failure() holds the extra page refcnt?

> 2) The process that maps the hwpoisoned page exits, the page is deleted,
> but the page will never be freed and will be in the lru forever.

Again, memory_failure() holds the extra page refcnt, so the page is never freed.

> 3) If we trigger a reclaim again and it tries to reclaim the page,
> add_to_swap() will trigger VM_BUG_ON_FOLIO because the uptodate flag is
> cleared.
> 
> To fix it, skip the hwpoisoned page in shrink_folio_list(). Besides, the
> hwpoison folio may not have been unmapped by hwpoison_user_mappings() yet,
> so unmap it in shrink_folio_list(); otherwise hwpoison_user_mappings() will
> fail to unmap the folio, since the folio isn't in the lru list.
> 
> Signed-off-by: Jinjiang Tu <tujinjiang@xxxxxxxxxx>

Acked-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>

Thanks.