On 27/10/2024 17:26, Kairui Song wrote:
> Hi Usama,
>
>>
>> Hi Kairui,
>>
>> I was testing zswap writeback in mm-unstable, and I think this patch might be breaking things.
>>
>> I have added the panic below:
>>
>> [ 130.051024] ------------[ cut here ]------------
>> [ 130.051489] kernel BUG at mm/list_lru.c:321!
>> [ 130.051732] Oops: invalid opcode: 0000 [#1] SMP
>> [ 130.052133] CPU: 1 UID: 0 PID: 4976 Comm: cc1 Not tainted 6.12.0-rc1-00084-g278bd01cdaf1 #276
>> [ 130.052595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.el9 04/01/2014
>> [ 130.053276] RIP: 0010:__list_lru_walk_one+0x1ae/0x1b0
>> [ 130.053983] Code: 7c 24 78 00 74 03 fb eb 00 48 89 d8 48 83 c4 40 5b 41 5c 41 5d 41 5e 41 5f 5d c3 41 c6 07 00 eb e8 41 c6 07 00 fb eb e1 0f 0b <0f> 0b 0f 1f 44 00 00 6a 01 e8 44 fe ff ff 48 83 c4 08 c3 66 2e 0f
>> [ 130.055557] RSP: 0000:ffffc90004a2b9a0 EFLAGS: 00010246
>> [ 130.056084] RAX: ffff88805dedf6e8 RBX: 0000000000000071 RCX: 0000000000000005
>> [ 130.057407] RDX: 0000000000000000 RSI: 0000000000000022 RDI: ffff888008a26400
>> [ 130.057794] RBP: ffff88805dedf6d0 R08: 0000000000000402 R09: 0000000000000001
>> [ 130.058579] R10: ffffc90004a2b7e8 R11: 0000000000000000 R12: ffffffff81342930
>> [ 130.058962] R13: ffff888017532ca0 R14: ffffc90004a2bae8 R15: ffff8880175322c8
>> [ 130.059773] FS: 00007ff3f1e21f00(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
>> [ 130.060242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 130.060563] CR2: 00007f428e2e2ed8 CR3: 0000000067db6001 CR4: 0000000000770ef0
>> [ 130.060952] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 130.061658] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 130.062425] PKRU: 55555554
>> [ 130.062578] Call Trace:
>> [ 130.062720] <TASK>
>> [ 130.062941] ? __die_body+0x66/0xb0
>> [ 130.063145] ? die+0x88/0xb0
>> [ 130.063309] ? do_trap+0x9d/0x170
>> [ 130.063499] ? __list_lru_walk_one+0x1ae/0x1b0
>> [ 130.063745] ? __list_lru_walk_one+0x1ae/0x1b0
>> [ 130.063995] ? handle_invalid_op+0x65/0x80
>> [ 130.064223] ? __list_lru_walk_one+0x1ae/0x1b0
>> [ 130.064467] ? exc_invalid_op+0x2f/0x40
>> [ 130.064681] ? asm_exc_invalid_op+0x16/0x20
>> [ 130.064912] ? zswap_shrinker_count+0x1c0/0x1c0
>> [ 130.065172] ? __list_lru_walk_one+0x1ae/0x1b0
>> [ 130.065417] list_lru_walk_one+0xc/0x20
>> [ 130.065630] zswap_shrinker_scan+0x4b/0x80
>> [ 130.065856] do_shrink_slab+0x15f/0x2f0
>> [ 130.066075] shrink_slab+0x2bf/0x3d0
>> [ 130.066276] shrink_node+0x4f0/0x8a0
>> [ 130.066477] do_try_to_free_pages+0x131/0x4d0
>> [ 130.066717] try_to_free_mem_cgroup_pages+0x143/0x220
>> [ 130.067000] try_charge_memcg+0x22a/0x610
>> [ 130.067224] __mem_cgroup_charge+0x74/0x100
>> [ 130.068060] do_pte_missing+0xaa8/0x1020
>> [ 130.068280] handle_mm_fault+0x75d/0x1120
>> [ 130.068502] do_user_addr_fault+0x1c2/0x6f0
>> [ 130.068802] exc_page_fault+0x4f/0xb0
>> [ 130.069014] asm_exc_page_fault+0x22/0x30
>> [ 130.069240] RIP: 0033:0x7ff3f19ede49
>> [ 130.069441] Code: c9 62 e1 7f 29 7f 00 c3 66 0f 1f 84 00 00 00 00 00 40 0f b6 c6 48 89 d1 48 89 fa f3 aa 48 89 d0 c3 48 3b 15 c9 a3 06 00 77 e7 <62> e1 fe 28 7f 07 62 e1 fe 28 7f 47 01 48 81 fa 80 00 00 00 76 89
>> [ 130.070477] RSP: 002b:00007ffc5c818078 EFLAGS: 00010283
>> [ 130.070830] RAX: 00007ff3efac9000 RBX: 00007ff3f02d1940 RCX: 0000000000000001
>> [ 130.071522] RDX: 00000000000005a8 RSI: 0000000000000000 RDI: 00007ff3efac9000
>> [ 130.072146] RBP: 00007ffc5c8180c0 R08: 0000000003007320 R09: 0000000000000007
>> [ 130.072594] R10: 0000000003007320 R11: 0000000000000012 R12: 00007ff3f1f0e000
>> [ 130.072981] R13: 000000007ffa1e74 R14: 00000000000005a8 R15: 00000000000000b5
>> [ 130.073369] </TASK>
>> [ 130.073496] Modules linked in:
>> [ 130.073701] ---[ end trace 0000000000000000 ]---
>> [ 130.073960] RIP: 0010:__list_lru_walk_one+0x1ae/0x1b0
>> [ 130.074319] Code: 7c 24 78 00 74 03 fb eb 00 48 89 d8 48 83 c4 40 5b 41 5c 41 5d 41 5e 41 5f 5d c3 41 c6 07 00 eb e8 41 c6 07 00 fb eb e1 0f 0b <0f> 0b 0f 1f 44 00 00 6a 01 e8 44 fe ff ff 48 83 c4 08 c3 66 2e 0f
>> [ 130.075564] RSP: 0000:ffffc90004a2b9a0 EFLAGS: 00010246
>> [ 130.075897] RAX: ffff88805dedf6e8 RBX: 0000000000000071 RCX: 0000000000000005
>> [ 130.076342] RDX: 0000000000000000 RSI: 0000000000000022 RDI: ffff888008a26400
>> [ 130.076739] RBP: ffff88805dedf6d0 R08: 0000000000000402 R09: 0000000000000001
>> [ 130.077192] R10: ffffc90004a2b7e8 R11: 0000000000000000 R12: ffffffff81342930
>> [ 130.077739] R13: ffff888017532ca0 R14: ffffc90004a2bae8 R15: ffff8880175322c8
>> [ 130.078149] FS: 00007ff3f1e21f00(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
>> [ 130.078764] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 130.079095] CR2: 00007f428e2e2ed8 CR3: 0000000067db6001 CR4: 0000000000770ef0
>> [ 130.079521] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 130.080009] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 130.080402] PKRU: 55555554
>> [ 130.080713] Kernel panic - not syncing: Fatal exception
>> [ 130.081198] Kernel Offset: disabled
>> [ 130.081396] ---[ end Kernel panic - not syncing: Fatal exception ]---
>>
>> Thanks,
>> Usama
>>
>
> Thanks for the report. I converted the list_lru_walk callback to keep the
> list unlocked when LRU_RETRY or LRU_REMOVED_RETRY is returned, but
> didn't notice that shrink_memcg_cb in zswap.c could return LRU_STOP after
> it unlocked the list.
>
> The fix should be simple. Is it easy to reproduce? Can you help verify?
>
> diff --git a/mm/list_lru.c b/mm/list_lru.c
> index 79c2d21504a2..1a3caf4c4e14 100644
> --- a/mm/list_lru.c
> +++ b/mm/list_lru.c
> @@ -298,9 +298,9 @@ __list_lru_walk_one(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
>  		ret = isolate(item, l, cb_arg);
>  		switch (ret) {
>  		/*
> -		 * LRU_RETRY and LRU_REMOVED_RETRY will drop the lru lock,
> -		 * the list traversal will be invalid and have to restart from
> -		 * scratch.
> +		 * LRU_RETRY, LRU_REMOVED_RETRY and LRU_STOP will drop the lru
> +		 * lock, the list traversal will be invalid and have to restart
> +		 * from scratch.
>  		 */
>  		case LRU_RETRY:
>  			goto restart;
> @@ -318,14 +318,13 @@ __list_lru_walk_one(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
>  		case LRU_SKIP:
>  			break;
>  		case LRU_STOP:
> -			assert_spin_locked(&l->lock);
>  			goto out;
>  		default:
>  			BUG();
>  		}
>  	}
> -out:
>  	unlock_list_lru(l, irq_off);
> +out:
>  	return isolated;
>  }

Hi Kairui,

With this fix there are no more crashes. Thanks for the quick fix.

Just FYI, to test it, enable zswap and the zswap shrinker
(echo Y > /sys/module/zswap/parameters/shrinker_enabled) and build the
kernel in a memory-constrained environment (memory.max set to 1G).

Thanks,
Usama
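
For anyone following the thread, here is a small userspace sketch of the locking contract the patch above settles on. It is only a model under stated assumptions: struct model_lru, model_lock(), model_unlock(), model_walk() and stop_after() are hypothetical names made up for illustration, not kernel APIs, and a plain bool stands in for the spinlock. It assumes, as the patched comment says, that a callback returning LRU_STOP has already dropped the lock, so the walker jumps past its own unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these names mirror, but are not, the kernel API. */
enum lru_status { LRU_REMOVED, LRU_SKIP, LRU_STOP };

struct model_lru {
	bool locked;      /* stands in for the per-list spinlock */
	size_t nr_items;  /* number of items the walker visits */
};

static void model_lock(struct model_lru *l)
{
	assert(!l->locked);   /* catches a double lock */
	l->locked = true;
}

static void model_unlock(struct model_lru *l)
{
	assert(l->locked);    /* catches a double unlock */
	l->locked = false;
}

typedef enum lru_status (*isolate_cb)(struct model_lru *l, size_t idx, void *arg);

/*
 * Walker with the "out:" label placed after the unlock, as in the patch:
 * on LRU_STOP the callback is assumed to have dropped the lock already.
 */
static size_t model_walk(struct model_lru *l, isolate_cb isolate, void *arg)
{
	size_t isolated = 0;

	model_lock(l);
	for (size_t i = 0; i < l->nr_items; i++) {
		switch (isolate(l, i, arg)) {
		case LRU_REMOVED:
			isolated++;
			break;
		case LRU_SKIP:
			break;
		case LRU_STOP:
			goto out;   /* lock already dropped by the callback */
		}
	}
	model_unlock(l);
out:
	return isolated;
}

/*
 * Hypothetical callback: reports items as removed until the index reaches
 * *arg, then drops the lock itself and asks the walker to stop.
 */
static enum lru_status stop_after(struct model_lru *l, size_t idx, void *arg)
{
	size_t limit = *(size_t *)arg;

	if (idx < limit)
		return LRU_REMOVED;
	model_unlock(l);
	return LRU_STOP;
}
```

Calling model_walk(&l, stop_after, &limit) with five items and limit = 2 returns 2 and leaves the model lock released. With the label before the unlock, the assert in model_unlock() would fire on the same input, which is roughly the userspace analogue of the BUG in the report.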