xialonglong <xialonglong1@xxxxxxxxxx> writes: > Thank you very much for your reply :) > Inspired by your replywe successfully reproduced the bug. > > The test steps: > 1.swapon /dev/zram0 > 2.add some memory pressure by stress-ng > 3.calling swapoff /dev/zram0 in the do_swap_page function (this > changed the source code) > 4.bug occured in the same place. > > After testing, this patch solves the bug. > Finally, there is a small question. Why linux5.10 revert this patch > (2799e77529c2)? 2799e77529c2 is merged by v5.14. Best Regards, Huang, Ying > We found that to fix this bug, the following patches may be required: > efa33fc7f6e mm/shmem: fix shmem_swapin() race with swapoff > 5c046235a826 mm/swap: remove confusing checking for non_swap_entry() > in swap_ra_info() > 2799e77529c2 swap: fix do_swap_page() race with swapoff > 63d8620ecf93 mm/swapfile: use percpu_ref to serialize against > concurrent swapoff > seem like all this patchset is needed except commit 5c046235a826 > ("mm/swap: remove confusing checking for non_swap_entry() in > swap_ra_info()") > > Best Regards, > Xia, longlong > > ( 2022/11/28 9:08, Huang, Ying S: >> Hi, >> >> xialonglong <xialonglong1@xxxxxxxxxx> writes: >> >>> A panic occur in the linux 5.10we meet it only onceit seems that >>> there is no special changes between 5.10 and upsteam about swap_cgroup. >>> >>> The test is based on QEMU with 64GB memory, one 2GB zram device as >>> swap area. >>> The test steps: >>> 1.swapoff -a >>> 2.add some memory pressure by stress-ng >>> 3.while (2 minutes) { >>> swapoff /dev/zram0 >>> swapon /dev/zram0 >>> sleep 3 >>> } >>> 4. swapon -a >>> >>> Preliminary analysis showed that the swap entry point to a swap area >>> which have already been swapoff, and no other obvious clues, still >>> trying to reproduce it. >> We have a patch as follows to fix a similar issue, >> >> 2799e77529c2a25492a4395db93996e3dacd762d >> Author: Miaohe Lin <linmiaohe@xxxxxxxxxx> >> AuthorDate: Mon Jun 28 19:36:50 2021 -0700 >> Commit: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> >> CommitDate: Tue Jun 29 10:53:49 2021 -0700 >> >> swap: fix do_swap_page() race with swapoff >> >> When I was investigating the swap code, I found the below possible race >> window: >> >> CPU 1 CPU 2 >> ----- ----- >> do_swap_page >> if (data_race(si->flags & SWP_SYNCHRONOUS_IO) >> swap_readpage >> if (data_race(sis->flags & SWP_FS_OPS)) { >> swapoff >> .. >> p->swap_file = NULL; >> .. >> struct file *swap_file = sis->swap_file; >> struct address_space *mapping = swap_file->f_mapping;[oops!] >> >> Note that for the pages that are swapped in through swap cache, this isn't >> an issue. Because the page is locked, and the swap entry will be marked >> with SWAP_HAS_CACHE, so swapoff() can not proceed until the page has been >> unlocked. >> >> Fix this race by using get/put_swap_device() to guard against concurrent >> swapoff. >> >> Can you check whether that can fix your issue? >> >> Best Regards, >> Huang, Ying >> >>> Any known issue about this feature, or any advise will be appreciated. >>> >>> Here are the panic log, >>> >>> Unable to handle kernel NULL pointer dereference at virtual address >>> 0000000000000740 >>> Mem abort info: >>> ESR = 0x96000004 >>> EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, >>> S1PTW = 0 Data abort info: >>> ISV = 0, ISS = 0x00000004 >>> CM = 0, WnR = 0 >>> user pgtable: 4k pages, 48-bit VAs, pgdp=000000010ae6e000 >>> pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: >>> 96000004 [#1] SMP Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 >>> 02/06/2015 >>> pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--) >>> pc : lookup_swap_cgroup_id+0x38/0x50 >>> lr : mem_cgroup_charge+0x9c/0x424 >>> sp : ffff800102f63bc0 >>> x29: ffff800102f63bc0 x28: ffff0000d0d64d00 >>> x27: 0000000000000000 x26: 0000000000000007 >>> x25: ffff0000018c86a8 x24: ffff0000018c8640 >>> x23: 0000000000000cc0 x22: 0000000000000001 >>> x21: 0000000000000001 x20: ffff800102f63d28 >>> x19: fffffe000373cb40 x18: 0000000000000000 >>> x17: 0000000000000000 x16: ffff8001004715a4 >>> x15: 00000000ffffffff x14: 0000000000003000 >>> x13: 00000000ffffffff x12: 0000000000000040 >>> x11: ffff0000c0403478 x10: ffff0000c040347a >>> x9 : ffff8001003e957c x8 : 000000000009dddd >>> x7 : 0000000000000600 x6 : 00000000000000e8 >>> x5 : 0000020000200000 x4 : ffff000000000000 >>> x3 : ffff800101f4c030 x2 : 0000000000000000 >>> x1 : 00000000000001e4 x0 : 0000000000000000 >>> >>> Call trace: >>> lookup_swap_cgroup_id+0x38/0x50 >>> do_swap_page+0xa64/0xc04 >>> handle_pte_fault+0x1c8/0x214 >>> __handle_mm_fault+0x1b0/0x380 >>> handle_mm_fault+0xf4/0x284 >>> do_page_fault+0x188/0x474 >>> do_translation_fault+0xb8/0xe4 >>> do_mem_abort+0x48/0xb0 >>> el0_da+0x44/0x80 >>> el0_sync_handler+0x88/0xb4 >>> el0_sync+0x160/0x180 >>> >>> <lookup_swap_cgroup_id>:?????? mov?? x9, x30 >>> <lookup_swap_cgroup_id+0x4>:???? nop >>> <lookup_swap_cgroup_id+0x8>:???? >>> lsr?? x2, x0, #58 SWP_TYPE_SHIFT == 58? x2 = >>> swp_type >>> <lookup_swap_cgroup_id+0xc>:???? >>> adrp? x1, 0xffff800101f4c000 >>> <memcg_sockets_enabled_key+0x8> >>> <lookup_swap_cgroup_id+0x10>:??? >>> add?? x3, x1, #0x30???? >>> x3 == swap_cgroup_ctrl >>> <lookup_swap_cgroup_id+0x14>:??? ubfx? x6, x0, #11, #47 >>> <lookup_swap_cgroup_id+0x18>:??? add?? x2, x2, x2, lsl #1 >>> <lookup_swap_cgroup_id+0x1c>:??? ubfiz? x1, x0, #1, #11 >>> <lookup_swap_cgroup_id+0x20>:??? >>> mov?? x5, >>> #0x200000????????? >>> // #2097152 >>> <lookup_swap_cgroup_id+0x24>:??? >>> mov?? x4, >>> #0xffff000000000000???? // >>> #-281474976710656 >>> <lookup_swap_cgroup_id+0x28>:??? movk? x5, #0x200, lsl #32 >>> <lookup_swap_cgroup_id+0x2c>:??? hint? #0x19 >>> <lookup_swap_cgroup_id+0x30>:??? >>> ldr?? x0, [x3,x2,lsl #3] x3=ffff800101f4c030, x0 = 0 >>> <lookup_swap_cgroup_id+0x34>:??? hint? #0x1d >>> <lookup_swap_cgroup_id+0x38>:??? >>> ldr?? x0, [x0,x6,lsl #3] x0 = 0 + 0xe8 * 8 == 0x740 >>> <lookup_swap_cgroup_id+0x3c>:??? add?? x0, x0, x5 >>> <lookup_swap_cgroup_id+0x40>:??? lsr?? x0, x0, #6 >>> <lookup_swap_cgroup_id+0x44>:??? add?? x0, x1, x0, lsl #12 >>> <lookup_swap_cgroup_id+0x48>:??? ldrh? w0, [x0,x4] >>> <lookup_swap_cgroup_id+0x4c>:??? ret