On Sat, 10 Jul 2021 09:33:26 +0200 Igor Raits wrote:
>Hello,
>
>I've seen one weird bug on 5.12.14 that happened a couple of times when I
>started a bunch of VMs on a server.

Thanks for your report.
>
>I've briefly googled this problem but could not find any relevant commit
>that would fix this issue.

I am not sure this is the first report; syzbot sent a similar one [0].

[0] https://lore.kernel.org/linux-mm/00000000000045ff9505c1cfc9ae@xxxxxxxxxx/
>
>Do you have any hint how to debug this further or know the fix by any
>chance?

This report has more info about the BUG: in pmd_migration_entry_wait() the
huge migration entry (hme) is checked under the page table lock (ptl). On
the updater side, the hme should likewise be set and removed with the ptl
held; see the diff below.
>
>Thanks in advance. Stack trace following:
>
>[ 376.876610] ------------[ cut here ]------------
>[ 376.881274] kernel BUG at include/linux/swapops.h:204!
>[ 376.886455] invalid opcode: 0000 [#1] SMP NOPTI
>[ 376.891014] CPU: 40 PID: 11775 Comm: rpc-worker Tainted: G E
> 5.12.14-1.gdc.el8.x86_64 #1
>[ 376.900464] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380
>Gen10, BIOS U30 05/24/2021
>[ 376.909038] RIP: 0010:pmd_migration_entry_wait+0x132/0x140
>[ 376.914562] Code: 02 00 00 00 5b 4c 89 c7 5d e9 8a e4 f6 ff 48 81 e2 00
>f0 ff ff 48 f7 d2 48 21 c2 89 d1 f7 c2 81 01 00 00 75 80 e9 44 ff ff ff
><0f> 0b 48 8b 2d 75 bd 30 01 e9 ef fe ff ff 0f 1f 44 00 00 41 55 48
>[ 376.933443] RSP: 0000:ffffb65a5e1cfdc8 EFLAGS: 00010246
>[ 376.938701] RAX: 0017ffffc0000000 RBX: ffff908b8ecabaf8 RCX:
>ffffffffffffffff
>[ 376.945878] RDX: 0000000000000000 RSI: ffff908b8ecabaf8 RDI:
>fffff497473b2ae8
>[ 376.953055] RBP: fffff497473b2ae8 R08: fffff49747fa8080 R09:
>0000000000000000
>[ 376.960230] R10: 0000000000000000 R11: 0000000000000000 R12:
>0000000000000af8
>[ 376.967407] R13: 0400000000000000 R14: 0400000000000080 R15:
>ffff908bbef7b6a8
>[ 376.974582] FS: 00007f5bb1f81700(0000) GS:ffff90e87fd80000(0000)
>knlGS:0000000000000000
>[ 376.982718] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>[ 376.988497] CR2: 00007f5b2bfffd98 CR3: 00000001f793e006 CR4:
>00000000007726e0
>[ 376.995673] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>0000000000000000
>[ 377.002849] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>0000000000000400
>[ 377.010026] PKRU: 55555554
>[ 377.012745] Call Trace:
>[ 377.015207] __handle_mm_fault+0x5ad/0x6e0
>[ 377.019335] handle_mm_fault+0xc5/0x290
>[ 377.023194] do_user_addr_fault+0x1cd/0x740
>[ 377.027406] exc_page_fault+0x54/0x110
>[ 377.031182] ? asm_exc_page_fault+0x8/0x30
>[ 377.035307] asm_exc_page_fault+0x1e/0x30

+++ x/mm/huge_memory.c
@@ -2983,6 +2983,7 @@ void set_pmd_migration_entry(struct page
 	struct vm_area_struct *vma = pvmw->vma;
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long address = pvmw->address;
+	spinlock_t *ptl;
 	pmd_t pmdval;
 	swp_entry_t entry;
 	pmd_t pmdswp;
@@ -2998,7 +2999,9 @@ void set_pmd_migration_entry(struct page
 	pmdswp = swp_entry_to_pmd(entry);
 	if (pmd_soft_dirty(pmdval))
 		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
+	ptl = pmd_lock(mm, pvmw->pmd);
 	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
+	spin_unlock(ptl);
 	page_remove_rmap(page, true);
 	put_page(page);
 }
@@ -3009,6 +3012,7 @@ void remove_migration_pmd(struct page_vm
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long address = pvmw->address;
 	unsigned long mmun_start = address & HPAGE_PMD_MASK;
+	spinlock_t *ptl;
 	pmd_t pmde;
 	swp_entry_t entry;
 
@@ -3028,7 +3032,9 @@ void remove_migration_pmd(struct page_vm
 		page_add_anon_rmap(new, vma, mmun_start, true);
 	else
 		page_add_file_rmap(new, true);
+	ptl = pmd_lock(mm, pvmw->pmd);
 	set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);
+	spin_unlock(ptl);
 	if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new))
 		mlock_vma_page(new);
 	update_mmu_cache_pmd(vma, address, pvmw->pmd);
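
For illustration only, the locking rule the diff relies on (the reader checks the entry under the lock, so the updater must set and remove it under the same lock) can be sketched in userspace C. This is an analogy and not kernel code: struct slot, its fields, and the slot_*() helpers are all hypothetical names, and a C11 atomic_flag spinlock stands in for the ptl.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PMD slot; not a kernel type. */
enum slot_state { SLOT_PRESENT, SLOT_MIGRATING };

struct slot {
	atomic_flag lock;       /* plays the role of the page table lock */
	enum slot_state state;  /* plays the role of the PMD value */
	bool page_locked;       /* invariant: true whenever state is MIGRATING */
};

static void slot_lock(struct slot *s)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;
}

static void slot_unlock(struct slot *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

void slot_init(struct slot *s)
{
	atomic_flag_clear(&s->lock);
	s->state = SLOT_PRESENT;
	s->page_locked = false;
}

/* Updater side: install the migration marker with the lock held. */
void slot_set_migrating(struct slot *s)
{
	slot_lock(s);
	s->page_locked = true;
	s->state = SLOT_MIGRATING;
	slot_unlock(s);
}

/* Updater side: remove the migration marker with the lock held. */
void slot_remove_migrating(struct slot *s)
{
	slot_lock(s);
	s->state = SLOT_PRESENT;
	s->page_locked = false;
	slot_unlock(s);
}

/*
 * Reader side: check the slot under the lock. Because both updaters
 * hold the same lock, the reader never sees a half-updated slot and
 * the assertion cannot fire.
 */
bool slot_must_wait(struct slot *s)
{
	bool wait;

	slot_lock(s);
	wait = (s->state == SLOT_MIGRATING);
	if (wait)
		assert(s->page_locked);
	slot_unlock(s);
	return wait;
}
```

If the updaters skipped slot_lock(), a concurrent reader could in principle see state and page_locked out of sync, which is the kind of window the diff above is meant to close.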