On 2/9/22 23:31, Aneesh Kumar K.V wrote:
> This fixes the below crash:
>
> kernel BUG at include/linux/mm.h:2373!
> cpu 0x5d: Vector: 700 (Program Check) at [c00000003c6e76e0]
>     pc: c000000000581a54: pmd_to_page+0x54/0x80
>     lr: c00000000058d184: move_hugetlb_page_tables+0x4e4/0x5b0
>     sp: c00000003c6e7980
>    msr: 9000000000029033
>   current = 0xc00000003bd8d980
>   paca    = 0xc000200fff610100 irqmask: 0x03 irq_happened: 0x01
>     pid   = 9349, comm = hugepage-mremap
> kernel BUG at include/linux/mm.h:2373!
> [link register ] c00000000058d184 move_hugetlb_page_tables+0x4e4/0x5b0
> [c00000003c6e7980] c00000000058cecc move_hugetlb_page_tables+0x22c/0x5b0 (unreliable)
> [c00000003c6e7a90] c00000000053b78c move_page_tables+0xdbc/0x1010
> [c00000003c6e7bd0] c00000000053bc34 move_vma+0x254/0x5f0
> [c00000003c6e7c90] c00000000053c790 sys_mremap+0x7c0/0x900
> [c00000003c6e7db0] c00000000002c450 system_call_exception+0x160/0x2c0
>
> the kernel can't use huge_pte_offset before it set the pte entry because a page table
> lookup check for huge PTE bit in the page table to differentiate between a
> huge pte entry and a pointer to pte page. A huge_pte_alloc won't mark the
> page table entry huge and hence kernel should not use huge_pte_offset after
> a huge_pte_alloc.

Thanks Aneesh!

Architectures that use the default version of huge_pte_offset (like X86)
'got away' with this because of the default return:

	pmd = pmd_offset(pud, addr);
	/* must be pmd huge, non-present or none */
	return (pte_t *)pmd;

>
> Cc: Mina Almasry <almasrymina@xxxxxxxxxx>
> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>

Should we add a Fixes: tag and cc stable?

> ---
>  mm/hugetlb.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)

Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
-- 
Mike Kravetz

>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 61895cc01d09..e57650a9404f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4851,14 +4851,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>  }
>
>  static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
> -			  unsigned long new_addr, pte_t *src_pte)
> +			  unsigned long new_addr, pte_t *src_pte, pte_t *dst_pte)
>  {
>  	struct hstate *h = hstate_vma(vma);
>  	struct mm_struct *mm = vma->vm_mm;
> -	pte_t *dst_pte, pte;
>  	spinlock_t *src_ptl, *dst_ptl;
> +	pte_t pte;
>
> -	dst_pte = huge_pte_offset(mm, new_addr, huge_page_size(h));
>  	dst_ptl = huge_pte_lock(h, mm, dst_pte);
>  	src_ptl = huge_pte_lockptr(h, mm, src_pte);
>
> @@ -4917,7 +4916,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
>  		if (!dst_pte)
>  			break;
>
> -		move_huge_pte(vma, old_addr, new_addr, src_pte);
> +		move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
>  	}
>  	flush_tlb_range(vma, old_end - len, old_end);
>  	mmu_notifier_invalidate_range_end(&range);
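
For anyone following along, the difference between the two lookup behaviours
discussed in this thread can be sketched with a toy userspace model. This is
not kernel code: every name below (toy_entry_t, TOY_HUGE_BIT, toy_alloc,
toy_offset_default, toy_offset_strict) is made up for illustration, and the
"strict" lookup is only a rough stand-in for an architecture whose page table
walk checks the huge bit before returning an entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag meaning "this entry is a huge PTE". */
#define TOY_HUGE_BIT 0x1u

/* Hypothetical one-level table entry; 0 means "none". */
typedef uint64_t toy_entry_t;

/*
 * Toy stand-in for huge_pte_alloc(): returns the slot for idx but does
 * not write anything into it, so the huge bit is still clear.
 */
static toy_entry_t *toy_alloc(toy_entry_t *table, size_t idx)
{
	return &table[idx];
}

/*
 * Toy stand-in for the default lookup: returns the slot no matter what
 * it holds (huge, non-present or none).
 */
static toy_entry_t *toy_offset_default(toy_entry_t *table, size_t idx)
{
	return &table[idx];
}

/*
 * Toy stand-in for a lookup that relies on the huge bit: an entry that
 * has not been marked huge yet is not reported as a huge PTE.
 */
static toy_entry_t *toy_offset_strict(toy_entry_t *table, size_t idx)
{
	return (table[idx] & TOY_HUGE_BIT) ? &table[idx] : NULL;
}
```

In this model, calling toy_offset_strict() right after toy_alloc() returns
NULL, while toy_offset_default() returns the same slot that toy_alloc() did.
Handing the pointer from the alloc step straight to the consumer, as the patch
does with dst_pte, avoids depending on the second lookup at all.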