On Tue, 18 Jul 2023, Aneesh Kumar K.V wrote: > Hugh Dickins <hughd@xxxxxxxxxx> writes: > > > Instead of pte_lockptr(), use the recently added pte_offset_map_nolock() > > in assert_pte_locked(). BUG if pte_offset_map_nolock() fails: this is > > stricter than the previous implementation, which skipped when pmd_none() > > (with a comment on khugepaged collapse transitions): but wouldn't we want > > to know, if an assert_pte_locked() caller can be racing such transitions? > > > > The reason we had that pmd_none check there was to handle khugpaged. In > case of khugepaged we do pmdp_collapse_flush and then do a ptep_clear. > ppc64 had the assert_pte_locked check inside that ptep_clear. > > _pmd = pmdp_collapse_flush(vma, address, pmd); > .. > ptep_clear() > -> asset_ptep_locked() > ---> pmd_none > -----> BUG > > > The problem is how assert_pte_locked() verify whether we are holding > ptl. It does that by walking the page table again and in this specific > case by the time we call the function we already had cleared pmd . Aneesh, please clarify, I've spent hours on this. >From all your use of past tense ("had"), I thought you were Acking my patch; but now, after looking again at v3.11 source and today's, I think you are NAKing my patch in its present form. You are pointing out that anon THP's __collapse_huge_page_copy_succeeded() uses ptep_clear() at a point after pmdp_collapse_flush() already cleared *pmd, so my patch now leads that one use of assert_pte_locked() to BUG. Is that your point? I can easily restore that khugepaged comment (which had appeared to me out of date at the time, but now looks still relevant) and pmd_none(*pmd) check: but please clarify. Thanks, Hugh > > > > This mod might cause new crashes: which either expose my ignorance, or > > indicate issues to be fixed, or limit the usage of assert_pte_locked(). > > > > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx> > > --- > > arch/powerpc/mm/pgtable.c | 16 ++++++---------- > > 1 file changed, 6 insertions(+), 10 deletions(-) > > > > diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c > > index cb2dcdb18f8e..16b061af86d7 100644 > > --- a/arch/powerpc/mm/pgtable.c > > +++ b/arch/powerpc/mm/pgtable.c > > @@ -311,6 +311,8 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr) > > p4d_t *p4d; > > pud_t *pud; > > pmd_t *pmd; > > + pte_t *pte; > > + spinlock_t *ptl; > > > > if (mm == &init_mm) > > return; > > @@ -321,16 +323,10 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr) > > pud = pud_offset(p4d, addr); > > BUG_ON(pud_none(*pud)); > > pmd = pmd_offset(pud, addr); > > - /* > > - * khugepaged to collapse normal pages to hugepage, first set > > - * pmd to none to force page fault/gup to take mmap_lock. After > > - * pmd is set to none, we do a pte_clear which does this assertion > > - * so if we find pmd none, return. > > - */ > > - if (pmd_none(*pmd)) > > - return; > > - BUG_ON(!pmd_present(*pmd)); > > - assert_spin_locked(pte_lockptr(mm, pmd)); > > + pte = pte_offset_map_nolock(mm, pmd, addr, &ptl); > > + BUG_ON(!pte); > > + assert_spin_locked(ptl); > > + pte_unmap(pte); > > } > > #endif /* CONFIG_DEBUG_VM */ > > > > -- > > 2.35.3