
The patch titled
     Subject: mm/hugetlb: Fix pgtable lock on pmd sharing
has been added to the -mm mm-unstable branch.  Its filename is
     mm-hugetlb-fix-pgtable-lock-on-pmd-sharing.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-hugetlb-fix-pgtable-lock-on-pmd-sharing.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Peter Xu <peterx@xxxxxxxxxx>
Subject: mm/hugetlb: Fix pgtable lock on pmd sharing
Date: Mon, 12 Jun 2023 12:04:20 -0400

Huge pmd sharing operates on the PUD, not the PMD, so huge_pte_lock() is
not suitable in this case: it should only be used for last-level pte
changes, while pmd sharing is always one level higher.

Meanwhile, here we're taking the spte pgtable lock, which is not even a
lock for the current mm but someone else's.

Operating on that lock even seems racy: after put_page() of the spte
pgtable page the page can logically be released, so at the very least
the spin_unlock() needs to be done before the put_page().

There is no report that I am aware of.  I'm not even sure whether taking
the spte pmd lock just happens to work: while we're holding the i_mmap
read lock the vma interval tree is probably frozen, so all pte allocators
over this pud entry could always find the specific svma and spte page,
and maybe they'll serialize on this spte page lock?  Even so, that
doesn't seem to be intended.  It just seems to be an accident of
cb900f412154.

Fix it with the proper pud lock (which is the mm's page_table_lock).

Link: https://lkml.kernel.org/r/20230612160420.809818-1-peterx@xxxxxxxxxx
Fixes: cb900f412154 ("mm, hugetlb: convert hugetlbfs to use split pmd lock")
Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/hugetlb.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/mm/hugetlb.c~mm-hugetlb-fix-pgtable-lock-on-pmd-sharing
+++ a/mm/hugetlb.c
@@ -7130,7 +7130,6 @@ pte_t *huge_pmd_share(struct mm_struct *
 	unsigned long saddr;
 	pte_t *spte = NULL;
 	pte_t *pte;
-	spinlock_t *ptl;
 
 	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
@@ -7151,7 +7150,7 @@ pte_t *huge_pmd_share(struct mm_struct *
 	if (!spte)
 		goto out;
 
-	ptl = huge_pte_lock(hstate_vma(vma), mm, spte);
+	spin_lock(&mm->page_table_lock);
 	if (pud_none(*pud)) {
 		pud_populate(mm, pud,
 				(pmd_t *)((unsigned long)spte & PAGE_MASK));
@@ -7159,7 +7158,7 @@ pte_t *huge_pmd_share(struct mm_struct *
 	} else {
 		put_page(virt_to_page(spte));
 	}
-	spin_unlock(ptl);
+	spin_unlock(&mm->page_table_lock);
 out:
 	pte = (pte_t *)pmd_alloc(mm, pud, addr);
 	i_mmap_unlock_read(mapping);
_

Patches currently in -mm which might be from peterx@xxxxxxxxxx are

mm-hugetlb-fix-pgtable-lock-on-pmd-sharing.patch