From: Zi Yan <ziy@xxxxxxxxxx>

Originally, zap_pmd_range() checks the pmd value without taking the pmd
lock. This can leave a pmd_protnone entry unfreed, because changing a
pmd entry to a pmd_protnone entry takes two steps: first, the pmd entry
is cleared to a pmd_none entry; then, the pmd_none entry is changed into
a pmd_protnone entry. The racy check, even with a barrier, might see
only the pmd_none entry in zap_pmd_range(), so the mapping is neither
split nor zapped.

Later, in free_pmd_range(), pmd_none_or_clear_bad() will see the
pmd_protnone entry and clear it as a pmd_bad entry. Furthermore, since
the pmd_protnone entry is not properly freed, the corresponding
deposited pte page table is not freed either.

This causes a memory leak, or a kernel crash if VM_BUG_ON() is enabled.

This patch relies on __split_huge_pmd_locked() and
__zap_huge_pmd_locked().

Signed-off-by: Zi Yan <zi.yan@xxxxxxxxxxxxxx>
---
 mm/memory.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 3929b015faf7..7cfdd5208ef5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1233,33 +1233,31 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb,
 				struct zap_details *details)
 {
 	pmd_t *pmd;
+	spinlock_t *ptl;
 	unsigned long next;
 
 	pmd = pmd_offset(pud, addr);
+	ptl = pmd_lock(vma->vm_mm, pmd);
 	do {
 		next = pmd_addr_end(addr, end);
 		if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
 			if (next - addr != HPAGE_PMD_SIZE) {
 				VM_BUG_ON_VMA(vma_is_anonymous(vma) &&
 				    !rwsem_is_locked(&tlb->mm->mmap_sem), vma);
-				__split_huge_pmd(vma, pmd, addr, false, NULL);
-			} else if (zap_huge_pmd(tlb, vma, pmd, addr))
-				goto next;
+				__split_huge_pmd_locked(vma, pmd, addr, false);
+			} else if (__zap_huge_pmd_locked(tlb, vma, pmd, addr))
+				continue;
 			/* fall through */
 		}
-		/*
-		 * Here there can be other concurrent MADV_DONTNEED or
-		 * trans huge page faults running, and if the pmd is
-		 * none or trans huge it can change under us. This is
-		 * because MADV_DONTNEED holds the mmap_sem in read
-		 * mode.
-		 */
-		if (pmd_none_or_trans_huge_or_clear_bad(pmd))
-			goto next;
+
+		if (pmd_none_or_clear_bad(pmd))
+			continue;
+		spin_unlock(ptl);
 		next = zap_pte_range(tlb, vma, pmd, addr, next, details);
-next:
 		cond_resched();
+		spin_lock(ptl);
 	} while (pmd++, addr = next, addr != end);
+	spin_unlock(ptl);
 
 	return addr;
 }
-- 
2.11.0
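
For readers who want to see the interleaving from the changelog in
isolation, here is a minimal userspace sketch. It is not kernel code:
every toy_* name is hypothetical, the "lock" is a plain flag, and the
race is replayed deterministically in a single thread rather than
reproduced with real concurrency. It only models the claim that a
lockless check can land between the two steps of the protnone change,
while a check that takes the pmd lock cannot.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the pmd states named in the changelog (not kernel types). */
enum toy_pmd_state { TOY_PMD_NONE, TOY_PMD_HUGE, TOY_PMD_PROTNONE };

struct toy_pmd {
	enum toy_pmd_state state;
	bool locked;		/* models the pmd lock being held */
};

/* Step 1 of the protection change: take the lock and clear the entry. */
static void toy_protnone_begin(struct toy_pmd *p)
{
	p->locked = true;
	p->state = TOY_PMD_NONE;
}

/* Step 2: install the protnone entry and drop the lock. */
static void toy_protnone_end(struct toy_pmd *p)
{
	p->state = TOY_PMD_PROTNONE;
	p->locked = false;
}

/* Zap one entry; returns true if something was actually zapped. */
static bool toy_zap(struct toy_pmd *p)
{
	if (p->state == TOY_PMD_NONE)
		return false;	/* looks empty, so the entry is skipped */
	p->state = TOY_PMD_NONE;
	return true;
}

/*
 * Replay the interleaving from the changelog: the zap path arrives
 * between the two steps. Returns the state left behind afterwards.
 */
static enum toy_pmd_state toy_replay(bool zap_takes_lock)
{
	struct toy_pmd p = { .state = TOY_PMD_HUGE, .locked = false };

	toy_protnone_begin(&p);
	if (!zap_takes_lock)
		toy_zap(&p);	/* lockless: sees the transient none entry */
	toy_protnone_end(&p);
	if (zap_takes_lock) {
		/* A locked zap would have waited until the lock was dropped. */
		assert(!p.locked);
		toy_zap(&p);
	}
	return p.state;
}
```

In this model, toy_replay(false) leaves TOY_PMD_PROTNONE behind, which
corresponds to the entry that free_pmd_range() later trips over, while
toy_replay(true) leaves TOY_PMD_NONE.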