On 5 Feb 2017, at 22:02, Hillf Danton wrote:

> On February 06, 2017 12:13 AM Zi Yan wrote:
>>
>> @@ -1233,33 +1233,31 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb,
>>  				struct zap_details *details)
>>  {
>>  	pmd_t *pmd;
>> +	spinlock_t *ptl;
>>  	unsigned long next;
>>
>>  	pmd = pmd_offset(pud, addr);
>> +	ptl = pmd_lock(vma->vm_mm, pmd);
>>  	do {
>>  		next = pmd_addr_end(addr, end);
>>  		if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
>>  			if (next - addr != HPAGE_PMD_SIZE) {
>>  				VM_BUG_ON_VMA(vma_is_anonymous(vma) &&
>>  				    !rwsem_is_locked(&tlb->mm->mmap_sem), vma);
>> -				__split_huge_pmd(vma, pmd, addr, false, NULL);
>> -			} else if (zap_huge_pmd(tlb, vma, pmd, addr))
>> -				goto next;
>> +				__split_huge_pmd_locked(vma, pmd, addr, false);
>> +			} else if (__zap_huge_pmd_locked(tlb, vma, pmd, addr))
>> +				continue;
>>  			/* fall through */
>>  		}
>> -		/*
>> -		 * Here there can be other concurrent MADV_DONTNEED or
>> -		 * trans huge page faults running, and if the pmd is
>> -		 * none or trans huge it can change under us. This is
>> -		 * because MADV_DONTNEED holds the mmap_sem in read
>> -		 * mode.
>> -		 */
>> -		if (pmd_none_or_trans_huge_or_clear_bad(pmd))
>> -			goto next;
>> +
>> +		if (pmd_none_or_clear_bad(pmd))
>> +			continue;
>> +		spin_unlock(ptl);
>>  		next = zap_pte_range(tlb, vma, pmd, addr, next, details);
>> -next:
>>  		cond_resched();
>> +		spin_lock(ptl);
>>  	} while (pmd++, addr = next, addr != end);
>
> spin_lock() is appointed to the bench of pmd_lock().

Any problem with this? The code is trying to lock this PMD page to avoid
other changes and only unlock it when we want to go deeper to PTE range.
Locking the PMD page for at most 512-entry handling should be acceptable,
since zap_pte_range() does similar work for 512 PTEs.

>
>> +	spin_unlock(ptl);
>>
>>  	return addr;
>>  }

--
Best Regards
Yan Zi
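
To make the locking pattern discussed above easier to reason about, here is
a minimal userspace sketch in C. It is not kernel code and not the patch
itself: a pthread mutex stands in for the PMD lock, and `toy_table`,
`toy_table_init()`, `toy_zap_lower()` and `toy_zap_range()` are hypothetical
names made up for illustration. The sketch further assumes that the
lower-level helper takes the same lock itself, which is why the caller has to
drop it before descending. It only shows the shape of the loop: hold the lock
while inspecting up to 512 entries, release it around the lower-level work,
and retake it before looking at the next entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_ENTRIES 512	/* entries per table, as in the 512-entry case above */

/* Hypothetical stand-in for one PMD page plus the lock that protects it. */
struct toy_table {
	pthread_mutex_t lock;
	int lower[TOY_ENTRIES];	/* nonzero: entry has lower-level work to zap */
};

static void toy_table_init(struct toy_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	for (size_t i = 0; i < TOY_ENTRIES; i++)
		t->lower[i] = 0;
}

/*
 * Stand-in for the lower-level walk. Assumption of this sketch: it takes
 * the table lock itself, so it must be called with the lock dropped.
 */
static void toy_zap_lower(struct toy_table *t, size_t i)
{
	pthread_mutex_lock(&t->lock);
	t->lower[i] = 0;
	pthread_mutex_unlock(&t->lock);
}

/* Walk [start, end) and return how many entries needed lower-level work. */
static size_t toy_zap_range(struct toy_table *t, size_t start, size_t end)
{
	size_t zapped = 0;

	assert(start <= end && end <= TOY_ENTRIES);

	pthread_mutex_lock(&t->lock);
	for (size_t i = start; i < end; i++) {
		/* Entry is inspected with the lock held. */
		if (t->lower[i] == 0)
			continue;	/* lock stays held across the skip */

		/* Drop the lock before descending, retake it afterwards. */
		pthread_mutex_unlock(&t->lock);
		toy_zap_lower(t, i);
		zapped++;
		pthread_mutex_lock(&t->lock);
	}
	pthread_mutex_unlock(&t->lock);

	return zapped;
}
```

Every path through the loop body reaches the next iteration with the lock
held, so the single unlock after the loop always pairs with either the
initial lock or the most recent relock. Each entry is also checked only
after the lock has been retaken, so a change made while the lock was dropped
is seen before the entry is acted on.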