> On Aug 8, 2019, at 9:37 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> On 08/07, Song Liu wrote:
>>
>> @@ -399,7 +399,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
>>  			spin_unlock(ptl);
>>  			return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
>>  		}
>> -	if (flags & FOLL_SPLIT) {
>> +	if (flags & (FOLL_SPLIT | FOLL_SPLIT_PMD)) {
>>  		int ret;
>>  		page = pmd_page(*pmd);
>>  		if (is_huge_zero_page(page)) {
>> @@ -408,7 +408,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
>>  			split_huge_pmd(vma, pmd, address);
>>  			if (pmd_trans_unstable(pmd))
>>  				ret = -EBUSY;
>> -		} else {
>> +		} else if (flags & FOLL_SPLIT) {
>>  			if (unlikely(!try_get_page(page))) {
>>  				spin_unlock(ptl);
>>  				return ERR_PTR(-ENOMEM);
>> @@ -420,6 +420,10 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
>>  			put_page(page);
>>  			if (pmd_none(*pmd))
>>  				return no_page_table(vma, flags);
>> +		} else {  /* flags & FOLL_SPLIT_PMD */
>> +			spin_unlock(ptl);
>> +			split_huge_pmd(vma, pmd, address);
>> +			ret = pte_alloc(mm, pmd) ? -ENOMEM : 0;
>>  		}
>
> Can't resist, let me repeat that I do not like this patch because imo
> it complicates this code for no reason.

Personally, I don't think this is more complicated than your version.
This patch is safe because it doesn't change any code for the
is_huge_zero_page() case.

Also, if some code calls follow_pmd_mask() with flags containing both
FOLL_SPLIT and FOLL_SPLIT_PMD, we should honor FOLL_SPLIT and split the
huge page. Of course, no code currently sets both flags.

Does this resolve your concern here?

Thanks,
Song
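
P.S. To make the branch order in the hunk above concrete, here is a
minimal userspace sketch of which path is taken for each flag
combination. It is only a model, not kernel code: the DEMO_* flag
values, the enum and pick_split_action() are hypothetical names made up
for illustration, and the real locking, refcounting and error handling
are left out.

```c
/*
 * Illustrative model of the branch order in the quoted hunk.
 * All names and flag values here are assumptions for the sketch;
 * they are not the kernel's definitions.
 */
#define DEMO_FOLL_SPLIT     0x1u  /* stands in for FOLL_SPLIT */
#define DEMO_FOLL_SPLIT_PMD 0x2u  /* stands in for FOLL_SPLIT_PMD */

enum split_action {
    SPLIT_NONE,       /* neither flag set: the split block is skipped */
    SPLIT_ZERO_PMD,   /* huge zero page: unchanged path, split the PMD */
    SPLIT_HUGE_PAGE,  /* FOLL_SPLIT set: split the huge page itself */
    SPLIT_PMD_ONLY    /* only FOLL_SPLIT_PMD set: split PMD, then pte_alloc */
};

static enum split_action pick_split_action(unsigned int flags, int is_huge_zero)
{
    /* Outer test: enter the block if either flag is set. */
    if (!(flags & (DEMO_FOLL_SPLIT | DEMO_FOLL_SPLIT_PMD)))
        return SPLIT_NONE;

    /* The huge zero page case is checked first and is not changed. */
    if (is_huge_zero)
        return SPLIT_ZERO_PMD;

    /* FOLL_SPLIT is tested before the final else, so it wins if both are set. */
    if (flags & DEMO_FOLL_SPLIT)
        return SPLIT_HUGE_PAGE;

    /* Only FOLL_SPLIT_PMD can reach this point. */
    return SPLIT_PMD_ONLY;
}
```

With both flags set and a non-zero huge page, the model returns
SPLIT_HUGE_PAGE, which matches the "honor FOLL_SPLIT" behaviour
described above.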