> On Aug 9, 2019, at 9:35 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> On 08/08, Song Liu wrote:
>>
>>> On Aug 8, 2019, at 9:37 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>>>
>>> On 08/07, Song Liu wrote:
>>>>
>>>> @@ -399,7 +399,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
>>>>  		spin_unlock(ptl);
>>>>  		return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
>>>>  	}
>>>> -	if (flags & FOLL_SPLIT) {
>>>> +	if (flags & (FOLL_SPLIT | FOLL_SPLIT_PMD)) {
>>>>  		int ret;
>>>>  		page = pmd_page(*pmd);
>>>>  		if (is_huge_zero_page(page)) {
>>>> @@ -408,7 +408,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
>>>>  			split_huge_pmd(vma, pmd, address);
>>>>  			if (pmd_trans_unstable(pmd))
>>>>  				ret = -EBUSY;
>>>> -		} else {
>>>> +		} else if (flags & FOLL_SPLIT) {
>>>>  			if (unlikely(!try_get_page(page))) {
>>>>  				spin_unlock(ptl);
>>>>  				return ERR_PTR(-ENOMEM);
>>>> @@ -420,6 +420,10 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
>>>>  			put_page(page);
>>>>  			if (pmd_none(*pmd))
>>>>  				return no_page_table(vma, flags);
>>>> +		} else {  /* flags & FOLL_SPLIT_PMD */
>>>> +			spin_unlock(ptl);
>>>> +			split_huge_pmd(vma, pmd, address);
>>>> +			ret = pte_alloc(mm, pmd) ? -ENOMEM : 0;
>>>>  		}
>>>
>>> Can't resist, let me repeat that I do not like this patch because imo
>>> it complicates this code for no reason.
>>
>> Personally, I don't think this is more complicated than your version.
>
> I do, but of course this is subjective.
>
>> Also, if some code calls follow_pmd_mask() with flags contains both
>> FOLL_SPLIT and FOLL_SPLIT_PMD, we should honor FOLL_SPLIT and split the
>> huge page.
>
> Heh. why not other way around?

Because FOLL_SPLIT splits both the page and the pmd. FOLL_SPLIT_PMD only
splits the pmd, so it is a subset of FOLL_SPLIT. When the user sets both,
we should split both the page and the pmd.

Thanks,
Song
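
P.S. To make the precedence argument concrete, here is a minimal
standalone sketch of the decision, not the kernel code itself. The flag
values are placeholders chosen for illustration (they are not copied from
the kernel headers), and choose_split() and enum split_action are
hypothetical names that exist only in this example. It only models which
kind of split is requested; it does not touch page tables.

```c
#include <assert.h>

/* Placeholder bit values for illustration only (assumption, not from mm.h). */
#define FOLL_SPLIT      0x1u
#define FOLL_SPLIT_PMD  0x2u

/* Hypothetical result type: which kind of split the caller asked for. */
enum split_action {
	SPLIT_NONE,         /* neither flag set */
	SPLIT_PMD_ONLY,     /* split the pmd mapping, leave the huge page whole */
	SPLIT_PAGE_AND_PMD  /* split the huge page, which also splits the pmd */
};

/*
 * FOLL_SPLIT is checked first because it is the stronger request:
 * splitting the page implies splitting the pmd, so when both flags
 * are set the pmd-only request is already covered.
 */
static enum split_action choose_split(unsigned int flags)
{
	if (flags & FOLL_SPLIT)
		return SPLIT_PAGE_AND_PMD;
	if (flags & FOLL_SPLIT_PMD)
		return SPLIT_PMD_ONLY;
	return SPLIT_NONE;
}

/* Small self-check covering the four flag combinations. */
void check_split_examples(void)
{
	assert(choose_split(0) == SPLIT_NONE);
	assert(choose_split(FOLL_SPLIT_PMD) == SPLIT_PMD_ONLY);
	assert(choose_split(FOLL_SPLIT) == SPLIT_PAGE_AND_PMD);
	assert(choose_split(FOLL_SPLIT | FOLL_SPLIT_PMD) == SPLIT_PAGE_AND_PMD);
}
```

Checking FOLL_SPLIT_PMD first would silently drop the page split when
both flags are set, which is why the order in the patch's else-if chain
matters.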