> On Jun 13, 2019, at 7:16 AM, Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote: > > On Thu, Jun 13, 2019 at 01:57:30PM +0000, Song Liu wrote: >>> And I'm not convinced that it belongs here at all. User requested PMD >>> split and it is done after split_huge_pmd(). The rest can be handled by >>> the caller as needed. >> >> I put this part here because split_huge_pmd() for file-backed THP is >> not really done after split_huge_pmd(). And I would like it done before >> calling follow_page_pte() below. Maybe we can still do them here, just >> for file-backed THPs? >> >> If we would move it, shall we move to callers of follow_page_mask()? >> In that case, we will probably end up with similar code in two places: >> __get_user_pages() and follow_page(). >> >> Did I get this right? > > Would it be enough to replace pte_offset_map_lock() in follow_page_pte() > with pte_alloc_map_lock()? This is similar to my previous version: + } else { /* flags & FOLL_SPLIT_PMD */ + pte_t *pte; + spin_unlock(ptl); + split_huge_pmd(vma, pmd, address); + pte = get_locked_pte(mm, address, &ptl); + if (!pte) + return no_page_table(vma, flags); + spin_unlock(ptl); + ret = 0; + } I think this is cleaner than use pte_alloc_map_lock() in follow_page_pte(). What's your thought on these two versions (^^^ vs. pte_alloc_map_lock)? > This will leave bunch not populated PTE entries, but it is fine: they will > be populated on the next access to them. We need to handle page fault during next access, right? Since we already allocated everything, we can just populate the PTE entries and saves a lot of page faults (assuming we will access them later). Thanks, Song > > -- > Kirill A. Shutemov