On 3/24/20 8:55 AM, Jason Gunthorpe wrote:
> On Tue, Mar 24, 2020 at 08:25:09AM -0700, Mike Kravetz wrote:
>> On 3/24/20 4:55 AM, Jason Gunthorpe wrote:
>>> Also, since CH moved all the get_user_pages_fast code out of the
>>> arch's many/all archs can drop their arch specific version of this
>>> routine. This is really just a specialized version of gup_fast's
>>> algorithm..
>>>
>>> (also the arch versions seem different, why do some return actual
>>> ptes, not null?)
>>
>> Not sure I understand that last question. The return value should be
>> a *pte or null.
>
> I mean the common code ends like this:
>
>         pmd = pmd_offset(pud, addr);
>         if (sz != PMD_SIZE && pmd_none(*pmd))
>                 return NULL;
>         /* hugepage or swap? */
>         if (pmd_huge(*pmd) || !pmd_present(*pmd))
>                 return (pte_t *)pmd;
>
>         return NULL;
>
> So it always returns a pointer into a PUD or PMD, while say, ppc
> in __find_linux_pte() ends like:
>
>         return pte_offset_kernel(&pmd, ea);
>
> Which is pointing to a PTE

Ok, now I understand the question. huge_pte_offset will/should only be
called for addresses that are in a vma backed by hugetlb pages. So,
pte_offset_kernel() will only return page table type (PUD/PMD/etc)
associated with a huge page supported by the particular arch.

> So does sparc:
>
>         pmd = pmd_offset(pud, addr);
>         if (pmd_none(*pmd))
>                 return NULL;
>         if (is_hugetlb_pmd(*pmd))
>                 return (pte_t *)pmd;
>         return pte_offset_map(pmd, addr);
>
> Which is even worse because it is leaking a kmap..
>
> etc
>
>> /*
>>  * huge_pte_offset() - Walk the page table to resolve the hugepage
>>  * entry at address @addr
>>  *
>>  * Return: Pointer to page table or swap entry (PUD or PMD) for
>                                          ^^^^^^^^^^^^^^^^^^^
>
> Ie the above is not followed by the archs
>
> I'm also scratching my head that a function that returns a pte_t *
> always returns a PUD or PMD. Strange bit of type casting..

Yes, the casting is curious. The casting continues in potential
subsequent calls to huge_pte_alloc().

-- 
Mike Kravetz