On 3/24/20 10:59 AM, Jason Gunthorpe wrote:
> On Tue, Mar 24, 2020 at 09:19:29AM -0700, Mike Kravetz wrote:
>> On 3/24/20 8:55 AM, Jason Gunthorpe wrote:
>>> On Tue, Mar 24, 2020 at 08:25:09AM -0700, Mike Kravetz wrote:
>>>> On 3/24/20 4:55 AM, Jason Gunthorpe wrote:
>>>>> Also, since CH moved all the get_user_pages_fast code out of the
>>>>> arch's many/all archs can drop their arch specific version of this
>>>>> routine. This is really just a specialized version of gup_fast's
>>>>> algorithm..
>>>>>
>>>>> (also the arch versions seem different, why do some return actual
>>>>> ptes, not null?)
>>>>
>>>> Not sure I understand that last question. The return value should be
>>>> a *pte or null.
>>>
>>> I mean the common code ends like this:
>>>
>>>     pmd = pmd_offset(pud, addr);
>>>     if (sz != PMD_SIZE && pmd_none(*pmd))
>>>         return NULL;
>>>     /* hugepage or swap? */
>>>     if (pmd_huge(*pmd) || !pmd_present(*pmd))
>>>         return (pte_t *)pmd;
>>>
>>>     return NULL;
>>>
>>> So it always returns a pointer into a PUD or PMD, while say, ppc
>>> in __find_linux_pte() ends like:
>>>
>>>     return pte_offset_kernel(&pmd, ea);
>>>
>>> Which is pointing to a PTE
>>
>> Ok, now I understand the question. huge_pte_offset will/should only be
>> called for addresses that are in a vma backed by hugetlb pages. So,
>> pte_offset_kernel() will only return page table type (PUD/PMD/etc) associated
>> with a huge page supported by the particular arch.
>
> I thought pte_offset_kernel always returns PTEs (ie the 4k entries on
> x86), I suppose what you are saying is that since the caller knows
> this is always a PUD or PMD due to the VMA the pte_offset is dead code.

Yes, for x86 the address will correspond to a PUD or PMD or NULL. For huge
page mappings/vmas on x86, there are no corresponding PTEs.

--
Mike Kravetz