On Tue, Mar 24, 2020 at 09:19:29AM -0700, Mike Kravetz wrote:
> On 3/24/20 8:55 AM, Jason Gunthorpe wrote:
> > On Tue, Mar 24, 2020 at 08:25:09AM -0700, Mike Kravetz wrote:
> >> On 3/24/20 4:55 AM, Jason Gunthorpe wrote:
> >>> Also, since CH moved all the get_user_pages_fast code out of the
> >>> arch's many/all archs can drop their arch specific version of this
> >>> routine. This is really just a specialized version of gup_fast's
> >>> algorithm..
> >>>
> >>> (also the arch versions seem different, why do some return actual
> >>> ptes, not null?)
> >>
> >> Not sure I understand that last question. The return value should be
> >> a *pte or null.
> >
> > I mean the common code ends like this:
> >
> >         pmd = pmd_offset(pud, addr);
> >         if (sz != PMD_SIZE && pmd_none(*pmd))
> >                 return NULL;
> >         /* hugepage or swap? */
> >         if (pmd_huge(*pmd) || !pmd_present(*pmd))
> >                 return (pte_t *)pmd;
> >
> >         return NULL;
> >
> > So it always returns a pointer into a PUD or PMD, while say, ppc
> > in __find_linux_pte() ends like:
> >
> >         return pte_offset_kernel(&pmd, ea);
> >
> > Which is pointing to a PTE
>
> Ok, now I understand the question. huge_pte_offset will/should only be
> called for addresses that are in a vma backed by hugetlb pages. So,
> pte_offset_kernel() will only return page table type (PUD/PMD/etc) associated
> with a huge page supported by the particular arch.

I thought pte_offset_kernel always returns PTEs (ie the 4k entries on
x86), I suppose what you are saying is that since the caller knows
this is always a PUD or PMD due to the VMA the pte_offset is dead
code.

> > So does sparc:
> >
> >         pmd = pmd_offset(pud, addr);
> >         if (pmd_none(*pmd))
> >                 return NULL;
> >         if (is_hugetlb_pmd(*pmd))
> >                 return (pte_t *)pmd;
> >         return pte_offset_map(pmd, addr);
> >
> > Which is even worse because it is leaking a kmap..

Particularly here which is buggy dead code :)

Jason