On 22/11/2023 at 16:22, Peter Xu wrote:
> On Wed, Nov 22, 2023 at 12:00:24AM -0800, Christoph Hellwig wrote:
>> On Tue, Nov 21, 2023 at 10:59:35AM -0500, Peter Xu wrote:
>>>> What prevents us from ever using hugepd with file mappings? I think
>>>> it would naturally fit in with how large folios for the pagecache work.
>>>>
>>>> So keeping this check and generalizing it seems like the better idea to
>>>> me.
>>>
>>> But then it means we're still keeping that dead code for fast-gup even if
>>> we know that fact.. Or do we have a plan to add that support very soon, so
>>> this code will be destined to add back?
>>
>> The question wasn't mean retorical - we support arbitrary power of two
>> sized folios for the pagepage, what prevents us from using hugepd with
>> them right now?
>
> Ah, didn't catch that point previously. Hugepd is just not used outside
> hugetlb right now, afaiu.
>
> For example, __hugepte_alloc() (and that's the only one calls
> hugepd_populate()) should be the function to allocate a hugepd (ppc only),
> and it's only called in huge_pte_alloc(), which is part of the current
> arch-specific hugetlb api.
>
> And generic mm paths don't normally have hugepd handling, afaics. For
> example, page_vma_mapped_walk() doesn't handle hugepd at all unless in
> hugetlb specific path.
>
> There're actually (only) two generic mm paths that can handle hugepd,
> namely:
>
>   - fast-gup
>   - walk_page_*() apis (aka, __walk_page_range())
>
> For fast-gup I think the hugepd code is in use, however for walk_page_*
> apis hugepd code shouldn't be reached iiuc as we have the hugetlb specific
> handling (walk_hugetlb_range()), so anything within walk_pgd_range() to hit
> a hugepd can be dead code to me (but note that this "dead code" is good
> stuff to me, if one would like to merge hugetlb instead into generic mm).

Not sure what you mean here. What do you mean by "dead code"?
A hugepage directory can be plugged in at any page table level, from PGD to PMD.
So the following bit in walk_pgd_range() is valid and not dead:

	if (is_hugepd(__hugepd(pgd_val(*pgd))))
		err = walk_hugepd_range((hugepd_t *)pgd, addr, next, walk, PGDIR_SHIFT);

>
> This series tries to add slow gup into that list too, so the 3rd one to
> support it. I plan to look more into this area (e.g., __walk_page_range()
> can be another good candidate soon). I'm not sure whether we should teach
> the whole mm to understand hugepd yet, but slow gup and __walk_page_range()
> does look like good candidates to already remove the hugetlb specific code
> paths - slow-gup has average ~add/~del LOCs (which this series does), and
> __walk_page_range() can remove some code logically, no harm I yet see.
>
> Indeed above are based on only my code observations, so I'll be more than
> happy to be corrected otherwise, as early as possible.
>
>>
>>> The other option is I can always add a comment above gup_huge_pd()
>>> explaining this special bit, so that when someone is adding hugepd support
>>> to file large folios we'll hopefully not forget it? But then that
>>> generalization work will only happen when the code will be needed.
>>
>> If dropping the check is the right thing for now (and I think the ppc
>> maintainers and willy as the large folio guy might have a more useful
>> opinions than I do), leaving a comment in would be very useful.
>
> Willy is in the loop, and I just notice I didn't really copy ppc list, even
> I planned to.. I am adding the list (linuxppc-dev@xxxxxxxxxxxxxxxx) into
> this reply. I'll remember to do so as long as there's a new version.
>
> The other reason I feel like hugepd may or may not be further developed for
> new features like large folio is that I saw Power9 started to shift to
> radix pgtables, and afaics hugepd is only supported in hash tables
> (hugepd_ok()). But again, I confess I know nothing about Power at all.
>
> Thanks,
>