Le 15/04/2024 à 21:12, Christophe Leroy a écrit : > > > Le 12/04/2024 à 16:30, Peter Xu a écrit : >> On Fri, Apr 12, 2024 at 02:08:03PM +0000, Christophe Leroy wrote: >>> >>> >>> Le 11/04/2024 à 18:15, Peter Xu a écrit : >>>> On Mon, Mar 25, 2024 at 01:38:40PM -0300, Jason Gunthorpe wrote: >>>>> On Mon, Mar 25, 2024 at 03:55:53PM +0100, Christophe Leroy wrote: >>>>>> This series reimplements hugepages with hugepd on powerpc 8xx. >>>>>> >>>>>> Unlike most architectures, powerpc 8xx HW requires a two-level >>>>>> pagetable topology for all page sizes. So a leaf PMD-contig approach >>>>>> is not feasible as such. >>>>>> >>>>>> Possible sizes are 4k, 16k, 512k and 8M. >>>>>> >>>>>> First level (PGD/PMD) covers 4M per entry. For 8M pages, two PMD >>>>>> entries >>>>>> must point to a single entry level-2 page table. Until now that was >>>>>> done using hugepd. This series changes it to use standard page tables >>>>>> where the entry is replicated 1024 times on each of the two >>>>>> pagetables >>>>>> refered by the two associated PMD entries for that 8M page. >>>>>> >>>>>> At the moment it has to look into each helper to know if the >>>>>> hugepage ptep is a PTE or a PMD in order to know it is a 8M page or >>>>>> a lower size. I hope this can me handled by core-mm in the future. >>>>>> >>>>>> There are probably several ways to implement stuff, so feedback is >>>>>> very welcome. >>>>> >>>>> I thought it looks pretty good! >>>> >>>> I second it. >>>> >>>> I saw the discussions in patch 1. Christophe, I suppose you're >>>> exploring >>>> the big hammer over hugepd, and perhaps went already with the 32bit pmd >>>> solution for nohash/32bit challenge you mentioned? >>>> >>>> I'm trying to position my next step; it seems like at least I should >>>> not >>>> adding any more hugepd code, then should I go with ARCH_HAS_HUGEPD >>>> checks, >>>> or you're going to have an RFC soon then I can base on top? >>> >>> Depends on what you expect by "soon". >>> >>> I sure won't be able to send any RFC before end of April. >>> >>> Should be possible to have something during May. >> >> That's good enough, thanks. I'll see what is the best I can do. >> >> Then do you think I can leave p4d/pgd leaves alone? Please check the >> other >> email where I'm not sure whether pgd leaves ever existed for any of >> PowerPC. That's so far what I plan to do, on teaching pgtable walkers >> recognize pud and lower for all leaves. Then if Power can switch from >> hugepd to this it should just work. > > Well, if I understand correctly, something with no PMD will include > <asm-generic/pgtable-nopmd.h> and will therefore only have .... pmd > entries (hence no pgd/p4d/pud entries). Looks odd but that's what it is. > pgd_populate(), p4d_populate(), pud_populate() are all "do { } while > (0)" and only pmd_populate exists. So only pmd_leaf() will exist in that > case. > > And therefore including <asm-generic/pgtable-nop4d.h> means .... you > have p4d entries. Doesn't mean you have p4d_leaf() but that needs to be > checked. > > >> >> Even if pgd exists (then something I overlooked..), I'm wondering whether >> we can push that downwards to be either pud/pmd (and looks like we all >> agree p4d is never used on Power). That may involve some pgtable >> operations moving from pgd level to lower, e.g. my pure imagination would >> look like starting with: > > Yes I think there is no doubt that p4d is never used: > > arch/powerpc/include/asm/book3s/32/pgtable.h:#include > <asm-generic/pgtable-nopmd.h> > arch/powerpc/include/asm/book3s/64/pgtable.h:#include > <asm-generic/pgtable-nop4d.h> > arch/powerpc/include/asm/nohash/32/pgtable.h:#include > <asm-generic/pgtable-nopmd.h> > arch/powerpc/include/asm/nohash/64/pgtable-4k.h:#include > <asm-generic/pgtable-nop4d.h> > > But that means that PPC32 have pmd entries and PPC64 have p4d entries ... > >> >> #define PTE_INDEX_SIZE PTE_SHIFT >> #define PMD_INDEX_SIZE 0 >> #define PUD_INDEX_SIZE 0 >> #define PGD_INDEX_SIZE (32 - PGDIR_SHIFT) >> >> To: >> >> #define PTE_INDEX_SIZE PTE_SHIFT >> #define PMD_INDEX_SIZE (32 - PMD_SHIFT) >> #define PUD_INDEX_SIZE 0 >> #define PGD_INDEX_SIZE 0 > > But then you can't anymore have #define PTRS_PER_PMD 1 from > <asm-generic/pgtable-nop4d.h> > >> >> And the rest will need care too. I hope moving downward is easier >> (e.g. the walker should always exist for lower levels but not always for >> higher levels), but I actually have little idea on whether there's any >> other implications, so please bare with me on stupid mistakes. >> >> I just hope pgd leaves don't exist already, then I think it'll be >> simpler. >> >> Thanks, >> Digging into asm-generic/pgtable-nopmd.h, I see a definition of pud_leaf() always returning 0, introduced by commit 2c8a81dc0cc5 ("riscv/mm: fix two page table check related issues") So should asm-generic/pgtable-nopud.h contain the same for p4d_leaf() and asm-generic/pgtable-nop4d.h contain the same for pgd_leaf() ? Christophe