On 5/30/22 12:56, Peter Xu wrote:
> Hi, Mike,
>
> On Fri, May 27, 2022 at 03:58:47PM -0700, Mike Kravetz wrote:
>> +unsigned long hugetlb_mask_last_hp(struct hstate *h)
>> +{
>> +	unsigned long hp_size = huge_page_size(h);
>> +
>> +	if (hp_size == P4D_SIZE)
>> +		return PGDIR_SIZE - P4D_SIZE;
>> +	else if (hp_size == PUD_SIZE)
>> +		return P4D_SIZE - PUD_SIZE;
>> +	else if (hp_size == PMD_SIZE)
>> +		return PUD_SIZE - PMD_SIZE;
>> +
>> +	return ~(0);
>> +}
>
> How about:
>
> unsigned long hugetlb_mask_last_hp(struct hstate *h)
> {
> 	unsigned long hp_size = huge_page_size(h);
>
> 	return hp_size * (PTRS_PER_PTE - 1);
> }
>
> ?
>
> This is definitely a good idea, though I'm wondering the possibility to go
> one step further to make hugetlb pgtable walk just like the normal pages.
>
> Say, would it be non-trivial to bring some of huge_pte_offset() into the
> walker functions, so that we can jump over even larger than PTRS_PER_PTE
> entries (e.g. when p4d==NULL for 2m huge pages)?  It's very possible I
> overlooked something, though.

Thanks Peter!  I did think of that as well.  But, I mostly wanted to throw
out this simple code while the idea of optimizations for sparse address
range traversing was fresh in my mind.

I'll take a closer look and see if we can use those general walker routines.
If we can, it would be great.
-- 
Mike Kravetz