Joao Martins <joao.m.martins@xxxxxxxxxx> writes:

> On 11/04/2023 15:18, Aneesh Kumar K.V wrote:
>> commit 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound
>> devmaps") added support for using optimized vmemmap for devdax devices. But
>> how vmemmap mappings are created is architecture specific. For example,
>> powerpc with hash translation doesn't have vmemmap mappings in the init_mm
>> page table; instead they are bolted entries in the hardware page table.
>>
>> vmemmap_populate_compound_pages(), used by the vmemmap optimization code, is
>> not aware of these architecture-specific mappings. Hence allow architectures
>> to opt in to this feature. I selected architectures supporting the
>> HUGETLB_PAGE_OPTIMIZE_VMEMMAP option as also supporting this feature.
>>
> Perhaps rephrase the last sentence to be in imperative form, e.g.:
>
> Architectures supporting HUGETLB_PAGE_OPTIMIZE_VMEMMAP option are selected
> when supporting this feature.
>
>> This patch fixes the below crash on ppc64.
>>
> Avoid the 'This patch' per submission guidelines, e.g.:
>
> 'On ppc64 (pmem) where this isn't supported, it fixes below crash:'

ok, will update in the next version.

>
>> BUG: Unable to handle kernel data access on write at 0xc00c000100400038
>> Faulting instruction address: 0xc000000001269d90
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> Modules linked in:
> ...
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 716d30d93616..c47f2186d2c2 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -3442,6 +3442,22 @@ void vmemmap_populate_print_last(void);
>>  void vmemmap_free(unsigned long start, unsigned long end,
>>  		struct vmem_altmap *altmap);
>>  #endif
>> +
>> +#ifdef CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMA
>
> You are missing a 'P'

I noticed that after sending v2 and sent v2-updated with that change.

>
>> +static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
>> +					struct dev_pagemap *pgmap)
>> +{
>> +	return is_power_of_2(sizeof(struct page)) &&
>> +		pgmap && (pgmap_vmemmap_nr(pgmap) > 1) && !altmap;
>> +}
>> +#else
>> +static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
>> +					struct dev_pagemap *pgmap)
>> +{
>> +	return false;
>> +}
>> +#endif
>> +
>>  void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
>>  				  unsigned long nr_pages);
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3bb3484563ed..292411d8816f 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6844,10 +6844,13 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
>>   * of an altmap. See vmemmap_populate_compound_pages().
>>   */
>>  static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
>> +					      struct dev_pagemap *pgmap,
>>  					      unsigned long nr_pages)
>>  {
>> -	return is_power_of_2(sizeof(struct page)) &&
>> -		!altmap ? 2 * (PAGE_SIZE / sizeof(struct page)) : nr_pages;
>> +	if (vmemmap_can_optimize(altmap, pgmap))
>> +		return 2 * (PAGE_SIZE / sizeof(struct page));
>> +	else
>> +		return nr_pages;
>>  }
>>
>
> Keep the ternary operator, as is already the case in compound_nr_pages, to
> avoid doing too much in one patch:
>
> 	return vmemmap_can_optimize(altmap, pgmap) ?
> 		2 * (PAGE_SIZE / sizeof(struct page)) : nr_pages;
>
> Or, if you really want to remove the ternary operator, perhaps take out the
> unnecessary else and make the long line less indented:
>
> 	if (!vmemmap_can_optimize(altmap, pgmap))
> 		return nr_pages;
>
> 	return 2 * (PAGE_SIZE / sizeof(struct page));
>
> I don't think the latter is a significant improvement over the ternary one.
> But I guess that's a matter of preferred style.

How about:

static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
					      struct dev_pagemap *pgmap)
{
	if (!vmemmap_can_optimize(altmap, pgmap))
		return pgmap_vmemmap_nr(pgmap);

	return 2 * (PAGE_SIZE / sizeof(struct page));
}
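
With that, the memmap_init_zone_device() call site would also drop the extra
argument; something like the below (sketch, untested):

	memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
			     compound_nr_pages(altmap, pgmap));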
>
>>  static void __ref memmap_init_compound(struct page *head,
>> @@ -6912,7 +6915,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
>>  			continue;
>>
>>  		memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
>> -				compound_nr_pages(altmap, pfns_per_compound));
>> +				compound_nr_pages(altmap, pgmap, pfns_per_compound));
>>  	}
>>
>>  	pr_info("%s initialised %lu pages in %ums\n", __func__,
>> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
>> index c5398a5960d0..10d73a0dfcec 100644
>> --- a/mm/sparse-vmemmap.c
>> +++ b/mm/sparse-vmemmap.c
>> @@ -458,8 +458,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
>>  	    !IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
>>  		return NULL;
>>
>> -	if (is_power_of_2(sizeof(struct page)) &&
>> -	    pgmap && pgmap_vmemmap_nr(pgmap) > 1 && !altmap)
>> +	if (vmemmap_can_optimize(altmap, pgmap))
>>  		r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
>>  	else
>>  		r = vmemmap_populate(start, end, nid, altmap);

-aneesh
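
P.S. To make the per-arch opt-in from the commit message concrete: the new
CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP guard keys off the select that
architectures supporting hugetlb vmemmap optimization already carry, so no new
per-arch code is needed. Illustrative sketch (I believe x86 already has this
select in mainline; the exact condition may vary by tree):

	config X86
		...
		select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP	if X86_64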