"Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> writes: > This is in preparation to update radix to implement vmemmap optimization > for devdax. Below are the rules w.r.t radix vmemmap mapping > > 1. First try to map things using PMD (2M) > 2. With altmap if altmap cross-boundary check returns true, fall back to > PAGE_SIZE > 3. If we can't allocate PMD_SIZE backing memory for vmemmap, fallback to > PAGE_SIZE > > On removing vmemmap mapping, check if every subsection that is using the > vmemmap area is invalid. If found to be invalid, that implies we can safely > free the vmemmap area. We don't use the PAGE_UNUSED pattern used by x86 > because with 64K page size, we need to do the above check even at the > PAGE_SIZE granularity. > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx> > --- > arch/powerpc/include/asm/book3s/64/radix.h | 2 + > arch/powerpc/include/asm/pgtable.h | 4 + > arch/powerpc/mm/book3s64/radix_pgtable.c | 326 +++++++++++++++++++-- > arch/powerpc/mm/init_64.c | 26 +- > 4 files changed, 327 insertions(+), 31 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h > index 2ef92f36340f..f1461289643a 100644 > --- a/arch/powerpc/include/asm/book3s/64/radix.h > +++ b/arch/powerpc/include/asm/book3s/64/radix.h > @@ -331,6 +331,8 @@ extern int __meminit radix__vmemmap_create_mapping(unsigned long start, > unsigned long phys); > int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end, > int node, struct vmem_altmap *altmap); > +void __ref radix__vmemmap_free(unsigned long start, unsigned long end, > + struct vmem_altmap *altmap); > extern void radix__vmemmap_remove_mapping(unsigned long start, > unsigned long page_size); > > diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h > index 6a88bfdaa69b..68817ea7f994 100644 > --- a/arch/powerpc/include/asm/pgtable.h > +++ b/arch/powerpc/include/asm/pgtable.h > @@ -165,6 +165,10 @@ static inline bool is_ioremap_addr(const void *x) > > return addr >= IOREMAP_BASE && addr < IOREMAP_END; > } > + > +int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size); > +bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long start, > + unsigned long page_size); > #endif /* CONFIG_PPC64 */ > > #endif /* __ASSEMBLY__ */ > diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c > index 227fea53c217..9a7f3707b6fb 100644 > --- a/arch/powerpc/mm/book3s64/radix_pgtable.c > +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c > @@ -744,8 +744,59 @@ static void free_pud_table(pud_t *pud_start, p4d_t *p4d) > p4d_clear(p4d); > } > > +#ifdef CONFIG_SPARSEMEM_VMEMMAP > +static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end) > +{ > + unsigned long start = ALIGN_DOWN(addr, PMD_SIZE); > + > + return !vmemmap_populated(start, PMD_SIZE); > +} > + > +static bool __meminit vmemmap_page_is_unused(unsigned long addr, unsigned long end) > +{ > + unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE); > + > + return !vmemmap_populated(start, PAGE_SIZE); > + > +} > +#endif > + > +static void __meminit free_vmemmap_pages(struct page *page, > + struct vmem_altmap *altmap, > + int order) > +{ > + unsigned int nr_pages = 1 << order; > + > + if (altmap) { > + unsigned long alt_start, alt_end; > + unsigned long base_pfn = page_to_pfn(page); > + > + /* > + * with 2M vmemmap mmaping we can have things setup > + * such that even though atlmap is specified we never > + 
> +		 * used altmap.
> +		 */
> +		alt_start = altmap->base_pfn;
> +		alt_end = altmap->base_pfn + altmap->reserve +
> +			altmap->free + altmap->alloc + altmap->align;
> +
> +		if (base_pfn >= alt_start && base_pfn < alt_end) {
> +			vmem_altmap_free(altmap, nr_pages);
> +			return;
> +		}
> +	}
> +

Please take this diff on top of this patch when adding this series to -mm.

commit 613569d9517be60611a86bf4b9821b150c4c4954
Author: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
Date:   Mon Jul 24 22:49:29 2023 +0530

    powerpc/mm/altmap: Fix altmap boundary check

    altmap->free includes the entire free space from which altmap blocks
    can be allocated. So when checking whether the kernel is doing altmap
    block free, compute the boundary correctly.

    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 7761c2e93bff..ed63c2953b54 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -766,8 +766,7 @@ static void __meminit free_vmemmap_pages(struct page *page,
 		 * used altmap.
 		 */
 		alt_start = altmap->base_pfn;
-		alt_end = altmap->base_pfn + altmap->reserve +
-			altmap->free + altmap->alloc + altmap->align;
+		alt_end = altmap->base_pfn + altmap->reserve + altmap->free;
 
 		if (base_pfn >= alt_start && base_pfn < alt_end) {
 			vmem_altmap_free(altmap, nr_pages);
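
To make the boundary change concrete, here is a small standalone C model
(not kernel code; the struct below only mirrors the field names of
struct vmem_altmap, and the numbers are invented for illustration). The
commit message says altmap->free already covers the entire space from
which altmap blocks can be allocated, so, as I read it, also adding alloc
and align pushes alt_end past the real end of the device range and the
base_pfn check can then treat pages beyond the altmap as altmap-backed:

#include <stdio.h>

/* Illustrative stand-in for struct vmem_altmap; values are made up. */
struct altmap_model {
	unsigned long base_pfn;	/* first pfn of the device range */
	unsigned long reserve;	/* pfns reserved at the start */
	unsigned long free;	/* entire allocatable space, in pfns */
	unsigned long alloc;	/* pfns already handed out */
	unsigned long align;	/* pfns consumed for alignment */
};

int main(void)
{
	struct altmap_model a = {
		.base_pfn = 0x100000, .reserve = 128,
		.free = 32768, .alloc = 16384, .align = 64,
	};
	unsigned long old_end = a.base_pfn + a.reserve + a.free + a.alloc + a.align;
	unsigned long new_end = a.base_pfn + a.reserve + a.free;
	unsigned long pfn = new_end + 100;	/* just past the device range */

	printf("old alt_end=%#lx, new alt_end=%#lx\n", old_end, new_end);
	printf("pfn %#lx altmap-backed? old=%d new=%d\n", pfn,
	       pfn >= a.base_pfn && pfn < old_end,
	       pfn >= a.base_pfn && pfn < new_end);
	return 0;
}

With these numbers the old formula wrongly reports the out-of-range pfn as
altmap-backed, which is exactly the kind of page free_vmemmap_pages() must
not hand to vmem_altmap_free().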
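
For reference, the mapping policy in the cover letter (rules 1-3) boils
down to a per-2M decision. Below is a minimal userspace sketch of that
decision only; it is not the patch's radix__vmemmap_populate() (whose body
is not quoted above), and the helper names, constants and loop structure
are mine:

#include <stdbool.h>
#include <stdio.h>

#define MY_PAGE_SIZE	(64UL << 10)	/* 64K base pages, as with radix + 64K */
#define MY_PMD_SIZE	(2UL << 20)	/* 2M PMD mappings */

/* Invented stand-ins for the real checks used by the patch. */
static bool crosses_altmap_boundary(unsigned long addr)
{
	return addr == (6UL << 20);	/* pretend one chunk straddles the altmap end */
}

static bool can_alloc_2m_backing(unsigned long addr)
{
	return true;			/* pretend the 2M backing allocation succeeds */
}

int main(void)
{
	unsigned long start = 0, end = 16UL << 20;

	for (unsigned long addr = start; addr < end; addr += MY_PMD_SIZE) {
		/* Rule 1: prefer a 2M PMD mapping for the vmemmap. */
		if (!crosses_altmap_boundary(addr) && can_alloc_2m_backing(addr)) {
			printf("%#010lx: PMD (2M) mapping\n", addr);
			continue;
		}
		/* Rules 2 and 3: fall back to PAGE_SIZE mappings for this chunk. */
		for (unsigned long a = addr; a < addr + MY_PMD_SIZE; a += MY_PAGE_SIZE)
			printf("%#010lx: PAGE_SIZE mapping\n", a);
	}
	return 0;
}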