On Mon, Jul 06, 2020 at 10:30:40PM +0000, Song Bao Hua (Barry Song) wrote:
> 
> > -----Original Message-----
> > From: Song Bao Hua (Barry Song)
> > Sent: Tuesday, July 7, 2020 10:12 AM
> > To: 'Roman Gushchin' <guro@xxxxxx>
> > Cc: akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; Mike
> > Kravetz <mike.kravetz@xxxxxxxxxx>; Jonathan Cameron
> > <jonathan.cameron@xxxxxxxxxx>
> > Subject: RE: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is
> > reserved
> > 
> > > -----Original Message-----
> > > From: Roman Gushchin [mailto:guro@xxxxxx]
> > > Sent: Tuesday, July 7, 2020 9:48 AM
> > > To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> > > Cc: akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> > > linux-kernel@xxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; Mike
> > > Kravetz <mike.kravetz@xxxxxxxxxx>; Jonathan Cameron
> > > <jonathan.cameron@xxxxxxxxxx>
> > > Subject: Re: [PATCH] mm/hugetlb: avoid hardcoding while checking if
> > > cma is reserved
> > > 
> > > On Mon, Jul 06, 2020 at 08:44:05PM +1200, Barry Song wrote:
> > > 
> > > Hello, Barry!
> > > 
> > > > hugetlb_cma[0] can be NULL due to various reasons, for example,
> > > > node0 has no memory. Thus, a NULL hugetlb_cma[0] doesn't necessarily
> > > > mean cma is not enabled. Gigantic pages might have been reserved on
> > > > other nodes.
> > > 
> > > Just curious, is it a real-life problem you've seen? If so, I wonder
> > > how you're using the hugetlb_cma option, and what's the outcome?
> > 
> > Yes. It is kind of stupid, but I once got a board on which node0 has no
> > DDR though node1 and node3 have memory.
> > 
> > I would actually prefer that we get the cma size per node by:
> >     cma size of one node = hugetlb_cma / (nodes with memory)
> > rather than:
> >     cma size of one node = hugetlb_cma / (all online nodes)
> > but unfortunately the N_MEMORY infrastructure is not ready yet at that
> > point.
> > I mean:
> > 
> >     for_each_node_state(nid, N_MEMORY) {
> >         int res;
> > 
> >         size = min(per_node, hugetlb_cma_size - reserved);
> >         size = round_up(size, PAGE_SIZE << order);
> > 
> >         res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order,
> >                                          0, false, "hugetlb",
> >                                          &hugetlb_cma[nid], nid);
> >         ...
> >     }
> 
> And for a server, there are many memory slots. The best config would be
> making every node have at least one DDR, but that isn't necessarily
> true; it is totally up to the users.
> 
> If we move hugetlb_cma_reserve() a bit later, we can probably make the
> hugetlb_cma size completely consistent by splitting it across nodes with
> memory rather than nodes which are online:
> 
> void __init bootmem_init(void)
> {
> 	...
> 
> 	arm64_numa_init();
> 
> 	/*
> 	 * must be done after arm64_numa_init() which calls numa_init() to
> 	 * initialize node_online_map that gets used in hugetlb_cma_reserve()
> 	 * while allocating required CMA size across online nodes.
> 	 */
> -#ifdef CONFIG_ARM64_4K_PAGES
> -	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
> -#endif
> 
> 	...
> 
> 	sparse_init();
> 	zone_sizes_init(min, max);
> 
> +#ifdef CONFIG_ARM64_4K_PAGES
> +	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
> +#endif
> 	memblock_dump_all();
> }
> 
> For x86, it could be done in a similar way. Do you think it is worth trying?

It sounds like a good idea to me! Thanks.