On Wed 18-03-20 10:55:29, Roman Gushchin wrote:
> On Wed, Mar 18, 2020 at 05:16:25PM +0100, Michal Hocko wrote:
> > On Wed 18-03-20 08:34:24, Roman Gushchin wrote:
> > > If CONFIG_NUMA isn't set, there is no need to ensure that
> > > the hugetlb cma area belongs to a specific numa node.
> > >
> > > min/max_low_pfn can be used for limiting the maximum size
> > > of the hugetlb_cma area.
> > >
> > > Also for_each_mem_pfn_range() is defined only if
> > > CONFIG_HAVE_MEMBLOCK_NODE_MAP is set, and on arm (unlike most
> > > other architectures) it depends on CONFIG_NUMA. This makes the
> > > build fail if CONFIG_NUMA isn't set.
> >
> > CONFIG_HAVE_MEMBLOCK_NODE_MAP has popped out as a problem several times
> > already. Is there any real reason we cannot make it unconditional?
> > Essentially make the functionality always enabled and drop the config?
>
> It depends on CONFIG_NUMA only on arm, and I really don't know
> if there is a good justification for it. If not, that will be a much
> simpler fix.

I have checked the history and the dependency has been there since NUMA
was introduced on arm64. So it would be great to double check with the
arch maintainers.

> > The code below is ugly as hell. Just look at it. You have
> > for_each_node_state without any ifdefery but then have an ifdef
> > CONFIG_NUMA inside. That just doesn't make any sense.
>
> I wouldn't say it makes no sense:
> it tries to reserve a cma area on each node (hence for_each_node_state()),
> and it uses for_each_mem_pfn_range() to get the min and max pfn
> for each node. With !CONFIG_NUMA the first part is reduced to one
> iteration and the second part is not required at all.

Sure, the resulting code logic makes sense. I meant that it doesn't make
much sense wrt readability. There is a loop over all existing numa nodes
with an ifdef for NUMA inside the loop. See?

> I agree that for_each_mem_pfn_range() here looks quite ugly, but I don't know
> of a better way to get min/max pfns for a node so early in the boot process.
> If somebody has any ideas here, I'll appreciate a lot.

The loop is ok. Maybe we have some other memblock API that would be
better, but I am not aware of one off the top of my head. I would stick
with it. It just sucks that this API depends on HAVE_MEMBLOCK_NODE_MAP
and that it is not generally available. This is what I am complaining
about. Just look at the kind of dirty hack it made you create ;)

> I know Rik plans some further improvements here, so the goal for now
> is to fix the build. If you think that enabling CONFIG_HAVE_MEMBLOCK_NODE_MAP
> unconditionally is a way to go, I'm fine with it too.

This is not the first time HAVE_MEMBLOCK_NODE_MAP has been problematic.
I might be missing something, but I really do not see why we still need
it these days. As for !NUMA, I suspect we can make it generate the right
thing there as well.
-- 
Michal Hocko
SUSE Labs