On Tue, 21 Aug 2018 14:30:24 +0200 Oscar Salvador <osalvador@xxxxxxxxxxxxxxxxxx> wrote:

> On Tue, Aug 21, 2018 at 02:17:34PM +0200, Michal Hocko wrote:
> > We do have CONFIG_NODES_SHIFT=10 in our SLES kernels for quite some
> > time (around SLE11-SP3 AFAICS).
> >
> > Anyway, isn't NODES_ALLOC over engineered a bit? Does actually even do
> > larger than 1024 NUMA nodes? This would be 128B and from a quick glance
> > it seems that none of those functions are called in deep stacks. I
> > haven't gone through all of them but a patch which checks them all and
> > removes NODES_ALLOC would be quite nice IMHO.
>
> No, maximum we can get is 1024 NUMA nodes.
> I checked this when writing another patch [1], and since having gone
> through all archs Kconfigs, CONFIG_NODES_SHIFT=10 is the limit.
>
> NODEMASK_ALLOC gets only called from:
>
> - unregister_mem_sect_under_nodes() (not anymore after [1])
> - __nr_hugepages_store_common (This does not seem to have a deep stack, we could use a normal nodemask_t)
>
> But is also used for NODEMASK_SCRATCH (mainly used for mempolicy):
>
> struct nodemask_scratch {
> 	nodemask_t	mask1;
> 	nodemask_t	mask2;
> };
>
> that would make 256 bytes in case CONFIG_NODES_SHIFT=10.

And that sole site could use an open-coded kmalloc.
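
For reference, the 128 and 256 byte figures quoted above can be checked with a small userspace sketch. This is not kernel code and not a proposed patch: the nodemask_t below is a stand-in that simply assumes one bit per possible node, calloc() stands in for an open-coded kmalloc(), and scratch_alloc()/scratch_free() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Assumption: the upper limit discussed in the thread. */
#define NODES_SHIFT		10
#define MAX_NUMNODES		(1 << NODES_SHIFT)

#define BITS_PER_LONG		(CHAR_BIT * sizeof(unsigned long))
#define BITS_TO_LONGS(n)	(((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Userspace stand-in for nodemask_t: one bit per possible node. */
typedef struct {
	unsigned long bits[BITS_TO_LONGS(MAX_NUMNODES)];
} nodemask_t;

/* Same layout as the struct quoted in the thread. */
struct nodemask_scratch {
	nodemask_t	mask1;
	nodemask_t	mask2;
};

/* 1024 bits is 128 bytes per mask, so the scratch pair is 256 bytes. */
_Static_assert(sizeof(nodemask_t) == 128, "one mask is 128 bytes");
_Static_assert(sizeof(struct nodemask_scratch) == 256, "two masks are 256 bytes");

/*
 * Hypothetical helper: allocate the scratch area on the heap instead of
 * the stack. In the kernel this would be a plain kmalloc() at the one
 * call site; calloc() is only a userspace substitute here.
 */
static struct nodemask_scratch *scratch_alloc(void)
{
	return calloc(1, sizeof(struct nodemask_scratch));
}

static void scratch_free(struct nodemask_scratch *scratch)
{
	free(scratch);
}
```

The static assertions fail at compile time if the arithmetic is off, so building the file is already the check. A caller would pair scratch_alloc() with scratch_free() and handle a NULL return, which is roughly the shape an open-coded allocation at the single remaining site would take.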