Re: [PATCH] mm: Fix comment for NODEMASK_ALLOC

Michal Hocko <mhocko@xxxxxxxxxx> · Tue, 21 Aug 2018 14:17:34 +0200

On Mon 20-08-18 14:24:40, Andrew Morton wrote:
> On Mon, 20 Aug 2018 10:55:16 +0200 Oscar Salvador <osalvador@xxxxxxxxxxxxxxxxxx> wrote:
> 
> > From: Oscar Salvador <osalvador@xxxxxxx>
> > 
> > Currently, NODEMASK_ALLOC allocates a nodemask_t with kmalloc when
> > NODES_SHIFT is higher than 8, otherwise it declares it on the stack.
> > 
> > The comment says that the reasoning behind this is that nodemask_t will be
> > 256 bytes when NODES_SHIFT is higher than 8, but this is not true.
> > For example, NODES_SHIFT = 9 will give us a 64 bytes nodemask_t.
> > Let us fix up the comment for that.
> > 
> > Another thing is that it might make sense to let values lower than 128 bytes
> > be allocated on the stack.
> > Although this all depends on the depth of the stack
> > (and this changes from function to function), I think that 64 bytes
> > is something we can easily afford.
> > So we could even bump the limit by 1 (from > 8 to > 9).
> > 
> 
> I agree.  Such a change will reduce the amount of testing which the
> kmalloc version receives, but I assume there are enough people out
> there testing with large NODES_SHIFT values.

We have had CONFIG_NODES_SHIFT=10 in our SLES kernels for quite some
time (since around SLE11-SP3 AFAICS).

Anyway, isn't NODEMASK_ALLOC a bit over-engineered? Does anybody actually
even use more than 1024 NUMA nodes? 1024 nodes would be 128B and from a
quick glance it seems that none of those functions are called in deep
stacks. I haven't gone through all of them but a patch which checks them
all and removes NODEMASK_ALLOC would be quite nice IMHO.
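
For reference, the size arithmetic behind the 64B and 128B figures can be
sketched in plain userspace C. This is only an illustration: it assumes
nodemask_t is a bitmap with one bit per possible node (1 << NODES_SHIFT
of them), rounded up to whole unsigned longs. nodemask_bytes() and
SKETCH_BITS_PER_LONG are made-up names for this sketch, not kernel symbols.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG (assumption). */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: bytes taken by a bitmap holding one bit per node
 * for a given NODES_SHIFT, rounded up to whole unsigned longs.
 */
static inline size_t nodemask_bytes(unsigned int nodes_shift)
{
	size_t max_nodes = (size_t)1 << nodes_shift;
	size_t longs = (max_nodes + SKETCH_BITS_PER_LONG - 1) /
		       SKETCH_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}

/* Checks the figures quoted in this thread. */
static inline void nodemask_bytes_selfcheck(void)
{
	assert(nodemask_bytes(9) == 64);   /* 512 nodes */
	assert(nodemask_bytes(10) == 128); /* 1024 nodes */
	assert(nodemask_bytes(11) == 256); /* 2048 nodes */
}
```

So the 256B mentioned in the old comment would only be reached at
NODES_SHIFT = 11, not at anything higher than 8.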

-- 
Michal Hocko
SUSE Labs