Re: [PATCH] mm: Fix comment for NODEMASK_ALLOC

On Tue, Aug 21, 2018 at 01:51:59PM -0700, Andrew Morton wrote:
> On Tue, 21 Aug 2018 14:30:24 +0200 Oscar Salvador <osalvador@xxxxxxxxxxxxxxxxxx> wrote:
> 
> > On Tue, Aug 21, 2018 at 02:17:34PM +0200, Michal Hocko wrote:
> > > We do have CONFIG_NODES_SHIFT=10 in our SLES kernels for quite some
> > > time (around SLE11-SP3 AFAICS).
> > > 
> > > Anyway, isn't NODES_ALLOC over engineered a bit? Does actually even do
> > > larger than 1024 NUMA nodes? This would be 128B and from a quick glance
> > > it seems that none of those functions are called in deep stacks. I
> > > haven't gone through all of them but a patch which checks them all and
> > > removes NODES_ALLOC would be quite nice IMHO.
> > 
> > No, maximum we can get is 1024 NUMA nodes.
> > I checked this when writing another patch [1], and since having gone
> > through all archs Kconfigs, CONFIG_NODES_SHIFT=10 is the limit.
> > 
> > NODEMASK_ALLOC gets only called from:
> > 
> > - unregister_mem_sect_under_nodes() (not anymore after [1])
> > - __nr_hugepages_store_common (This does not seem to have a deep stack, we could use a normal nodemask_t)
> > 
> > But is also used for NODEMASK_SCRATCH (mainly used for mempolicy):
> > 
> > struct nodemask_scratch {
> > 	nodemask_t	mask1;
> > 	nodemask_t	mask2;
> > };
> > 
> > that would make 256 bytes in case CONFIG_NODES_SHIFT=10.
> 
> And that sole site could use an open-coded kmalloc.

It is not really one single place, but four:

- do_set_mempolicy()
- do_mbind()
- kernel_migrate_pages()
- mpol_shared_policy_init()

They get called in:

- do_set_mempolicy()
	- From set_mempolicy syscall
	- From numa_policy_init()
	- From numa_default_policy()

	* None of the above look like they have a deep stack, so it should
	  be possible to get rid of NODEMASK_SCRATCH there.

- do_mbind()
	- From mbind syscall

	* Should be feasible here as well.

- kernel_migrate_pages()
	- From migrate_pages syscall

	* Again, this should be doable.

- mpol_shared_policy_init()
	- From hugetlbfs_alloc_inode()
	- From shmem_get_inode()

	* Seems doable for hugetlbfs_alloc_inode as well.
	  I only got to check hugetlbfs_alloc_inode, because shmem_get_inode


So it seems that this can be done in most places.
The only tricky function might be mpol_shared_policy_init(), because of
shmem_get_inode(). In that case, we could use an open-coded kmalloc there.
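
To make the sizes concrete, here is a rough userspace sketch of the two
options discussed above. It is only a model, not a patch: malloc()/free()
stand in for kmalloc()/kfree(), the types mimic the kernel's nodemask_t
layout, and the helper names (scratch_on_stack_bytes, scratch_alloc,
scratch_free) are made up for illustration.

```c
/*
 * Userspace model only: calloc()/free() stand in for kmalloc()/kfree().
 * Helper names are hypothetical and do not exist in the kernel.
 */
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define NODES_SHIFT		10			/* CONFIG_NODES_SHIFT upper limit */
#define MAX_NUMNODES		(1 << NODES_SHIFT)	/* 1024 NUMA nodes */
#define BITS_PER_LONG		(sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n)	(((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* One bit per possible node: 1024 bits = 128 bytes. */
typedef struct {
	unsigned long bits[BITS_TO_LONGS(MAX_NUMNODES)];
} nodemask_t;

/* Two masks: 256 bytes with NODES_SHIFT=10. */
struct nodemask_scratch {
	nodemask_t	mask1;
	nodemask_t	mask2;
};

/* Option 1: plain on-stack scratch, for callers with a shallow stack. */
static size_t scratch_on_stack_bytes(void)
{
	struct nodemask_scratch scratch;

	memset(&scratch, 0, sizeof(scratch));
	return sizeof(scratch);
}

/* Option 2: open-coded allocation, for the one caller with a deeper stack. */
static struct nodemask_scratch *scratch_alloc(void)
{
	/* May return NULL; the caller has to handle that. */
	return calloc(1, sizeof(struct nodemask_scratch));
}

static void scratch_free(struct nodemask_scratch *scratch)
{
	free(scratch);
}
```

With option 1 the caller pays 256 bytes of stack and has no failure path.
With option 2 the stack cost is one pointer, but every caller needs a NULL
check and a matching free.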

Thanks
-- 
Oscar Salvador
SUSE L3



