Re: [RFC PATCH] mm/mempolicy: add MPOL_PREFERRED_STRICT memory policy

On 10/14/21 15:08, Michal Hocko wrote:
On Thu 14-10-21 15:00:22, Aneesh Kumar K.V wrote:
Michal Hocko <mhocko@xxxxxxxx> writes:

On Wed 13-10-21 18:53:55, Aneesh Kumar K.V wrote:
On 10/13/21 18:46, Andi Kleen wrote:

The difference from MPOL_BIND is the ability to specify a preferred node,
which is the first node in the nodemask argument passed.

That's always the one with the lowest number. Isn't that quite limiting
in practice?
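
To make the limitation concrete, here is a minimal userspace sketch (not kernel code) of why "the first node in the nodemask" can only ever be the lowest-numbered node: a nodemask is a bitmap, so the order in which nodes were added is not recorded. The names `struct nodemask`, `nodemask_set()`, `nodemask_first()` and the `MAX_NODES` value are made up for this illustration and are not the kernel's own helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_NODES	128	/* illustrative size, not a kernel constant */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS	((MAX_NODES + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for the bitmap that userspace passes as nmask. */
struct nodemask {
	unsigned long bits[MASK_LONGS];
};

/* Mark a node as present; the bitmap keeps no record of insertion order. */
static void nodemask_set(struct nodemask *m, unsigned int node)
{
	assert(node < MAX_NODES);
	m->bits[node / BITS_PER_LONG] |= 1UL << (node % BITS_PER_LONG);
}

/* Return the lowest-numbered node in the mask, or -1 if the mask is empty. */
static int nodemask_first(const struct nodemask *m)
{
	for (unsigned int node = 0; node < MAX_NODES; node++)
		if (m->bits[node / BITS_PER_LONG] &
		    (1UL << (node % BITS_PER_LONG)))
			return (int)node;
	return -1;
}
```

Setting nodes 3 then 1 yields the same mask as setting 1 then 3, so `nodemask_first()` reports node 1 in both cases. A caller who wants node 3 preferred with node 1 as the fallback cannot express that through the mask alone.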

It seems if you really want to do that you would need another argument.

Yes. But that would make it a new syscall. Should we do that?

Yes, I do not see any reasonable way to cram this into the existing syscall.
I am not yet sure what the syscall should look like though. I can see
two use cases: one is a very specific node allocation fallback
order requirement, and the other is a preference for a CPU-less node
over other nodes. The two are slightly different.

How about

SYSCALL_DEFINE5(preferred_mbind, unsigned long, start, unsigned long, len,
		unsigned long, preferred_node, const unsigned long __user *, nmask,
		unsigned long, maxnode)
{
	return kernel_mbind(start, len, MPOL_PREFERRED_STRICT, preferred_node,
			    nmask, maxnode, 0);
}

Semantics? How does it interact with MPOL_PREFERRED_MANY, MPOL_BIND and
the other policies?


This allows specifying a new memory policy for the VA range. We are forced to use a new syscall because of a limitation of the current mbind(2) syscall: mbind() already takes six arguments, so there is no room for a separate preferred-node argument. We could add a generic sys_mbind2(), but I was not sure whether we need to make it that complex.

Besides that, it would be really great to finish the discussion about the
use case before suggesting a new userspace API.


The application would like to hint a preferred node for allocating the memory backing a VA range and, at the same time, avoid fallback to some set of nodes (in the use case I am interested in, it should not fall back to slow memory nodes).
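
As a rough model of the behaviour described above, the following userspace sketch picks a node for one allocation: try the preferred node first, then fall back only to nodes in the allowed mask, and fail rather than use any other node. This is only an assumption about the intended semantics, not the patch's implementation. `pick_node()`, `NR_NODES` and the `has_free` array are hypothetical, the preferred node is assumed to be usable whether or not it is in the mask, and the fallback order is simplified to ascending node number.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES 8	/* illustrative node count, not a kernel constant */

/*
 * Hypothetical model of a "preferred, but strict" policy.
 * Returns the node to allocate from, or -1 if neither the preferred node
 * nor any node in the allowed mask has free memory.
 */
static int pick_node(unsigned int preferred, unsigned long allowed,
		     const bool has_free[NR_NODES])
{
	assert(preferred < NR_NODES);

	/* First choice: the explicitly preferred node. */
	if (has_free[preferred])
		return (int)preferred;

	/* Fallback: only nodes named in the allowed mask (ascending order
	 * here is an assumption made for simplicity). */
	for (unsigned int node = 0; node < NR_NODES; node++)
		if ((allowed & (1UL << node)) && has_free[node])
			return (int)node;

	/* Strict: never spill onto nodes outside the mask. */
	return -1;
}
```

With slow memory on node 5 and an allowed mask of nodes 1 and 2, an allocation preferring node 1 falls back to node 2 when node 1 is full, and fails outright when both are full, even if node 5 still has free memory.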


-aneesh



