On Fri, 25 Oct 2024, Peter Zijlstra wrote:

> Extend the futex2 interface to be numa aware.
>
> When FUTEX2_NUMA is specified for a futex, the user value is extended
> to two words (of the same size). The first is the user value we all
> know, the second one will be the node to place this futex on.
>
>   struct futex_numa_32 {
>         u32 val;
>         u32 node;
>   };
>
> When node is set to ~0, WAIT will set it to the current node_id such
> that WAKE knows where to find it. If userspace corrupts the node value
> between WAIT and WAKE, the futex will not be found and no wakeup will
> happen.
>
> When FUTEX2_NUMA is not set, the node is simply an extension of the
> hash, such that traditional futexes are still interleaved over the
> nodes.

Would it be possible to follow the NUMA memory policy set up for a task
when making these decisions? We may not need a separate FUTEX2_NUMA
option. There are support functions in mm/mempolicy.c that will yield
a node for the futex logic to use.

See e.g. include/uapi/linux/mempolicy.h for the types of memory policy
that can be set for a task in current->mempolicy:

MPOL_DEFAULT		get local memory / use system default policy
MPOL_INTERLEAVE		interleaved over nodes
MPOL_BIND		use the node(s) specified in the task policy
MPOL_LOCAL		get local memory

etc.

You will get a page or objects on the correct node by calling
alloc_pages() or kmalloc() without __GFP_THISNODE.

If you just need the node, call mempolicy_slab_node() and assign the
result to the node of the futex. The function determines which node to
use depending on the active memory policy.
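To make the suggestion concrete, here is a rough userspace toy model of the idea: when the futex node is still ~0, fill it in from the task's memory policy instead of always using the current node. This is not kernel code. Every toy_* / TOY_* name and FUTEX_NO_NODE is made up for illustration, toy_policy_node() only stands in for what mempolicy_slab_node() would decide, and MPOL_BIND is simplified to a single node even though the real policy carries a nodemask.

```c
#include <assert.h>
#include <stdint.h>

/* The "~0" sentinel from the quoted patch; the macro name is hypothetical. */
#define FUTEX_NO_NODE UINT32_MAX

/* Toy stand-ins for the MPOL_* modes, not the kernel's definitions. */
enum toy_policy_mode {
	TOY_MPOL_DEFAULT,
	TOY_MPOL_INTERLEAVE,
	TOY_MPOL_BIND,
	TOY_MPOL_LOCAL,
};

/* Hypothetical, heavily simplified view of a task and its policy. */
struct toy_task {
	enum toy_policy_mode mode;
	uint32_t cur_node;	/* node the task currently runs on */
	uint32_t bind_node;	/* single node used by TOY_MPOL_BIND */
	uint32_t nr_nodes;	/* number of nodes to interleave over */
	uint32_t il_next;	/* next node handed out by interleave */
};

/* Layout taken from the quoted patch description. */
struct futex_numa_32 {
	uint32_t val;
	uint32_t node;
};

/* Pick a node according to the task policy (toy model only). */
static uint32_t toy_policy_node(struct toy_task *t)
{
	switch (t->mode) {
	case TOY_MPOL_INTERLEAVE: {
		uint32_t node = t->il_next;

		assert(t->nr_nodes > 0);
		/* Round-robin over the nodes. */
		t->il_next = (t->il_next + 1) % t->nr_nodes;
		return node;
	}
	case TOY_MPOL_BIND:
		return t->bind_node;
	case TOY_MPOL_DEFAULT:
	case TOY_MPOL_LOCAL:
	default:
		return t->cur_node;
	}
}

/*
 * WAIT side: if userspace left node at ~0, record the policy-chosen
 * node so that a later WAKE looks at the same place. A node that is
 * already set is left untouched.
 */
static uint32_t toy_futex_wait_node(struct toy_task *t,
				    struct futex_numa_32 *f)
{
	if (f->node == FUTEX_NO_NODE)
		f->node = toy_policy_node(t);
	return f->node;
}
```

In this model a task with TOY_MPOL_BIND and bind_node = 2 that waits on a futex with node == ~0 ends up with node == 2 written back, and a later call with a different policy leaves that value alone. Whether the real policy code can be called cheaply enough from the futex path is a separate question that this sketch does not address.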