Extend the rseq ABI to expose NUMA node ID, mm_cid, and mm_numa_cid
fields.

The NUMA node ID field allows implementing a faster getcpu(2) in libc.

The per-memory-map concurrency id (mm_cid) [1] allows ideal scaling
(down or up) of user-space per-cpu data structures. The concurrency ids
allocated within a memory map are tracked by the scheduler, which takes
into account the number of concurrently running threads, thus implicitly
considering the number of threads, the cpu affinity, the cpusets
applying to those threads, and the number of logical cores on the
system.

The NUMA-aware concurrency id (mm_numa_cid) is similar to the mm_cid,
except that it keeps track of the NUMA node ids with which each cid has
been associated. On NUMA systems, once user-space observes a NUMA-aware
concurrency ID to be associated with a NUMA node, that ID is guaranteed
never to change NUMA node unless a kernel-level NUMA configuration
change happens. This is useful for NUMA-aware per-cpu data structures
running in environments where a process, or a set of processes belonging
to a cpuset, is pinned to a set of cores which belong to a subset of the
system's NUMA nodes.

This series is based on tip/sched/core commit 52b33d87b9197
("sched/psi: Use task->psi_flags to clear in CPU migration").

Thanks,

Mathieu

[1] mm_cid was previously known as vcpu_id in earlier versions of this
    patch set.
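For illustration only, here is a minimal sketch of how user-space might
check which of the extended fields the running kernel exposes, using the
feature size auxiliary vector entry introduced by this series. The
following are assumptions not stated in this cover letter: the value 27
for AT_RSEQ_FEATURE_SIZE (used only if the system headers do not define
it), the 20-byte feature size of the original ABI, and the field order in
struct rseq_layout_sketch, which is a hypothetical mirror inferred from
the patch order rather than the uapi definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>

/* Assumed fallback; expected to match include/uapi/linux/auxvec.h. */
#ifndef AT_RSEQ_FEATURE_SIZE
#define AT_RSEQ_FEATURE_SIZE 27
#endif

/* Assumed number of bytes covered by the fields of the original rseq ABI. */
#define ORIG_RSEQ_FEATURE_SIZE 20

/*
 * Hypothetical mirror of the extended layout, used only to compute field
 * offsets in this sketch. The real definition is in include/uapi/linux/rseq.h.
 */
struct rseq_layout_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;	/* assumed first extension field */
	uint32_t mm_cid;
	uint32_t mm_numa_cid;
};

/* The first extension field is assumed to start where the original ABI ends. */
static_assert(offsetof(struct rseq_layout_sketch, node_id) ==
	      ORIG_RSEQ_FEATURE_SIZE, "unexpected sketch layout");

/* Offset of the first byte past a given member of the sketch layout. */
#define field_end(member)					\
	(offsetof(struct rseq_layout_sketch, member) +		\
	 sizeof(((struct rseq_layout_sketch *)0)->member))

/*
 * Feature size reported by the kernel. getauxval() returns 0 when the
 * entry is absent, in which case only the original fields are assumed.
 */
static inline size_t rseq_feature_size(void)
{
	unsigned long sz = getauxval(AT_RSEQ_FEATURE_SIZE);

	return sz ? (size_t)sz : ORIG_RSEQ_FEATURE_SIZE;
}

/* A field is usable only if it lies entirely within the feature size. */
static inline int field_available(size_t feature_size, size_t end_offset)
{
	return feature_size >= end_offset;
}
```

A caller would typically evaluate
field_available(rseq_feature_size(), field_end(mm_cid)) once at startup
and fall back to the cpu_id field, or to getcpu(2), when the check fails.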
Mathieu Desnoyers (30):
  selftests/rseq: Fix: Fail thread registration when CONFIG_RSEQ=n
  rseq: Introduce feature size and alignment ELF auxiliary vector entries
  rseq: Introduce extensible rseq ABI
  rseq: Extend struct rseq with numa node id
  selftests/rseq: Use ELF auxiliary vector for extensible rseq
  selftests/rseq: Implement rseq numa node id field selftest
  sched: Introduce per-memory-map concurrency ID
  rseq: Extend struct rseq with per-memory-map concurrency ID
  selftests/rseq: Remove RSEQ_SKIP_FASTPATH code
  selftests/rseq: Implement rseq mm_cid field support
  selftests/rseq: x86: Template memory ordering and percpu access mode
  selftests/rseq: arm: Template memory ordering and percpu access mode
  selftests/rseq: arm64: Template memory ordering and percpu access mode
  selftests/rseq: mips: Template memory ordering and percpu access mode
  selftests/rseq: ppc: Template memory ordering and percpu access mode
  selftests/rseq: s390: Template memory ordering and percpu access mode
  selftests/rseq: riscv: Template memory ordering and percpu access mode
  selftests/rseq: Implement basic percpu ops mm_cid test
  selftests/rseq: Implement parametrized mm_cid test
  selftests/rseq: parametrized test: Report/abort on negative concurrency ID
  tracing/rseq: Add mm_cid field to rseq_update
  lib: Implement find_{first,next,nth}_notandnot_bit, find_first_andnot_bit
  cpumask: Implement cpumask_{first,next}_{not,}andnot
  sched: NUMA-aware per-memory-map concurrency ID
  rseq: Extend struct rseq with per-memory-map NUMA-aware Concurrency ID
  selftests/rseq: x86: Implement rseq_load_u32_u32
  selftests/rseq: Implement mm_numa_cid accessors in headers
  selftests/rseq: Implement numa node id vs mm_numa_cid invariant test
  selftests/rseq: Implement mm_numa_cid tests
  tracing/rseq: Add mm_numa_cid field to rseq_update

 fs/binfmt_elf.c                               |    5 +
 fs/exec.c                                     |    4 +
 include/linux/cpumask.h                       |   60 +
 include/linux/find.h                          |  123 +-
 include/linux/mm.h                            |   43 +
 include/linux/mm_types.h                      |  109 +-
 include/linux/sched.h                         |   12 +
 include/trace/events/rseq.h                   |    9 +-
 include/uapi/linux/auxvec.h                   |    2 +
 include/uapi/linux/rseq.h                     |   31 +
 init/Kconfig                                  |    4 +
 kernel/fork.c                                 |   11 +-
 kernel/ptrace.c                               |    2 +-
 kernel/rseq.c                                 |   73 +-
 kernel/sched/core.c                           |   49 +
 kernel/sched/sched.h                          |  192 +++
 kernel/signal.c                               |    2 +
 lib/find_bit.c                                |   42 +
 tools/testing/selftests/rseq/.gitignore       |    9 +
 tools/testing/selftests/rseq/Makefile         |   34 +-
 .../testing/selftests/rseq/basic_numa_test.c  |  117 ++
 .../selftests/rseq/basic_percpu_ops_test.c    |   58 +-
 tools/testing/selftests/rseq/basic_test.c     |    4 +
 tools/testing/selftests/rseq/compiler.h       |    6 +
 tools/testing/selftests/rseq/param_test.c     |  181 ++-
 tools/testing/selftests/rseq/rseq-abi.h       |   31 +
 tools/testing/selftests/rseq/rseq-arm-bits.h  |  505 +++++++
 tools/testing/selftests/rseq/rseq-arm.h       |  707 +---------
 .../testing/selftests/rseq/rseq-arm64-bits.h  |  392 ++++++
 tools/testing/selftests/rseq/rseq-arm64.h     |  532 +-------
 .../testing/selftests/rseq/rseq-bits-reset.h  |   11 +
 .../selftests/rseq/rseq-bits-template.h       |   51 +
 tools/testing/selftests/rseq/rseq-mips-bits.h |  462 +++++++
 tools/testing/selftests/rseq/rseq-mips.h      |  652 +--------
 tools/testing/selftests/rseq/rseq-ppc-bits.h  |  454 +++++++
 tools/testing/selftests/rseq/rseq-ppc.h       |  629 +--------
 .../testing/selftests/rseq/rseq-riscv-bits.h  |  410 ++++++
 tools/testing/selftests/rseq/rseq-riscv.h     |  541 +-------
 tools/testing/selftests/rseq/rseq-s390-bits.h |  474 +++++++
 tools/testing/selftests/rseq/rseq-s390.h      |  501 +------
 tools/testing/selftests/rseq/rseq-skip.h      |   65 -
 tools/testing/selftests/rseq/rseq-x86-bits.h  | 1036 ++++++++++++++
 tools/testing/selftests/rseq/rseq-x86.h       | 1204 +----------------
 tools/testing/selftests/rseq/rseq.c           |   91 +-
 tools/testing/selftests/rseq/rseq.h           |  258 +++-
 .../testing/selftests/rseq/run_param_test.sh  |    5 +
 46 files changed, 5532 insertions(+), 4661 deletions(-)
 create mode 100644 tools/testing/selftests/rseq/basic_numa_test.c
 create mode 100644 tools/testing/selftests/rseq/rseq-arm-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-arm64-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-bits-reset.h
 create mode 100644 tools/testing/selftests/rseq/rseq-bits-template.h
 create mode 100644 tools/testing/selftests/rseq/rseq-mips-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-ppc-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-riscv-bits.h
 create mode 100644 tools/testing/selftests/rseq/rseq-s390-bits.h
 delete mode 100644 tools/testing/selftests/rseq/rseq-skip.h
 create mode 100644 tools/testing/selftests/rseq/rseq-x86-bits.h

--
2.25.1