Hi
We observed a problem with numa_node_to_cpus(). This API converts a node number
to a bitmask of CPUs. The user must pass a long enough buffer. If the buffer is
not long enough, errno is set to ERANGE and -1 is returned. On success 0 is returned.
This API was changed in numa version 2.0 and now has a new implementation (_v2).
Analysis:
Within the numa_node_to_cpus code there is a check that the size of the buffer
passed by the user matches the one returned by sched_getaffinity. This check
fails, and hence we see "map size mismatch; abort" messages on the console.
My system has 4 nodes and 8 CPUs.
------------------------------------------------------------------------------------
Testcase to reproduce the problem
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <numa.h>

typedef unsigned long BUF[64];

int numa_exit_on_error = 0;

void node_to_cpus(void)
{
	int i;
	BUF cpubuf;
	int maxnode = numa_max_node();

	printf("available: %d nodes (0-%d)\n", 1+maxnode, maxnode);
	for (i = 0; i <= maxnode; i++) {
		printf("Calling numa_node_to_cpus()\n");
		/* sizeof yields a size_t, so print it with %zu */
		printf("Size of BUF is : %zu \n", sizeof(BUF));
		if (numa_node_to_cpus(i, cpubuf, sizeof(BUF)) != 0) {
			numa_error("numa_node_to_cpu 0");
			numa_exit_on_error = 1;
			exit(numa_exit_on_error);
		}
		/* Call again with the same node and buffer; this second call is
		 * the one that reports the size mismatch. */
		printf("Calling numa_node_to_cpus() again \n");
		if (numa_node_to_cpus(i, cpubuf, sizeof(BUF)) != 0) {
			printf("Got < 0 \n");
			numa_error("numa_node_to_cpu");
			numa_exit_on_error = 1;
			exit(numa_exit_on_error);
		}
	}
}

int main(void)
{
	if (numa_available() < 0) {
		printf("This system does not support NUMA policy\n");
		numa_error("numa_available");
		numa_exit_on_error = 1;
		exit(numa_exit_on_error);
	}
	node_to_cpus();
	return numa_exit_on_error;
}
------------------------------------------------------------------------------------
Problem Fix:
The fix is to make numa_node_to_cpus_v2() fail only when the supplied buffer is
smaller than the mask already constructed for that node, instead of failing on
any size difference.
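
To illustrate the difference between the two comparisons, here is a minimal standalone sketch. It does not use libnuma: struct mask_model and both helper functions are hypothetical names made up for this example, and only the size field of the real mask is modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the mask structure; only the size matters here. */
struct mask_model {
	unsigned long size;	/* number of bits the mask can hold */
};

/* Comparison before the patch: any size difference is rejected,
 * including a caller buffer that is larger than needed. */
static bool strict_size_ok(const struct mask_model *buffer,
			   const struct mask_model *cached)
{
	return buffer->size == cached->size;
}

/* Comparison after the patch: only a buffer that is too small to
 * hold the cached mask is rejected. */
static bool relaxed_size_ok(const struct mask_model *buffer,
			    const struct mask_model *cached)
{
	return buffer->size >= cached->size;
}
```

With a caller buffer of, say, 4096 bits and a cached mask of 1024 bits (sizes picked only for illustration), strict_size_ok() rejects the call while relaxed_size_ok() accepts it. A 64-bit buffer is rejected by both.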
I am attaching a patch that addresses this issue. It is generated against numactl-2.0.4-rc1.
Regards
Yeehaw
Index: numactl-2.0.4-rc1/libnuma.c
===================================================================
--- numactl-2.0.4-rc1.orig/libnuma.c 2009-12-16 02:48:26.000000000 +0530
+++ numactl-2.0.4-rc1/libnuma.c 2010-01-27 17:06:30.000000000 +0530
@@ -1272,7 +1272,7 @@
if (node_cpu_mask_v2[node]) {
/* have already constructed a mask for this node */
- if (buffer->size != node_cpu_mask_v2[node]->size) {
+ if (buffer->size < node_cpu_mask_v2[node]->size) {
numa_error("map size mismatch; abort\n");
return -1;
}