On Thu, Apr 24, 2008 at 04:09:12PM -0500, Cliff Wickman wrote:
> Hi Neil,
>
> Thanks for the fix.
> I see that it applies to numactl-1.0.2
>
> This is handled quite differently in the 2.0.0 release candidate.
> http://oss.sgi.com/projects/libnuma/
> ftp://oss.sgi.com/www/projects/libnuma/download
>
> The 2.0.0 version uses variable-length bit masks for cpus and
> instead of nodemask_t for nodes.
>
> Care to test whether the new version passes your test case?
>
> -Cliff

Following up on my previous post, I tracked down the bug involving the
Cpus_allowed field reported by proc_pid_status. It turns out it's not a
kernel bug after all, but rather a subtlety in what the field reports.

As one might expect, Cpus_allowed reports the cpus on which the associated
task is permitted to run. However, there are cases in which a task is allowed
to run on all available cpus, and in some of these cases the kernel sets
Cpus_allowed to CPU_MASK_ALL, which boils down to a bitmask of all 1's,
NR_CPUS bits long. Since NR_CPUS is statically defined in SMP kernels as the
maximum number of cpus the kernel can manage, which is typically a large
number, it's possible for the mask in /proc/<pid>/status[Cpus_allowed] to be
a superset of the cpus actually available. This can lead to
sched_setaffinity returning EINVAL even if your physcpubind parsing completed
successfully.

This patch should correct the problem by limiting the Cpus_allowed mask to
the number of bits implied by sysconf(_SC_NPROCESSORS_CONF). Tested
successfully by me.
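
To illustrate the masking idea outside of libnuma, here is a minimal sketch in C. The helper names (low_bits_mask, clamp_allowed_mask, clamp_to_configured) are hypothetical and are not part of libnuma or the patch. It assumes the whole mask fits in a single unsigned long, which is a simplification; a machine with more cpus than that needs a variable-length mask. The sketch also saturates the shift, since on a platform with 32-bit int an expression like (1<<n)-1 is undefined once n reaches 31.

```c
#include <limits.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libnuma): return a mask with the low
 * ncpus bits set. The result saturates at the width of unsigned long so
 * the shift count is always smaller than the type width.
 */
unsigned long low_bits_mask(long ncpus)
{
	const long width = (long)(sizeof(unsigned long) * CHAR_BIT);

	if (ncpus <= 0)
		return 0UL;		/* sysconf() failed or reported no cpus */
	if (ncpus >= width)
		return ~0UL;		/* every bit we can represent is valid */
	return (1UL << ncpus) - 1UL;
}

/*
 * Hypothetical helper: drop the bits of a Cpus_allowed-style mask that
 * lie beyond the given cpu count.
 */
unsigned long clamp_allowed_mask(unsigned long allowed, long ncpus)
{
	return allowed & low_bits_mask(ncpus);
}

/*
 * Clamp against the number of cpus the system reports as configured.
 */
unsigned long clamp_to_configured(unsigned long allowed)
{
	return clamp_allowed_mask(allowed, sysconf(_SC_NPROCESSORS_CONF));
}
```

For example, clamp_allowed_mask(~0UL, 4) yields 0xf: an all-ones mask of the CPU_MASK_ALL kind is reduced to the four cpus that exist.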

Regards
Neil

Signed-off-by: Neil Horman <nhorman@xxxxxxxxxxxxx>

 libnuma.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff -up numactl-2.0.0-rc1/libnuma.c.orig numactl-2.0.0-rc1/libnuma.c
--- numactl-2.0.0-rc1/libnuma.c.orig	2008-04-25 09:33:01.000000000 -0400
+++ numactl-2.0.0-rc1/libnuma.c	2008-04-25 09:52:19.000000000 -0400
@@ -450,6 +450,7 @@ set_thread_constraints(void)
 	int buflen;
 	char *buffer;
 	FILE *f;
+	int ncpumask = (1<<(sysconf(_SC_NPROCESSORS_CONF)))-1;
 	/*
 	 * The maximum line size consists of the string at the beginning plus
 	 * a digit for each 4 cpus and a comma for each 64 cpus.
@@ -480,10 +481,22 @@ set_thread_constraints(void)
 	fclose(f);
 	free (buffer);
 
-	if (maxprocnode < 0) {
+	/*
+	 * Cpus_allowed in the kernel can be defined to all f's
+	 * i.e. it may be a superset of the actual available processors
+	 * as such let's reduce maxproccpu with a mask of the actual
+	 * available cpus
+	 */
+	maxproccpu &= ncpumask;
+
+	/*
+	 * Sanity checks
+	 */
+	if (maxproccpu == 0)
+		numa_warn(W_cpumap, "Available cpus are empty set");
+
+	if (maxprocnode < 0)
 		numa_warn(W_cpumap, "Cannot parse %s", mask_size_file);
-		return;
-	}
 
 	return;
 }
-- 
/****************************************************
 * Neil Horman <nhorman@xxxxxxxxxxxxx>
 * Software Engineer, Red Hat
 ****************************************************/