[PATCH] fix cpumask ncpus parameter handling

Hey-
	There's a bug in how numactl builds the mask when executing the
--physcpubind option.  The call to cpumask in main uses the variable ncpus as
the second parameter to cpumask, and treats it as a number of cpus when calling
numa_sched_setaffinity, as is evidenced by the fact that it attempts to convert
it to a byte count again with the CPU_BYTES macro.  The problem is that cpumask
already stores the size of the mask in bytes in the ncpus variable (rounded up
to the nearest long, since that's how it's allocated).  This just happens to
work for most systems, but we wind up passing in too short a size to
numa_sched_setaffinity if you try to bind to processors greater than 64:

Consider the following situation (assuming an x86_64 system):

sysconf(NR_CPUS) == 128
ncpus on return from cpumask = 128 / 4 = 32
CPU_BYTES(32) = 32/8 = 4

So 4 is the value passed as cpusetsize (which is expressed in bytes) to
sched_setaffinity.  Even though you will have enough allocated space to set
affinity on all 128 processors in cpumask, sched_setaffinity won't be able to
see any part of the mask beyond the first 32 bits because of this bug.  The
patch below corrects this by clarifying the meaning of that variable and
passing it consistently as a number of bytes.  I've tested it and observed it
to work well on a 64-way system.
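
For anyone who wants to check the arithmetic above, here is a rough sketch of
the double conversion.  It is not code from numactl: all of the function names
are made up for illustration, and cpus_to_bytes is a simplified divide-by-8
stand-in for CPU_BYTES that follows the numbers in this mail rather than the
real macro.

```c
#include <stddef.h>

/* Hypothetical stand-in for CPU_BYTES, simplified to the plain
 * divide-by-8 used in the arithmetic above (an assumption, not
 * numactl's actual macro). */
size_t cpus_to_bytes(size_t ncpus)
{
	return ncpus / 8;
}

/* Number of mask bits the kernel can look at for a given cpusetsize,
 * since cpusetsize is expressed in bytes. */
size_t visible_cpus(size_t cpusetsize)
{
	return cpusetsize * 8;
}

/* Old call site: the value from cpumask is already a byte count, but
 * it gets treated as a cpu count and converted to bytes a second time. */
size_t old_cpusetsize(size_t bufsz)
{
	return cpus_to_bytes(bufsz);
}

/* Patched call site: the byte count is passed through unchanged. */
size_t new_cpusetsize(size_t bufsz)
{
	return bufsz;
}
```

Plugging in the 32 from the example, old_cpusetsize(32) gives 4, so
visible_cpus(old_cpusetsize(32)) is only 32, while the patched path keeps the
full buffer size.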

Thanks & Regards
Neil

Signed-off-by: Neil Horman <nhorman@xxxxxxxxxxxxx>


 numactl.c |    6 +++---
 util.c    |    4 ++--
 util.h    |    2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)


diff -up numactl-1.0.2/numactl.c.orig numactl-1.0.2/numactl.c
--- numactl-1.0.2/numactl.c.orig	2007-09-21 06:23:51.000000000 -0400
+++ numactl-1.0.2/numactl.c	2008-04-24 15:22:44.000000000 -0400
@@ -355,14 +355,14 @@ int main(int ac, char **av)
 			break;
 		case 'C': /* --physcpubind */
 		{
-			int ncpus;
+			int bufsz;
 			unsigned long *cpubuf;
 			dontshm("-C/--physcpubind");
-			cpubuf = cpumask(optarg, &ncpus);
+			cpubuf = cpumask(optarg, &bufsz);
 			errno = 0;
 			check_cpubind(do_shm);
 			did_cpubind = 1;
-			numa_sched_setaffinity(0, CPU_BYTES(ncpus), cpubuf);
+			numa_sched_setaffinity(0, bufsz, cpubuf);
 			checkerror("sched_setaffinity");
 			free(cpubuf);
 			break;
diff -up numactl-1.0.2/util.h.orig numactl-1.0.2/util.h
--- numactl-1.0.2/util.h.orig	2007-08-16 10:36:23.000000000 -0400
+++ numactl-1.0.2/util.h	2008-04-24 15:22:44.000000000 -0400
@@ -1,7 +1,7 @@
 extern void printmask(char *name, nodemask_t *mask);
 extern void printcpumask(char *name, unsigned long *mask, int len);
 extern nodemask_t nodemask(char *s);
-extern unsigned long *cpumask(char *s, int *ncpus);
+extern unsigned long *cpumask(char *s, int *bufsz);
 extern int read_sysctl(char *name);
 extern void complain(char *fmt, ...);
 extern void nerror(char *fmt, ...);
diff -up numactl-1.0.2/util.c.orig numactl-1.0.2/util.c
--- numactl-1.0.2/util.c.orig	2008-04-24 15:23:02.000000000 -0400
+++ numactl-1.0.2/util.c	2008-04-24 15:23:17.000000000 -0400
@@ -52,7 +52,7 @@ void printmask(char *name, nodemask_t *m
 int numcpus; 
 
 /* caller must free buffer */
-unsigned long *cpumask(char *s, int *ncpus) 
+unsigned long *cpumask(char *s, int *bufsz) 
 {
 	int invert = 0;
 	char *end; 
@@ -110,7 +110,7 @@ unsigned long *cpumask(char *s, int *ncp
 				set_bit(i, cpubuf);
 		}
 	} 
-	*ncpus = cpubufsize;
+	*bufsz = cpubufsize;
 	return cpubuf;	
 }
 
-- 
/****************************************************
 * Neil Horman <nhorman@xxxxxxxxxxxxx>
 * Software Engineer, Red Hat
 ****************************************************/