On Sun, Apr 25, 2010 at 01:21:04PM -0400, Jon Stanley wrote:
> When trying to discover the NUMA topology of a system in a script, for
> example. lscpu -p seems the perfect format to do so. However, it's not
> accurate, because the CPU identifiers given in the first column bear
> no relation whatsoever to the actual CPU's as seen by the OS.

 I can confirm this bug. It's reproducible with data from your /sys.

> I looked into the source code, and it appears that lscpu actually
> throws away all of the NUMA topology information that it gets. If I
> read the source code correctly, then it counts the number of nodes,
> cores, and sockets by looking only at cpu0, and simply says "OK,
> there's 4 cores/socket here, let's call them 0-3 and 4-7, regardless
> of what the OS thinks they are". Not entirely useful.

 Yes. Unfortunately, there are more problems. The worst one is how
 lscpu "works" with cpu masks. It only counts the set bits. So it knows
 how many CPUs belong to the node (or cache, etc.), but it keeps no
 details about which CPUs belong to the node ;-(

On Mon, Apr 26, 2010 at 10:57:10PM -0400, Jon Stanley wrote:
> This is from a Dell R710, 8-core Nehalem w/HT, running RHEL 5.5. It
> looks like all of the sysfs components required for lscpu to work are
> there, but as you can see from the attached tarball, this output of
> lscpu -p doesn't bear any resemblance to the actual topology of the
> box:
>
> [root@etc752754a sys-utils]# ./lscpu -p
> # The following is the parsable format, which can be fed to other
> # programs. Each different item in every column has an unique ID
> # starting from zero.
> # CPU,Core,Socket,Node,,L1d,L1i,L2,L3
> 0,0,0,0,,0,0,0,0
> 1,0,0,0,,0,0,0,0
> 2,1,0,0,,1,1,1,0
> 3,1,0,0,,1,1,1,0
> 4,2,0,0,,2,2,2,0
> 5,2,0,0,,2,2,2,0
> 6,3,0,0,,3,3,3,0
> 7,3,0,0,,3,3,3,0
> 8,4,1,1,,4,4,4,1
> 9,4,1,1,,4,4,4,1
> 10,5,1,1,,5,5,5,1
> 11,5,1,1,,5,5,5,1
> 12,6,1,1,,6,6,6,1
> 13,6,1,1,,6,6,6,1
> 14,7,1,1,,7,7,7,1
> 15,7,1,1,,7,7,7,1

 $ cat devices/system/node/node0/cpumap
 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00005555

 $ cat devices/system/node/node1/cpumap
 00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000aaaa

 This means node0 holds CPUs 0,2,4,6,8,10,12,14 and node1 holds CPUs
 1,3,5,7,9,11,13,15.

 The same problem exists with caches, cores, ... It is clearly visible
 for the L2 cache:

 for i in $(cat nehalem/sys/devices/system/cpu/cpu*/cache/index2/shared_cpu_map | awk -F ',' '{print $8}'); do ~kzak/cpumask --mask-to-str $i; done
 00000101 = 0,8
 00000404 = 2,10
 00000808 = 3,11
 00001010 = 4,12
 00002020 = 5,13
 00004040 = 6,14
 00008080 = 7,15
 00000202 = 1,9
 00000404 = 2,10
 00000808 = 3,11
 00001010 = 4,12
 00002020 = 5,13
 00004040 = 6,14
 00008080 = 7,15
 00000101 = 0,8
 00000202 = 1,9

 (cpumask is my small test program)

> Interestingly, lstopo on this same box produces the correct results
> (note the phys= items for the PU's to tell me which CPU's the OS sees
> this as being):
>
> [root@etc752754a utils]# ./lstopo
> Machine (142GB)
>   NUMANode #0 (phys=0 71GB) + Socket #0 + L3 #0 (8192KB)
>     L2 #0 (256KB) + L1 #0 (32KB) + Core #0
>       PU #0 (phys=0)
>       PU #1 (phys=8)

 Yes, lstopo is right: L2 #0 is shared between CPUs #0 and #8.

> But since lstopo requires us to download and compile something else,
> and lscpu will be on every box everywhere as part of util-linux-ng,
> then we'd obviously prefer that lscpu worked correctly.

 I'll fix it. We have to use cpu masks correctly in lscpu(1).
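
 For illustration only, here is a minimal Python sketch of the
 difference between counting the bits of a mask and decoding it. It is
 not lscpu's code (lscpu is written in C) and not the cpumask test
 program; the function names mask_to_cpus and count_cpus are made up
 for this example. The only inputs are the two node cpumap strings
 quoted above.

```python
def mask_to_cpus(mask):
    """Decode a sysfs-style hex CPU mask into a sorted list of CPU numbers."""
    # sysfs prints the mask as comma-separated 32-bit hex words, most
    # significant word first, so dropping the commas gives one hex number.
    value = int(mask.strip().replace(",", ""), 16)
    cpus = []
    bit = 0
    while value:
        if value & 1:          # bit N set means CPU N is in the mask
            cpus.append(bit)
        value >>= 1
        bit += 1
    return cpus


def count_cpus(mask):
    """Count set bits only; this loses which CPUs the mask names."""
    return bin(int(mask.strip().replace(",", ""), 16)).count("1")


# The two node masks from the /sys data quoted in this thread.
NODE0_MASK = "00000000,00000000,00000000,00000000,00000000,00000000,00000000,00005555"
NODE1_MASK = "00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000aaaa"

if __name__ == "__main__":
    for name, mask in (("node0", NODE0_MASK), ("node1", NODE1_MASK)):
        cpus = ",".join(str(c) for c in mask_to_cpus(mask))
        print("%s: %d CPUs -> %s" % (name, count_cpus(mask), cpus))
```

 Both masks have eight bits set, so a count alone cannot tell node0
 from node1; only the decoded list shows the interleaved numbering.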
    Karel

-- 
 Karel Zak  <kzak@xxxxxxxxxx>
 http://karelzak.blogspot.com