On 07/18/2012 02:58 AM, Peter Krempa wrote:
> On 07/18/12 04:01, Eric Blake wrote:
>> Commit 80533ca forgot to think about offline cpus. When a node
>> cpu is offline, then its topology/ subdirectory is not present,
>> leading to spurious error messages leaked to the user such as:
>>
>> libvir: error : cannot open
>> /home/dummy/libvirt/tests/nodeinfodata/linux-nodeinfo-sysfs-test-6/node/node0/cpu7/topology/physical_package_id:
>> No such file or directory
>>
>> Fix that, as well as test it; the test data is gathered from a
>> machine with one NUMA node, hyperthreading, and with 2 of the
>> 8 cpus offline.
>>
>> * src/nodeinfo.c (virNodeParseNode): Don't parse topology of
>> offline cpus.
>> * tests/nodeinfotest.c (mymain): Run new test.
>> * tests/nodeinfodata/linux-nodeinfo-sysfs-test-6*: New data.
>> ---
>>
>> Offline cpus are an annoying corner case :)
>
> Indeed! Who would ever cripple their machine on purpose :)

In small-scale use, probably no one (and developers tend to have
small-scale setups). Ergo our problems in detecting these sorts of
issues.

But in large enterprisey setups with beefy machines having lots of NUMA
nodes, the power savings for offlining an entire node when the machine
is under light load can lead to noticeable cost savings on the power
bill; you generally see the best savings when offlining an entire node
(the way I did it by offlining cpu5 and cpu7, from two unpaired threads,
and since my box only had one node to begin with, probably didn't save
any power).

When power savings are not the issue, then another common reason for
offlining cpus is to temporarily disable hyperthreading (yes, it's a bit
more abrupt than cpu pinning, but also a lot faster to set up). Believe
it or not, there are workloads that are actually slower when run in
parallel on a hyperthread pair than when run serially on a single cpu
(that is, hyperthreading is a hardware shortcut; it isn't really two
cpus so much as a way to use one cpu to handle two loads, but it only
works insofar as the two loads don't stomp on each other's cache, and
not all loads meet that property).

> ACK, thanks for finding this.

Thanks; pushed.

--
Eric Blake   eblake@xxxxxxxxxx   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
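
For archive readers, the check described in the commit message above
(don't parse the topology of an offline cpu) can be sketched roughly as
follows. This is not the code from src/nodeinfo.c: the helper names
cpu_is_online and cpu_package_id and their return-code convention are
made up for illustration. The sketch assumes the usual Linux sysfs
layout, where a cpuN directory may contain an "online" file holding 0 or
1, and it assumes that a missing "online" file means the cpu cannot be
offlined and is therefore online.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: 1 if the cpu is online, 0 if offline,
 * -1 on an unexpected error.  cpu_dir is a path such as
 * ".../node/node0/cpu7". */
int cpu_is_online(const char *cpu_dir)
{
    char path[4096];
    int value = 0;
    int n;
    FILE *fp;

    assert(cpu_dir != NULL);
    n = snprintf(path, sizeof(path), "%s/online", cpu_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fp = fopen(path, "r");
    if (!fp) {
        /* Assumption: no "online" file means the cpu is always online. */
        return errno == ENOENT ? 1 : -1;
    }

    if (fscanf(fp, "%d", &value) != 1)
        value = -1;
    fclose(fp);

    if (value < 0)
        return -1;
    return value != 0;
}

/* Hypothetical helper: returns the physical package id of an online
 * cpu, -1 if the cpu is offline (skipped without touching topology/),
 * or -2 on error (including a negative or unparsable id). */
int cpu_package_id(const char *cpu_dir)
{
    char path[4096];
    int id = -1;
    int n;
    int online = cpu_is_online(cpu_dir);
    FILE *fp;

    if (online < 0)
        return -2;
    if (online == 0)
        return -1;  /* offline: topology/ is absent, so don't look */

    n = snprintf(path, sizeof(path),
                 "%s/topology/physical_package_id", cpu_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -2;

    fp = fopen(path, "r");
    if (!fp)
        return -2;  /* online cpu without topology data is a real error */

    if (fscanf(fp, "%d", &id) != 1 || id < 0)
        id = -2;
    fclose(fp);
    return id;
}
```

The point of the ordering is that the "online" state is consulted
before any file under topology/ is opened, so an offline cpu is skipped
quietly instead of producing a "cannot open" error; a missing topology
file is still reported for a cpu that claims to be online.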
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list