On Thu, Nov 17, 2011 at 05:44:10PM +0800, Hu Tao wrote:
> This series does mainly two things:
>
> 1. use cgroup cpuset to manage numa parameters
> 2. add a virsh command numatune to allow user to change numa parameters
>    from command line
>
> Current numa parameters include nodeset and mode, but these cgroup cpuset
> provides don't completely match with them, details:
>
> params            cpuset
> ------------------------------------------------------
> nodeset           cpuset provides cpuset.mems
> mode strict       cpuset provides cpuset.mem_hardwall
> mode interleave   cpuset provides cpuset.memory_spread_*
> mode preferred    no equivalent. !spread to preferred?

This isn't right - there are only 3 existing configs in the XML
currently, and the current 'strict' does not map to mem_hardwall, nor
does interleave map to memory_spread, AFAICT.

Currently we have three different configurations possible for memory,
with the following semantics:

  mode=strict     - allocation is from designated nodes, or fails
  mode=preferred  - allocation is from designated nodes, or falls back
                    to other nodes
  mode=interleave - allocation is interleaved across designated nodes

In the cgroups cpuset controller you can set:

  cpuset.mems           - memory is allocated from designated nodes,
                          or fails
  cpuset.mem_exclusive  - no other cgroups, except parents or children,
                          can allocate from nodes listed in cpuset.mems
  cpuset.mem_hardwall   - no other cgroups are allowed to allocate from
                          the nodes listed in cpuset.mems
  cpuset.memory_spread* - control allocations of internal kernel data
                          structures

IMHO, the last three are not really required for libvirt per-VM usage -
the management application can trivially decide whether to allow
overlapping allocation between VMs without needing to set these kernel
tunables.

So, if using the cgroups cpuset controller for NUMA, the *only* policy
we can implement is mode=strict. We cannot implement mode=preferred or
mode=interleave, given the currently available cpuset controls.

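To make the mapping concrete, here is a rough sketch of which modes can
be expressed through cpuset at all. This is illustrative only and not
libvirt code: the names Mode, CPUSET_EXPRESSIBLE and cpuset_settings are
made up for this example, and nothing is actually written to the cgroup
filesystem.

```python
from enum import Enum


class Mode(Enum):
    """The three memory modes currently possible in the XML."""
    STRICT = "strict"
    PREFERRED = "preferred"
    INTERLEAVE = "interleave"


# Hypothetical helper set: only mode=strict shares the "designated nodes,
# or fail" semantics of cpuset.mems, so it is the only expressible mode.
CPUSET_EXPRESSIBLE = {Mode.STRICT}


def cpuset_settings(mode, nodeset):
    """Return the cpuset control files to set for a policy.

    An empty dict means the mode has no cpuset equivalent.
    """
    if mode in CPUSET_EXPRESSIBLE:
        return {"cpuset.mems": nodeset}
    return {}
```

With this sketch, cpuset_settings(Mode.STRICT, "0-1") yields a single
cpuset.mems entry, while preferred and interleave yield nothing.
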
IMHO, we should thus continue to use libnuma for specifying *all* the
policies. However, if mode=strict, then we should *also* apply the
policy in the cgroups using cpuset.mems, since this will at least allow
later tuning of the nodemask on the fly.

We will have to refuse any attempt to switch between different modes on
the fly. Only the nodemask, with mode=strict, will be dynamically
changeable.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list