On Tue, Sep 04, 2012 at 04:45:16PM +0800, Tang Chen wrote:
> It seems that libvirt is not cpu hotplug aware.
> Please refer to the following problem.
>
> 1. At first, we have 2 cpus.
> # cat /cgroup/cpuset/cpuset.cpus
> 0-1
> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
> 0-1
>
> 2. And we have a vm1 with following configuration.
>   <cputune>
>     <vcpupin vcpu='0' cpuset='1'/>
>     <emulatorpin cpuset='1'/>
>   </cputune>
>
> 3. Offline cpu1.
> # echo 0 > /sys/devices/system/cpu/cpu1/online
> # cat /sys/devices/system/cpu/cpu1/online
> 0
> # cat /cgroup/cpuset/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
> 0
>
> 4. Online cpu1.
> # echo 1 > /sys/devices/system/cpu/cpu1/online
> # cat /sys/devices/system/cpu/cpu1/online
> 1
> # cat /cgroup/cpuset/cpuset.cpus
> 0-1
> # cat /cgroup/cpuset/libvirt/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
> 0
> # cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
> 0
>
> Here, cgroup updated cpuset.cpus, but not for libvirt directory, and also qemu and lxc directory.

I'm rather inclined to say this is the kernel's fault. This is the same
class of problem that we saw with S3/S4 kernel support, where the cpuset
got blanked out. The kernel should *not* be altering the user-specified
cgroup settings when offlining CPUs.

The problem is that the kernel is not distinguishing between the
user-requested cpuset mask and the mask of available CPUs - it has
overloaded both into one config file. The cgroup cpuset.cpus should only
reflect the user config. The kernel should privately AND this with the
current mask of CPUs which actually exist.

> vm1 cannot be started again.
> # virsh start vm1
> error: Failed to start domain vm1
> error: Unable to set cpuset.cpus: Permission denied
>
> And libvirtd gave the following errors.
> 2012-07-17 07:30:22.478+0000: 3118: error : qemuSetupCgroupVcpuPin:498 : Unable to set cpuset.cpus: Permission denied
>
>
> These patches resolve this problem by listening on the netlink for cpu hotplug event.
> When the netlink service gets the cpu hotplug event, it will extract the cpuid in the message,
> and add it into cpuset.cpus in:
>   /cgroup/cpuset/libvirt
>   /cgroup/cpuset/libvirt/qemu
>   /cgroup/cpuset/libvirt/lxc

I don't think we should be doing this. E.g., consider a host that has
8 CPUs, where the admin has explicitly configured libvirt to only use
CPUs 1-4. If the host admin onlines CPU 6, then libvirt should not be
adding CPU 6 into its cpuset.

In addition, we cannot assume that the 'libvirt' cgroup is immediately
below the root cgroup. There might be several other layers in the
hierarchy above which also lose their correct cpuset data, not to
mention the cgroups of all other apps in the system. This is a
system-wide flaw, but your patch is only addressing the libvirt impact.

So I don't think we should be doing this. The kernel should fix cgroups
properly so all apps work correctly.

Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
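
For illustration only, the separation argued for above (keep the user-requested mask untouched and AND it with the mask of CPUs that currently exist) can be sketched roughly as follows. This is a toy model in Python, not kernel or libvirt code; the names `parse_cpulist`, `format_cpulist` and `Cpuset` are hypothetical and chosen just for this example.

```python
def parse_cpulist(text):
    """Parse a cpuset-style list such as '0-1,4' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpulist(cpus):
    """Format a set of CPU ids back into the compact 'a-b,c' notation."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current range while the next id is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


class Cpuset:
    """Hypothetical model: the user's request is stored and never rewritten."""

    def __init__(self, requested):
        self.requested = parse_cpulist(requested)

    def effective(self, online):
        # The usable CPUs are the requested ones that are currently online.
        return format_cpulist(self.requested & parse_cpulist(online))


if __name__ == "__main__":
    group = Cpuset("0-1")
    print(group.effective("0"))             # cpu1 offline
    print(group.effective("0-1"))           # cpu1 back online
    print(format_cpulist(group.requested))  # request unchanged throughout
```

In this model, offlining and re-onlining cpu1 never changes the stored request, so the group gets cpu1 back automatically, and a group configured for CPUs 1-4 does not pick up CPU 6 when it is onlined.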