On 09/04/2012 07:25 PM, Daniel P. Berrange wrote:
> On Tue, Sep 04, 2012 at 04:45:16PM +0800, Tang Chen wrote:
>> It seems that libvirt is not cpu hotplug aware.
>> Please refer to the following problem.
>>
>> 1. At first, we have 2 cpus.
>> # cat /cgroup/cpuset/cpuset.cpus
>> 0-1
>> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
>> 0-1
>>
>> 2. And we have a vm1 with the following configuration.
>> <cputune>
>>   <vcpupin vcpu='0' cpuset='1'/>
>>   <emulatorpin cpuset='1'/>
>> </cputune>
>>
>> 3. Offline cpu1.
>> # echo 0 > /sys/devices/system/cpu/cpu1/online
>> # cat /sys/devices/system/cpu/cpu1/online
>> 0
>> # cat /cgroup/cpuset/cpuset.cpus
>> 0
>> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
>> 0
>> # cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
>> 0
>>
>> 4. Online cpu1.
>> # echo 1 > /sys/devices/system/cpu/cpu1/online
>> # cat /sys/devices/system/cpu/cpu1/online
>> 1
>> # cat /cgroup/cpuset/cpuset.cpus
>> 0-1
>> # cat /cgroup/cpuset/libvirt/cpuset.cpus
>> 0
>> # cat /cgroup/cpuset/libvirt/qemu/cpuset.cpus
>> 0
>> # cat /cgroup/cpuset/libvirt/lxc/cpuset.cpus
>> 0
>>
>> Here, the root cgroup updated its cpuset.cpus, but the libvirt directory
>> and its qemu and lxc subdirectories were not updated.
>
> I'm rather inclined to say this is the kernel's fault. This is
> the same class of problem that we saw with S3/S4 kernel support
> where the cpuset got blanked out.
>
> The kernel should *not* be altering the user specified cgroups
> settings when offlining CPUs. The problem is that the kernel is
> not distinguishing between the user requested cpuset mask and
> the mask of available CPUs - it has overloaded both into one
> config file.
>
> The cgroup cpuset.cpus should only reflect the user config.
> The kernel should privately AND this with the current mask
> of CPUs which actually exist.
>

I had posted a Linux kernel patchset[1] some time ago to expose another
file, so that we can distinguish between the user-specified settings and
the actual state underneath.
But the conclusion of the ensuing discussion was that the existing kernel
behaviour is fine as is, and that trying to "fix" it would break kernel
semantics. (However, note that the suspend/resume case has been fixed in
the kernel by commit d35be8bab.)

[1]. http://thread.gmane.org/gmane.linux.documentation/4805

Regards,
Srivatsa S. Bhat

>> vm1 cannot be started again.
>> # virsh start vm1
>> error: Failed to start domain vm1
>> error: Unable to set cpuset.cpus: Permission denied
>>
>> And libvirtd gave the following error.
>> 2012-07-17 07:30:22.478+0000: 3118: error : qemuSetupCgroupVcpuPin:498 : Unable to set cpuset.cpus: Permission denied
>>
>>
>> These patches resolve this problem by listening on netlink for cpu hotplug events.
>> When the netlink service gets a cpu hotplug event, it will extract the cpuid from the message,
>> and add it into cpuset.cpus in:
>> /cgroup/cpuset/libvirt
>> /cgroup/cpuset/libvirt/qemu
>> /cgroup/cpuset/libvirt/lxc
>
> I don't think we should be doing this. E.g., consider a host that has 8 cpus
> where the admin explicitly configured libvirt to only use cpus 1-4. If the host
> admin onlines CPU 6, then libvirt should not be adding CPU 6 into its cpuset.
>
> In addition we cannot assume that the 'libvirt' cgroup is immediately below
> the root cgroup. There might be several other layers in the hierarchy above
> which also lose their correct cpuset data, not to mention the cgroups of
> all other apps in the system. This is a system-wide flaw, but your patch
> is only addressing the libvirt impact. So I don't think we should be doing
> this. The kernel should fix cgroups properly so all apps work correctly.
>
> Daniel
>

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
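
To make the "user mask AND online mask" idea from the thread concrete, here is a rough Python sketch. It is only a model of the proposed semantics, not kernel or libvirt code: the `Cpuset` class and the helpers `parse_cpulist` and `format_cpulist` are hypothetical names made up for this illustration, and the only thing taken from the thread is the list syntax seen in cpuset.cpus (e.g. "0-1").

```python
def parse_cpulist(text):
    """Parse a cpuset list such as "0-1" or "0,2-3" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpulist(cpus):
    """Format a set of CPU ids back into the compact list form."""
    ranges = []
    start = prev = None
    for cpu in sorted(cpus):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


class Cpuset:
    """Hypothetical model: the user-requested mask is stored untouched,
    and the effective mask is derived from it on demand."""

    def __init__(self, requested):
        self.requested = parse_cpulist(requested)

    def effective(self, online):
        # The AND described in the thread: requested CPUs that are online now.
        return format_cpulist(self.requested & parse_cpulist(online))


libvirt = Cpuset("0-1")
print(libvirt.effective("0-1"))  # both cpus online
print(libvirt.effective("0"))    # cpu1 offlined
print(libvirt.effective("0-1"))  # cpu1 onlined again: mask comes back
```

Under this model the requested mask survives the offline/online cycle, so the last call yields "0-1" again rather than staying at "0". Likewise `Cpuset("1-4").effective("0-7")` stays "1-4" when CPU 6 comes online, which matches the 8-cpu example in Daniel's reply.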