On Thursday, 19 September 2013, 12:33:21, Daniel P. Berrange wrote:
> On Thu, Sep 19, 2013 at 01:26:52PM +0200, David Weber wrote:
> > On Wednesday, 11 September 2013, 11:27:30, Daniel P. Berrange wrote:
> > > On Wed, Sep 11, 2013 at 10:47:08AM +0200, David Weber wrote:
> > > > On Friday, 6 September 2013, 12:10:04, Daniel P. Berrange wrote:
> > > > > On Tue, Aug 27, 2013 at 09:09:25AM +0200, David Weber wrote:
> > > > > > Hi,
> > > > > >
> > > > > > we try to use vcpu pinning on a 2-socket server with Intel Xeon
> > > > > > E5620 CPUs, HT enabled and 2*6*16GiB RAM, but experience problems
> > > > > > if we try to start a guest on the second socket:
> > > > > > error: Failed to start domain test
> > > > > > error: internal error: process exited while connecting to monitor:
> > > > > > kvm_init_vcpu failed: Cannot allocate memory
> > > >
> > > > # virsh freecell 0
> > > > 0: 86071624 KiB
> > > >
> > > > # virsh freecell 1
> > > > 1: 75258628 KiB
> > > >
> > > > # virsh edit test
> > > > <domain type='kvm'>
> > > >   <name>test</name>
> > > >   <uuid>08cdc389-78bf-450c-89f4-b4728edabdbf</uuid>
> > > >   <memory unit='KiB'>1048576</memory>
> > > >   <currentMemory unit='KiB'>1048576</currentMemory>
> > > >   <vcpu placement='static' cpuset='4-7'>1</vcpu>
> > > >   <numatune>
> > > >     <memory mode='strict' nodeset='1'/>
> > > >   </numatune>
> > > >   <os>
> > > >     <type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
> > > >     <boot dev='hd'/>
> > > >   </os>
> > > >   <features>
> > > >     <acpi/>
> > > >     <apic/>
> > > >     <pae/>
> > > >   </features>
> > > >   <clock offset='utc'/>
> > > >   <on_poweroff>destroy</on_poweroff>
> > > >   <on_reboot>restart</on_reboot>
> > > >   <on_crash>restart</on_crash>
> > > >   <devices>
> > > >     <emulator>/usr/bin/qemu-kvm</emulator>
> > > >     <controller type='usb' index='0'>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
> > > >     </controller>
> > > >     <controller type='pci' index='0' model='pci-root'/>
> > > >     <controller type='ide' index='0'>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
> > > >     </controller>
> > > >     <input type='mouse' bus='ps2'/>
> > > >     <graphics type='vnc' port='-1' autoport='yes'/>
> > > >     <video>
> > > >       <model type='cirrus' vram='9216' heads='1'/>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
> > > >     </video>
> > > >     <memballoon model='virtio'>
> > > >       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
> > > >     </memballoon>
> > > >   </devices>
> > > > </domain>
> > > >
> > > > # virsh start test
> > > > error: Failed to start domain test
> > > > error: internal error: process exited while connecting to monitor:
> > > > kvm_init_vcpu failed: Cannot allocate memory
> > > >
> > > > Allocating memory on this node with numactl works fine:
> > > > # numactl --cpubind=1 --membind=1 -- dd if=/dev/zero of=/dev/null bs=2G count=1
> > > > 0+1 records in
> > > > 0+1 records out
> > > > 2147479552 bytes (2.1 GB) copied, 0.60816 s, 3.5 GB/s
> > >
> > > Hmm, this makes no sense at all to me. Your configuration looks totally
> > > valid and you have plenty of memory in both nodes.
> >
> > After reading a bit more about cgroups, I now think I know what's going on.
> >
> > Let's assume we have a 2-node dual-core system and start a guest named
> > 'test' without cpu or memory pinning.
> >
> > * libvirt creates a controller under cpuset/machine/test.libvirt-qemu:
> >     cpuset/machine/test.libvirt-qemu/cpuset.cpus -> 0-3
> >     cpuset/machine/test.libvirt-qemu/cpuset.mems -> 0-1
> > * libvirt creates a controller for every vcpu:
> >     cpuset/machine/test.libvirt-qemu/vcpu*/cpuset.cpus -> 0-3
> >     cpuset/machine/test.libvirt-qemu/vcpu*/cpuset.mems -> 0-1
> > * libvirt creates a controller for qemu:
> >     cpuset/machine/test.libvirt-qemu/emulator/cpuset.cpus -> 0-3
> >     cpuset/machine/test.libvirt-qemu/emulator/cpuset.mems -> 0-1
> >
> > Now we want to pin the guest to the second node:
> > virsh # numatune test --nodeset 1
> > error: Unable to change numa parameters
> > error: Unable to write to '/sys/fs/cgroup/cpuset/machine/Ubuntu.libvirt-qemu/cpuset.mems': Device or resource busy
> >
> > What happens is that libvirt tries to set
> > cpuset/machine/test.libvirt-qemu/cpuset.mems to 1, but this is not
> > possible because cpuset/machine/test.libvirt-qemu/vcpu*/cpuset.mems and
> > cpuset/machine/test.libvirt-qemu/emulator/cpuset.mems still contain 0-1.
> > Libvirt has to change these values first!
>
> Oooh, interesting hypothesis. I wonder if this is a kernel behaviour
> change. I'm fairly sure that in the past if you removed a cpu from the
> cpuset mask, it would automagically purge it from all children.
>
> Please file a bug about this - it should be possible to make libvirt
> do the right thing and purge child masks explicitly first.

Done: https://bugzilla.redhat.com/show_bug.cgi?id=1009880

I have also tested Linux 3.2.51, so the change would have had to happen
quite some time ago.

Cheers,
David

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
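
For illustration, the ordering issue discussed in this thread can be sketched
in Python. This is only a minimal sketch, not libvirt's actual fix. It assumes
a cgroup v1 cpuset hierarchy laid out like the listing above, and that the new
nodeset is a subset of the current one (narrowing 0-1 to 1). The function names
mems_files_bottom_up and narrow_mems are hypothetical. The idea is to write the
new value to the vcpu*/ and emulator/ groups before the parent group, so the
parent write never finds a child that still holds a wider mask.

```python
import os


def mems_files_bottom_up(cgroup_dir):
    """Return the cpuset.mems paths under cgroup_dir, child groups first."""
    paths = []
    # topdown=False makes os.walk yield a directory only after all of its
    # subdirectories, so the top-level group comes last.
    for dirpath, _dirnames, filenames in os.walk(cgroup_dir, topdown=False):
        if "cpuset.mems" in filenames:
            paths.append(os.path.join(dirpath, "cpuset.mems"))
    return paths


def narrow_mems(cgroup_dir, nodeset):
    """Write nodeset to every child group, then to the parent group.

    Assumption: nodeset is a subset of what each group currently holds.
    Returns the paths in the order they were written.
    """
    written = []
    for path in mems_files_bottom_up(cgroup_dir):
        with open(path, "w") as f:
            f.write(nodeset)
        written.append(path)
    return written
```

Run as root against a real hierarchy, the call would presumably look like
narrow_mems('/sys/fs/cgroup/cpuset/machine/test.libvirt-qemu', '1'). Widening
a mask would need the opposite order, parent first, because a child's mask
has to stay a subset of its parent's.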