On Mon, Jan 28, 2013 at 10:23 AM, Osier Yang <jyang@xxxxxxxxxx> wrote:
> On 2013-01-29 00:17, Doug Goldstein wrote:
>>
>> On Sun, Jan 27, 2013 at 10:46 PM, Osier Yang<jyang@xxxxxxxxxx> wrote:
>>>
>>> On 2013-01-28 11:47, Osier Yang wrote:
>>>>
>>>> On 2013-01-28 11:44, Osier Yang wrote:
>>>>>
>>>>> On 2013-01-26 01:07, Doug Goldstein wrote:
>>>>>>
>>>>>> On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On 2013-01-24 14:26, Doug Goldstein wrote:
>>>>>>>>
>>>>>>>> On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@xxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On 2013-01-24 12:11, Doug Goldstein wrote:
>>>>>>>>>>
>>>>>>>>>> On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@xxxxxxxxxx> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 +
>>>>>>>>>>> qemu 1.2.2 applied on top plus a number of stability patches).
>>>>>>>>>>> I'm having an issue where my VMs fail to start with the
>>>>>>>>>>> following message:
>>>>>>>>>>>
>>>>>>>>>>> kvm_init_vcpu failed: Cannot allocate memory
>>>>>>>>>
>>>>>>>>> Smells like we have a problem setting the NUMA policy (perhaps
>>>>>>>>> caused by an incorrect host NUMA topology), given that the system
>>>>>>>>> still has enough memory. Or numad (if it's installed) is doing
>>>>>>>>> something wrong.
>>>>>>>>>
>>>>>>>>> Can you see if there is something about the Nodeset used to set
>>>>>>>>> the policy in the debug log?
>>>>>>>>>
>>>>>>>>> E.g.
>>>>>>>>>
>>>>>>>>> % cat libvirtd.debug | grep Nodeset
>>>>>>>>
>>>>>>>> Well, I don't see anything, but it's likely because I didn't do
>>>>>>>> something correctly. I had LIBVIRT_DEBUG=1 exported and ran
>>>>>>>> libvirtd --verbose from the command line.
>>>>>>>
>>>>>>> If the process is in the background, it's expected that you can't
>>>>>>> see anything.
>>>>>>>
>>>>>>>> My /etc/libvirt/libvirtd.conf had:
>>>>>>>>
>>>>>>>> log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log"
>>>>>>>>
>>>>>>>> But I didn't get any debug messages.
>>>>>>>
>>>>>>> log_level=1 has to be set.
>>>>>>>
>>>>>>> Anyway, let's simply do this:
>>>>>>>
>>>>>>> % service libvirtd stop
>>>>>>> % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
>>>>>>>
>>>>>>
>>>>>> That's what I was doing, minus the tee (just to the console), and
>>>>>> nothing was coming out. Which is why I added the
>>>>>> 1:file:/tmp/libvirtd.log, which also didn't get any debug messages.
>>>>>> Turns out this instance must have been built with --disable-debug.
>>>>>>
>>>>>> All I've got in the log is:
>>>>>>
>>>>>> # grep -i 'numa' libvirtd.debug
>>>>>> 2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
>>>>>
>>>>> This looks right.
>>>>>
>>>>>> Immediately below that is:
>>>>>>
>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3622 : Setting up domain cgroup (if required)
>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : virCgroupNew:619 : New group /libvirt/qemu/bb-2.6.35.9-i686
>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 : Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpuacct in
>>>>>> 2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 : Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:537 : Make group /libvirt/qemu/bb-2.6.35.9-i686
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 : Make controller /sys/fs/cgroup/cpuacct/libvirt/qemu/bb-2.6.35.9-i686/
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 : Make controller /sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:469 : Setting up inheritance /libvirt/qemu -> /libvirt/qemu/bb-2.6.35.9-i686
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 : Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.cpus
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
>>>>>
>>>>> This doesn't look right; it should be 0-7 instead.
>>>>>
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 : Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.mems
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
>>>>>
>>>>> This is right.
>>>>>
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
>>>>>> 2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
>>>>>
>>>>> And it's strange that the cpuset.mems is changed to '1' here.
>>>
>>> Oh, actually this is right; cpuset.mems is about the memory nodes.
>>>
>>>>>
>>>>>> 2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
>>>>>>
>>>>>> Could the RSS issue be related? Some kernel-related option not
>>>>>> playing nice or not enabled?
>>>>
>>>> Instead, I'm wondering if the problem is caused by the mismatch
>>>> (from libvirt's p.o.v.) between cpuset.cpus and cpuset.mems, which
>>>> thus causes a problem for the kernel's memory management?
>>>
>>> So, a simple way to test the guess is to use static placement,
>>> like:
>>>
>>> <vcpu placement='static' cpuset='0-63'>2</vcpu>
>>> <numatune>
>>>   <memory placement='static' nodeset='1'/>
>>> </numatune>
>>>
>>> Osier
>>
>> Same error, though I don't know whether you expected that or not.
>>
>
> It's expected, as "0-63" is the final result when using "auto"
> placement.

Since there's another user on the libvirt-list asking about the exact
same CPU I've got, I figured I'd do some poking. Oddly enough, he and I
had different outputs from virsh nodeinfo. Just as background, these are
AMD 6272 CPUs. I've got 4 of them in the box, and they're organized as
follows:

Sockets: 4
Cores: 16
Threads: 1 per core (16)
NUMA nodes: 8
Mem per node: 16GB
Total: 128GB

# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2100 MHz
CPU socket(s):       1
Core(s) per socket:  64
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         132013200 KiB

# virsh capabilities
<snip>
      <topology sockets='1' cores='64' threads='1'/>
<snip>
    <topology>
      <cells num='8'>
<snip>

So virsh reports 1 socket with 64 cores and 1 NUMA cell in nodeinfo,
while capabilities lists 8 cells, and the hardware is really 4 sockets
with 16 cores each.

I've hand-verified all the values in
/sys/devices/system/node/nodeX/cpuX/topology/physical_package_id, and they
show that the NUMA nodes are paired up per physical package (0&1, 2&3,
4&5, 6&7).

I need to give git a whirl, as I know it has somewhat different code
than 1.0.1, but I'll report back.

-- 
Doug Goldstein

_______________________________________________
libvirt-users mailing list
libvirt-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvirt-users
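
For anyone who wants to repeat the physical_package_id check without
reading each file by hand, below is a minimal Python sketch. It assumes
the usual Linux sysfs layout
(/sys/devices/system/node/nodeN/cpuM/topology/physical_package_id). The
function names and the `root` parameter are illustrative only and are not
part of libvirt or virsh; this is not how libvirt itself detects topology.

```python
import os
import re
from collections import defaultdict

# Assumption: standard Linux sysfs location for NUMA node directories.
SYSFS_NODE_ROOT = "/sys/devices/system/node"


def node_to_packages(root=SYSFS_NODE_ROOT):
    """Map each NUMA node number to the set of physical package ids of its CPUs."""
    mapping = defaultdict(set)
    for node_dir in os.listdir(root):
        node_match = re.fullmatch(r"node(\d+)", node_dir)
        if not node_match:
            continue  # skip entries such as "online" or "possible"
        node_path = os.path.join(root, node_dir)
        for cpu_dir in os.listdir(node_path):
            if not re.fullmatch(r"cpu\d+", cpu_dir):
                continue  # skip cpulist, cpumap, meminfo, ...
            pkg_file = os.path.join(
                node_path, cpu_dir, "topology", "physical_package_id"
            )
            try:
                with open(pkg_file) as fh:
                    pkg = int(fh.read().strip())
            except OSError:
                continue  # e.g. an offline CPU without topology information
            mapping[int(node_match.group(1))].add(pkg)
    return dict(mapping)


def packages_to_nodes(node_map):
    """Invert the mapping: package id -> sorted list of NUMA nodes on it."""
    inverted = defaultdict(list)
    for node, packages in sorted(node_map.items()):
        for pkg in packages:
            inverted[pkg].append(node)
    return dict(inverted)


if __name__ == "__main__":
    if os.path.isdir(SYSFS_NODE_ROOT):
        for pkg, nodes in sorted(packages_to_nodes(node_to_packages()).items()):
            print(f"package {pkg}: NUMA nodes {nodes}")
```

If the pairing described above holds, each package should list two NUMA
nodes, which gives a quick cross-check against the socket and cell counts
that virsh nodeinfo reports.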