Re: VMs fail to start with NUMA configuration

[ CC Peter ]

On 2013-01-31 06:01, Doug Goldstein wrote:
On Wed, Jan 30, 2013 at 1:21 AM, Wayne Sun <gsun@xxxxxxxxxx> wrote:
On 01/30/2013 01:25 PM, Doug Goldstein wrote:

On Mon, Jan 28, 2013 at 10:23 AM, Osier Yang <jyang@xxxxxxxxxx> wrote:

On 2013-01-29 00:17, Doug Goldstein wrote:

On Sun, Jan 27, 2013 at 10:46 PM, Osier Yang <jyang@xxxxxxxxxx> wrote:

On 2013-01-28 11:47, Osier Yang wrote:


On 2013-01-28 11:44, Osier Yang wrote:


On 2013-01-26 01:07, Doug Goldstein wrote:


On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang <jyang@xxxxxxxxxx> wrote:


On 2013-01-24 14:26, Doug Goldstein wrote:



On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang <jyang@xxxxxxxxxx> wrote:



On 2013-01-24 12:11, Doug Goldstein wrote:




On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein <cardoe@xxxxxxxxxx> wrote:




I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 with
qemu 1.2.2 applied on top, plus a number of stability patches). I'm
having an issue where my VMs fail to start with the following message:

kvm_init_vcpu failed: Cannot allocate memory





Smells like we have a problem setting the NUMA policy (perhaps
caused by an incorrect host NUMA topology), given that the system
still has enough memory. Or numad (if it's installed) is doing
something wrong.

Can you see if there is anything about the Nodeset used to set
the policy in the debug log?

E.g.

% cat libvirtd.debug | grep Nodeset




Well, I don't see anything, but it's likely because I didn't do
something correctly. I had LIBVIRT_DEBUG=1 exported and ran
libvirtd --verbose from the command line.




If the process is in the background, it's expected that you won't see
anything.


My /etc/libvirt/libvirtd.conf had:



log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log"

But I didn't get any debug messages.




log_level=1 has to be set.

Anyway, let's simply do this:

% service libvirtd stop
% LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug

That's what I was doing, minus the tee, just to the console, and
nothing was coming out. Which is why I added the
1:file:/tmp/libvirtd.log, which also didn't produce any debug
messages. Turns out this instance must have been built with
--disable-debug.

All I've got in the log is:

# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 :
About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 :
Nodeset returned from numad: 1



This looks right.

Immediately below that is

2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3622 :
Setting up domain cgroup (if required)
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupNew:619 : New
group /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 :
Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpuacct in
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 :
Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:537 :
Make group /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 :
Make controller
/sys/fs/cgroup/cpuacct/libvirt/qemu/bb-2.6.35.9-i686/
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 :
Make controller /sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/
2013-01-25 16:50:17.296+0000: 417: debug :
virCgroupCpuSetInherit:469
: Setting up inheritance /libvirt/qemu ->
/libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361
:
Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.cpus
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
fd 39
2013-01-25 16:50:17.296+0000: 417: debug :
virCgroupCpuSetInherit:482
: Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331
:
Set value
'/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus'
to '0-63'



This doesn't look right; it should be 0-7 instead.

2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
fd 39
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361
:
Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.mems
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
fd 39
2013-01-25 16:50:17.296+0000: 417: debug :
virCgroupCpuSetInherit:482
: Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331
:
Set value
'/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems'
to '0-7'



This is right.

2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
fd 39
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 :
Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331
:
Set value
'/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems'
to '1'



And it's strange that the cpuset.mems is changed to '1' here.



Oh, actually this is right, cpuset.mems is about the memory nodes.
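To make the distinction concrete: cpuset.cpus and cpuset.mems both use the same range syntax, but cpus are logical CPU ids while mems are NUMA node ids. Here is a minimal sketch of how such range strings expand (this is not libvirt's actual parser, just an illustration of the format):

```python
def parse_cpuset(spec):
    """Expand a cpuset-style range string like '0-63', '0-7' or '1'
    into a sorted list of integers."""
    members = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            members.update(range(int(lo), int(hi) + 1))
        else:
            members.add(int(part))
    return sorted(members)

# cpuset.cpus = '0-63' pins the group to all 64 logical CPUs,
# while cpuset.mems = '1' restricts memory to NUMA node 1 only.
print(len(parse_cpuset("0-63")))  # 64 logical CPUs
print(parse_cpuset("0-7"))        # NUMA nodes 0 through 7
print(parse_cpuset("1"))          # just node 1
```

So '0-63' in cpuset.cpus and '1' in cpuset.mems are not inconsistent per se; they live in different namespaces.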


2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed
fd 39

Could the RSS issue be related? Some kernel-related option not
playing nice or not enabled?



Instead, I'm wondering if the problem is caused by a mismatch
(from libvirt's point of view) between cpuset.cpus and cpuset.mems,
which then causes a problem for kernel memory management.



So, the simple method to prove the guess is to use static placement
like:

<vcpu placement='static' cpuset='0-63'>2</vcpu>
<numatune>
     <memory placement='static' nodeset='1'/>
</numatune>

Osier


Same error. Which I don't know if you expected or didn't expect.

It's expected, as "0-63" is the final result when using "auto"
placement.

Since there's another user on the libvirt-list asking about the exact
same CPU I've got, I figured I'd do some poking. Oddly enough, he and
I had different outputs from virsh nodeinfo. Just as background,
these are AMD 6272 CPUs. I've got 4 of them in the box, and they're
organized as follows:

Sockets: 4
Cores: 16
Threads: 1 per core (16)
NUMA nodes: 8
Mem per node: 16GB
Total: 128GB
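A quick sanity check of these numbers against what virsh nodeinfo reports (the dictionaries below are illustrative, not a libvirt API) shows why most things still work: the CPU totals match even though the socket/core split and NUMA cell count do not.

```python
# Physical layout of the box (from the list above) vs. what
# `virsh nodeinfo` reported for the same machine.
physical = {"sockets": 4, "cores_per_socket": 16, "threads_per_core": 1,
            "numa_nodes": 8, "mem_per_node_gb": 16}
reported = {"sockets": 1, "cores_per_socket": 64, "threads_per_core": 1,
            "numa_nodes": 1}

def total_cpus(topo):
    """Total logical CPUs implied by a sockets/cores/threads triple."""
    return topo["sockets"] * topo["cores_per_socket"] * topo["threads_per_core"]

print(total_cpus(physical), total_cpus(reported))          # both 64
print(physical["numa_nodes"] * physical["mem_per_node_gb"])  # 128 (GB)
print(physical["numa_nodes"] == reported["numa_nodes"])      # False: 8 vs 1
```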

# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2100 MHz
CPU socket(s):       1
Core(s) per socket:  64
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         132013200 KiB

# virsh capabilities
<snip>
        <topology sockets='1' cores='64' threads='1'/>
<snip>
      <topology>
        <cells num='8'>
<snip>

I've hand-verified all the values in
/sys/devices/system/node/nodeX/cpuX/topology/physical_package_id to
show that the NUMA nodes are paired per physical package (0&1, 2&3,
4&5, 6&7).
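The pairing described above can be expressed as a small grouping function (a sketch over data already collected from sysfs, rather than code that reads sysfs itself):

```python
def nodes_per_package(package_of_node):
    """Group NUMA node ids by the physical_package_id their CPUs report.
    `package_of_node` maps node id -> package id, as hand-collected
    from /sys/devices/system/node/nodeX/cpuX/topology/."""
    groups = {}
    for node, pkg in package_of_node.items():
        groups.setdefault(pkg, []).append(node)
    return {pkg: sorted(nodes) for pkg, nodes in groups.items()}

# AMD 6272 layout described above: nodes pair up two-per-package.
layout = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
print(nodes_per_package(layout))
# {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}
```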

I need to give git a whirl, as I know it's got somewhat different code
than 1.0.1, but I'll report back.

As far as I can see, Peter committed more patches to fix the CPU
topology parsing on AMD platforms. Perhaps he will know if this is
fixed in the new release.


For AMD 62xx CPUs, the output is expected.

Check out this bug:
virsh nodeinfo can't get the right info on AMD Bulldozer cpu
https://bugzilla.redhat.com/show_bug.cgi?id=874050

Wayne Sun
2013-01-30


Wayne,

I'd argue we need to determine what format we really need the data in.
Do we actually care about physical sockets, or should we care about
packages? Because with this specific CPU there are 2 packages in
1 physical socket, which form the 2 NUMA nodes per socket.

Agreed. The total number of CPUs is correct, which guarantees most of
the CPU-topology-related code works. But it should still be fixed.


The reason I say this is that we went from NUMA settings defined for
the domain working, to the domain failing to start up with a cryptic
error message, which IMHO is worse.

The flip side of the coin is that we could just strip out all the NUMA
settings when starting the domain if we know they won't work.
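The fallback suggested here could look roughly like the following sketch, which drops <numatune> (and any vcpu cpuset) from a domain definition when the host reports only one NUMA cell. This is hypothetical illustration, not libvirt code:

```python
import xml.etree.ElementTree as ET

def strip_numa_settings(domain_xml, host_numa_cells):
    """If the host topology reports only one NUMA cell, remove
    <numatune> and any vcpu cpuset from the domain XML instead of
    letting startup fail. Purely a sketch of the proposed fallback."""
    root = ET.fromstring(domain_xml)
    if host_numa_cells > 1:
        return domain_xml  # topology looks usable; leave settings alone
    for tune in root.findall("numatune"):
        root.remove(tune)
    vcpu = root.find("vcpu")
    if vcpu is not None and "cpuset" in vcpu.attrib:
        del vcpu.attrib["cpuset"]
    return ET.tostring(root, encoding="unicode")

xml = """<domain>
  <vcpu placement='static' cpuset='0-63'>2</vcpu>
  <numatune><memory placement='static' nodeset='1'/></numatune>
</domain>"""
print(strip_numa_settings(xml, host_numa_cells=1))
```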


_______________________________________________
libvirt-users mailing list
libvirt-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvirt-users


