Hi,

Thanks for the reply. Well, although there is plenty of RAM left (about
100MB), some swap space was used during the operation:

Mem:   8193472k total,  8089788k used,   103684k free,     5768k buffers
Swap: 11716412k total,    36636k used, 11679776k free,   103112k cached

I am not sure why, though. Are you saying that there are bursts of
memory usage that push some pages out to swap, and that those pages are
not swapped back in even though they are used? I will try to replicate
the problem now and send you a better printout from the moment the
problem happens. I did not notice anything unusual while I was watching
the system - there was plenty of RAM free and only a few megabytes in
swap...

Is there any kind of check I can try while the problem is occurring? Or
should I free 50-100MB worth of hugepages, so that the system is stable
again?
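In case it is useful, this is the kind of check I have in mind - a
minimal sketch of my own (not from any particular tool), assuming the
standard /proc/vmstat and /proc/meminfo fields (pswpin, pswpout,
HugePages_Free, HugePages_Rsvd, SwapTotal, SwapFree). I would run it on
the host and keep the output from around the moment the load spikes:

#!/usr/bin/env python
# Sample host swap activity and hugepage counters once per second, so
# that a swapping burst can be matched against the moment the guests
# lock up. Field names are the standard ones on 2.6.3x kernels.
import time

def read_counters(path):
    # Parse "key: value ..." (/proc/meminfo) or "key value"
    # (/proc/vmstat) lines into a {key: int} dictionary, keeping the
    # first numeric field of each line.
    counters = {}
    with open(path) as f:
        for line in f:
            parts = line.replace(':', ' ').split()
            if len(parts) >= 2 and parts[1].isdigit():
                counters[parts[0]] = int(parts[1])
    return counters

prev = read_counters('/proc/vmstat')
while True:
    time.sleep(1)
    cur = read_counters('/proc/vmstat')
    mem = read_counters('/proc/meminfo')
    print('%s swapin/s=%d swapout/s=%d huge_free=%d huge_rsvd=%d swap_used_kb=%d' % (
        time.strftime('%H:%M:%S'),
        cur['pswpin'] - prev['pswpin'],    # pages read back from swap
        cur['pswpout'] - prev['pswpout'],  # pages written out to swap
        mem['HugePages_Free'],
        mem['HugePages_Rsvd'],
        mem['SwapTotal'] - mem['SwapFree']))
    prev = cur

My thinking is that if swapin/s stays above zero for minutes at a time
while the guests are stuck, that would point at the host paging qemu's
non-hugepage memory in and out.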
Thanks,
Dmitry

On Sat, Oct 2, 2010 at 1:30 AM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> On Thu, Sep 30, 2010 at 12:07:15PM +0300, Dmitry Golubev wrote:
>> Hi,
>>
>> I am not sure what is really happening, but every few hours
>> (unpredictably) two virtual machines (Linux 2.6.32) start to generate
>> huge CPU loads. It looks like some kind of loop is unable to complete
>> or something...
>>
>> So the setup is:
>>
>> 1. I have two Linux 2.6.32 x64 (openvz, proxmox project) guests
>> running on a Linux 2.6.35 x64 (ubuntu maverick) host with a Q6600
>> Core2Quad, on qemu-kvm 0.12.5 and libvirt 0.8.3, plus another small
>> 32-bit Linux virtual machine (16MB of RAM) with a router inside (I
>> doubt it contributes to the problem).
>>
>> 2. All these machines use hugetlbfs. The server has 8GB of RAM; I
>> reserved 3696 huge pages (page size is 2MB) on the server, and I am
>> running the two main guests with 3550MB of virtual memory each. The
>> third guest, as I wrote before, takes 16MB of virtual memory.
>>
>> 3. Once run, the guests reserve huge pages for themselves normally.
>> As mem-prealloc is the default, they grab all the memory they should
>> have, leaving 6 pages unreserved (HugePages_Free - HugePages_Rsvd = 6)
>> at all times - so as I understand it, they should not need to get any
>> more, right?
>>
>> 4. All virtual machines run perfectly normally, without any
>> disturbances, for a few hours. They do not, however, use all their
>> memory, so maybe the issue arises when they pass some kind of a
>> threshold.
>>
>> 5. At some point both guests exhibit CPU load through the roof
>> (16-24). At the same time the host works perfectly well, showing a
>> load of 8, with both kvm processes using the CPU equally and fully.
>> This point in time is unpredictable - it can be anything from one to
>> twenty hours away, but it will be less than a day. Sometimes the load
>> disappears in a moment, but usually it stays like that, and everything
>> works extremely slowly (even a 'ps' command takes some 2-5 minutes to
>> execute).
>>
>> 6. If I am patient, I can start rebooting the guest systems - once
>> they have restarted, everything returns to normal. If I destroy one of
>> the guests (virsh destroy), the other one starts working normally at
>> once (!).
>>
>> I am relatively new to KVM and I am absolutely lost here. I had not
>> experienced such problems before, but recently I upgraded from ubuntu
>> lucid (I think it was Linux 2.6.32, qemu-kvm 0.12.3 and libvirt 0.7.5)
>> and started to use hugepages. These two virtual machines are not
>> normally run on the same host system (I have a corosync/pacemaker
>> cluster with DRBD storage), but when one of the hosts is not
>> available, they end up running on the same host. That is the reason I
>> had not noticed this earlier.
>>
>> Unfortunately, I don't have any spare hardware to experiment with,
>> and this is a production system, so my debugging options are rather
>> limited.
>>
>> Do you have any ideas what could be wrong?
>
> Is there swapping activity on the host when this happens?
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html