Hi Jason,

thanks for your feedback. I did some tests over the weekend to verify the
memory overhead. I was using qemu 2.8 (taken from the Ubuntu Cloud Archive)
with librbd 10.2.7 on Ubuntu 16.04 hosts. I suspected the ceph rbd cache to
be the cause of the overhead, so I just generated a lot of IO with the help
of fio in the VMs (with a data size of 80GB). All VMs had 3GB of memory. I
had to run fio multiple times before reaching high RSS values. I also
noticed that the memory overhead in the KVM process increased faster when
using larger block sizes during writes (like 4M).
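For reference, the workload was along these lines (only the 80GB data size
and the 4M block size are the actual values from my tests; the file name,
ioengine and iodepth here are illustrative, not my exact job):

"""
# run inside the VM against the RBD-backed disk, repeated several times
fio --name=overhead-test --filename=/root/fio-test.dat --size=80g \
    --rw=write --bs=4M --direct=1 --ioengine=libaio --iodepth=16
"""

Varying --bs is how I compared the growth rates mentioned above.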
I ran several fio tests (one after another) and the results are:

KVM with writeback RBD cache: max. 85% memory overhead (2.5 GB overhead)
KVM with writethrough RBD cache: max. 50% memory overhead
KVM without RBD caching: less than 10% overhead all the time
KVM with local storage (logical volume used): 8% overhead all the time

I did not reach the >200% memory overhead values that we see on our live
cluster, but those virtual machines have a much longer uptime as well.

I also tried to reduce the RSS memory value by dropping the caches on the
physical host and in the VM. Neither led to any change.
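By "dropping the caches" I mean the usual mechanism below (run as root on
the host and inside the VM; the exact invocation is from memory):

"""
# flush dirty pages, then drop pagecache, dentries and inodes
sync
echo 3 > /proc/sys/vm/drop_caches
"""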
A reboot of the VM also does not change anything (a reboot inside the VM,
not a new KVM process). So far the only way to reduce the RSS memory value
is a live migration.
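That is, something along these lines (domain name and hostname are
placeholders):

"""
# live-migrate the VM to another host; RSS on the target starts near 0 again
virsh migrate --live <vm-name> qemu+ssh://<other-host>/system
"""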
Might this be a bug? The memory overhead seems a bit too high to me.

Best Regards
Sebastian

On Thursday, April 27, 2017 10:08:36 AM you wrote:
> I know we noticed high memory usage due to librados in the Ceph
> multipathd checker [1] -- on the order of hundreds of megabytes. That
> client was probably nearly as trivial as an application can get and I
> just assumed it was due to large monitor maps being sent to the client
> for whatever reason. Since we changed course on our RBD iSCSI
> implementation, unfortunately the investigation into this high memory
> usage fell by the wayside.
>
> [1] http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=libmultipath/checkers/rbd.c;h=9ea0572f2b5bd41b80bf2601137b74f92bdc7278;hb=HEAD
>
> On Thu, Apr 27, 2017 at 5:26 AM, nick <nick@xxxxxxx> wrote:
> > Hi Christian,
> > thanks for your answer.
> > The highest value I can see for a local storage VM in our infrastructure
> > is a memory overhead of 39%. This is big, but the majority (>90%) of our
> > local storage VMs are using less than 10% memory overhead.
> > For ceph storage based VMs this looks quite different. The highest value
> > I can see currently is 244% memory overhead. So that specific VM with
> > 3GB of allocated memory is now using 10.3 GB of RSS memory on the
> > physical host. This is a really huge value. In general I can see that
> > the majority of the ceph based VMs have more than 60% memory overhead.
> >
> > Maybe this is also a bug related to qemu+librbd. It would just be nice
> > to know if other people are seeing those high values as well.
> >
> > Cheers
> > Sebastian
> >
> > On Thursday, April 27, 2017 06:10:48 PM you wrote:
> >> Hello,
> >>
> >> Definitely seeing about 20% overhead with Hammer as well, so not
> >> version specific from where I'm standing.
> >>
> >> While non-RBD storage VMs by and large tend to be closer to the
> >> specified size, I've seen them exceed things by a few % at times, too.
> >> For example a 4317968KB RSS one that ought to be 4GB.
> >>
> >> Regards,
> >>
> >> Christian
> >>
> >> On Thu, 27 Apr 2017 09:56:48 +0200 nick wrote:
> >> > Hi,
> >> > we are running a jewel ceph cluster which serves RBD volumes for our
> >> > KVM virtual machines. Recently we noticed that our KVM machines use
> >> > a lot more memory on the physical host system than they should. We
> >> > collect the data with a python script which basically executes
> >> > 'virsh dommemstat <virtual machine name>'. We also verified the
> >> > results of the script against the memory stats in
> >> > 'cat /proc/<kvm PID>/status' for each virtual machine and the
> >> > results are the same.
> >> >
> >> > Here is an excerpt for one physical host where all virtual machines
> >> > have been running since yesterday (virtual machine names removed):
> >> >
> >> > """
> >> > overhead    actual    percent_overhead  rss
> >> > ----------  --------  ----------------  --------
> >> > 423.8 MiB   2.0 GiB   20                2.4 GiB
> >> > 460.1 MiB   4.0 GiB   11                4.4 GiB
> >> > 471.5 MiB   1.0 GiB   46                1.5 GiB
> >> > 472.6 MiB   4.0 GiB   11                4.5 GiB
> >> > 681.9 MiB   8.0 GiB   8                 8.7 GiB
> >> > 156.1 MiB   1.0 GiB   15                1.2 GiB
> >> > 278.6 MiB   1.0 GiB   27                1.3 GiB
> >> > 290.4 MiB   1.0 GiB   28                1.3 GiB
> >> > 291.5 MiB   1.0 GiB   28                1.3 GiB
> >> > 0.0 MiB     16.0 GiB  0                 13.7 GiB
> >> > 294.7 MiB   1.0 GiB   28                1.3 GiB
> >> > 135.6 MiB   1.0 GiB   13                1.1 GiB
> >> > 0.0 MiB     2.0 GiB   0                 1.4 GiB
> >> > 1.5 GiB     4.0 GiB   37                5.5 GiB
> >> > """
> >> >
> >> > We are using the rbd client cache for our virtual machines, but it
> >> > is set to only 128MB per machine. There is also only one rbd volume
> >> > per virtual machine. We have seen more than 200% memory overhead per
> >> > KVM machine on other physical machines. After a live migration of
> >> > the virtual machine to another host the overhead is back to 0 and
> >> > slowly increases back to high values.
> >> >
> >> > Here are our ceph.conf settings for the clients:
> >> > """
> >> > [client]
> >> > rbd cache writethrough until flush = False
> >> > rbd cache max dirty = 100663296
> >> > rbd cache size = 134217728
> >> > rbd cache target dirty = 50331648
> >> > """
> >> >
> >> > We have noticed this behavior since we started using the jewel
> >> > librbd libraries. We did not encounter it when using the ceph
> >> > infernalis librbd version. We also do not see this issue when using
> >> > local storage instead of ceph.
> >> >
> >> > Some version information of the physical host which runs the KVM
> >> > machines:
> >> > """
> >> > OS: Ubuntu 16.04
> >> > kernel: 4.4.0-75-generic
> >> > librbd: 10.2.7-1xenial
> >> > """
> >> >
> >> > We did try to flush and invalidate the client cache via the ceph
> >> > admin socket, but this did not change any memory usage values.
> >> >
> >> > Does anyone encounter similar issues or have an explanation for the
> >> > high memory overhead?
> >> >
> >> > Best Regards
> >> > Sebastian
> >
> > --
> > Sebastian Nickel
> > Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
> > Tel +41 44 637 40 00 | Support +41 44 637 40 40 | www.nine.ch
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Sebastian Nickel
Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
Tel +41 44 637 40 00 | Support +41 44 637 40 40 | www.nine.ch