Hello,

Definitely seeing about 20% overhead with Hammer as well, so it is not
version specific from where I'm standing.

While non-RBD storage VMs by and large tend to stay closer to the specified
size, I've seen them exceed it by a few % at times, too. For example a
4317968KB RSS one that ought to be 4GB.

Regards,

Christian

On Thu, 27 Apr 2017 09:56:48 +0200 nick wrote:

> Hi,
> we are running a jewel ceph cluster which serves RBD volumes for our KVM
> virtual machines. Recently we noticed that our KVM machines use a lot more
> memory on the physical host system than they should. We collect the data
> with a python script which basically executes 'virsh dommemstat <virtual
> machine name>'. We also verified the results of the script with the memory
> stats of 'cat /proc/<kvm PID>/status' for each virtual machine and the
> results are the same.
>
> Here is an excerpt for one physical host where all virtual machines have
> been running since yesterday (virtual machine names removed):
>
> """
> overhead     actual     percent_overhead   rss
> ----------   --------   ----------------   --------
> 423.8 MiB    2.0 GiB    20                 2.4 GiB
> 460.1 MiB    4.0 GiB    11                 4.4 GiB
> 471.5 MiB    1.0 GiB    46                 1.5 GiB
> 472.6 MiB    4.0 GiB    11                 4.5 GiB
> 681.9 MiB    8.0 GiB    8                  8.7 GiB
> 156.1 MiB    1.0 GiB    15                 1.2 GiB
> 278.6 MiB    1.0 GiB    27                 1.3 GiB
> 290.4 MiB    1.0 GiB    28                 1.3 GiB
> 291.5 MiB    1.0 GiB    28                 1.3 GiB
> 0.0 MiB      16.0 GiB   0                  13.7 GiB
> 294.7 MiB    1.0 GiB    28                 1.3 GiB
> 135.6 MiB    1.0 GiB    13                 1.1 GiB
> 0.0 MiB      2.0 GiB    0                  1.4 GiB
> 1.5 GiB      4.0 GiB    37                 5.5 GiB
> """
>
> We are using the rbd client cache for our virtual machines, but it is set to
> only 128MB per machine. There is also only one rbd volume per virtual
> machine. We have seen more than 200% memory overhead per KVM machine on
> other physical machines. After a live migration of the virtual machine to
> another host the overhead is back to 0 and slowly climbs back to high
> values.
>
> Here are our ceph.conf settings for the clients:
> """
> [client]
> rbd cache writethrough until flush = False
> rbd cache max dirty = 100663296
> rbd cache size = 134217728
> rbd cache target dirty = 50331648
> """
>
> We have noticed this behavior since we started using the jewel librbd
> libraries. We did not encounter it with the ceph infernalis librbd version.
> We also do not see this issue when using local storage instead of ceph.
>
> Some version information of the physical host which runs the KVM machines:
> """
> OS: Ubuntu 16.04
> kernel: 4.4.0-75-generic
> librbd: 10.2.7-1xenial
> """
>
> We did try to flush and invalidate the client cache via the ceph admin
> socket, but this did not change any memory usage values.
>
> Does anyone encounter similar issues or have an explanation for the high
> memory overhead?
>
> Best Regards
> Sebastian

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
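
For reference, a minimal sketch of a collection script along the lines
described above might look like the following. It relies only on the
'actual' and 'rss' fields that 'virsh dommemstat' prints (both in KiB); the
column layout, domain-name handling and helper names are illustrative
assumptions, not the poster's original script.

"""
#!/usr/bin/env python
# Minimal sketch: parse 'virsh dommemstat <domain>' and report the gap
# between the balloon target ('actual') and the host-side RSS of the qemu
# process. Illustration only, not the original poster's script; domain
# names are taken from the command line.
import subprocess
import sys


def dommemstat(domain):
    # Return the numeric fields of 'virsh dommemstat <domain>' as a dict
    # of KiB values (e.g. {'actual': 2097152, 'rss': 2514376, ...}).
    out = subprocess.check_output(['virsh', 'dommemstat', domain],
                                  universal_newlines=True)
    stats = {}
    for line in out.splitlines():
        key, _, value = line.partition(' ')
        if value.strip().isdigit():
            stats[key] = int(value.strip())
    return stats


def mib(kib):
    return '%.1f MiB' % (kib / 1024.0)


def main():
    row = '%-20s %12s %12s %12s %18s'
    print(row % ('domain', 'actual', 'rss', 'overhead', 'percent_overhead'))
    for domain in sys.argv[1:]:
        stats = dommemstat(domain)
        actual = stats.get('actual', 0)   # balloon size in KiB
        rss = stats.get('rss', 0)         # RSS of the qemu process in KiB
        overhead = max(rss - actual, 0)
        percent = 100 * overhead // actual if actual else 0
        print(row % (domain, mib(actual), mib(rss), mib(overhead), percent))


if __name__ == '__main__':
    main()
"""

Run against one or more domain names (e.g. './memstat.py vm1 vm2', with a
hypothetical file name), this produces a table similar to the one quoted
above; per the poster's verification, the rss value should match VmRSS in
/proc/<kvm PID>/status for the corresponding qemu process.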