Hello!
I'm observing very high memory consumption on a client under a write-intensive load with qemu 1.6.0 + librbd 0.67.3.
For benchmarking purposes I'm trying to run 15 VMs with 3 GiB of RAM each simultaneously on one host. Each VM uses an RBD image cloned from a protected snapshot of a "master image". After each VM boots, "rpm -ihv" with a bunch of really large RPMs (~8 GiB of unpacked small files) is started automatically. Here is the relevant part of the libvirt XML for one of these VMs:
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <source protocol='rbd' name='storage-benchmark-vms/vm-image-1:rbd_cache=1:rbd_cache_max_dirty=134217728:rbd_cache_size=268435456:rbd_cache_max_dirty_age=20'>
    <host name='192.168.0.1' port='6789'/>
    <host name='192.168.0.2' port='6789'/>
    <host name='192.168.0.3' port='6789'/>
  </source>
  <target dev='hda' bus='ide'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
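(The options in the source name above configure a 256 MiB librbd cache with up to 128 MiB of dirty data and a 20 s max dirty age.) For completeness, each per-VM image was cloned from the master roughly like this; the master image and snapshot names below are placeholders, only the pool and the clone name match the XML above, and the master has to be a format 2 image for cloning to work:
# placeholder names; the real master image and snapshot are called something else
rbd snap create storage-benchmark-vms/master-image@base
rbd snap protect storage-benchmark-vms/master-image@base
rbd clone storage-benchmark-vms/master-image@base storage-benchmark-vms/vm-image-1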
Some time after startup I can see unexpected growth in the memory consumption of the qemu-kvm processes:
5565 qemu 20 0 9091m 7.3g 10m S 2.6 7.7 4:41.31 qemu-kvm
5416 qemu 20 0 8059m 6.4g 10m S 27.8 6.8 4:40.93 qemu-kvm
5490 qemu 20 0 6723m 5.3g 10m S 26.2 5.6 4:30.51 qemu-kvm
5591 qemu 20 0 6475m 5.1g 10m S 39.1 5.3 4:35.68 qemu-kvm
5390 qemu 20 0 6227m 4.9g 10m S 2.0 5.1 4:26.42 qemu-kvm
5615 qemu 20 0 6203m 4.8g 10m S 27.5 5.1 4:34.56 qemu-kvm
5692 qemu 20 0 6171m 4.8g 10m S 17.5 5.1 4:28.95 qemu-kvm
5666 qemu 20 0 6163m 4.8g 10m S 2.0 5.1 4:29.66 qemu-kvm
5740 qemu 20 0 6139m 4.8g 10m S 23.2 5.1 4:39.22 qemu-kvm
5716 qemu 20 0 5899m 4.6g 10m S 20.2 4.8 4:30.84 qemu-kvm
5539 qemu 20 0 5827m 4.5g 10m S 1.7 4.8 4:27.02 qemu-kvm
5515 qemu 20 0 5651m 4.4g 10m S 4.6 4.7 4:25.20 qemu-kvm
5640 qemu 20 0 5603m 4.3g 10m S 6.6 4.6 4:28.90 qemu-kvm
5442 qemu 20 0 5373m 4.1g 10m S 2.3 4.4 4:28.45 qemu-kvm
5466 qemu 20 0 5387m 4.1g 10m S 41.7 4.3 4:41.00 qemu-kvm
It can grow even further:
5565 qemu 20 0 22.6g 18g 2772 S 2.6 20.0 6:07.40 qemu-kvm
And then, at some point, some of that memory is freed again:
5565 qemu 20 0 8011m 6.0g 2796 S 2.3 6.3 6:23.10 qemu-kvm
I tried reducing the cache size to the defaults, as suggested on #ceph (replacing "rbd_cache=1:rbd_cache_max_dirty=134217728:rbd_cache_size=268435456:rbd_cache_max_dirty_age=20" with just "rbd_cache=1"), but it didn't help much:
15297 qemu 20 0 7747m 6.1g 10m S 1.0 6.4 4:47.26 qemu-kvm
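After that change the source element is simply:
  <source protocol='rbd' name='storage-benchmark-vms/vm-image-1:rbd_cache=1'>
(If I read the librbd defaults correctly, that should correspond to roughly a 32 MiB cache with at most 24 MiB of dirty data and a 1 s max dirty age, i.e. much smaller than the values I used above.)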
Then I tried disabling the cache (removing "cache='writeback'" and changing rbd_cache to 0), and memory consumption became normal:
19590 qemu 20 0 4251m 3.0g 10m S 9.2 3.2 3:33.42 qemu-kvm
19526 qemu 20 0 4251m 3.0g 10m S 8.6 3.1 3:22.01 qemu-kvm
19399 qemu 20 0 4251m 3.0g 10m S 9.6 3.1 3:15.01 qemu-kvm
19612 qemu 20 0 4251m 3.0g 10m S 3.0 3.1 4:12.41 qemu-kvm
19568 qemu 20 0 4251m 3.0g 10m S 3.0 3.1 3:32.04 qemu-kvm
19632 qemu 20 0 4251m 3.0g 10m S 7.3 3.1 3:47.57 qemu-kvm
19419 qemu 20 0 4251m 3.0g 10m S 8.9 3.1 3:20.40 qemu-kvm
19484 qemu 20 0 4251m 3.0g 10m S 7.6 3.1 3:30.56 qemu-kvm
19676 qemu 20 0 4251m 3.0g 10m S 4.0 3.1 3:48.99 qemu-kvm
19654 qemu 20 0 4251m 3.0g 10m S 7.3 3.1 3:49.83 qemu-kvm
19464 qemu 20 0 4251m 3.0g 10m S 8.9 3.1 3:45.45 qemu-kvm
19441 qemu 20 0 4251m 3.0g 10m S 7.3 3.1 3:20.58 qemu-kvm
19377 qemu 20 0 4251m 3.0g 10m S 7.9 3.1 3:16.99 qemu-kvm
19548 qemu 20 0 4251m 3.0g 10m S 9.9 3.1 3:33.59 qemu-kvm
19506 qemu 20 0 4251m 3.0g 10m S 7.6 3.1 3:16.94 qemu-kvm
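For reference, with the cache disabled the driver and source lines of the disk definition above become:
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='storage-benchmark-vms/vm-image-1:rbd_cache=0'>
(the rest of the disk element is unchanged).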
I also tried dropping all caches inside one of the VMs to see how the memory usage of qemu-kvm would change:
# pause the running rpm processes so they stop generating new dirty pages
killall -s STOP rpm
# flush dirty data inside the guest
sync
# drop the guest's page cache, dentries and inodes
echo 3 >/proc/sys/vm/drop_caches
But it didn't make any difference outside the VM (except for CPU usage, because of the SIGSTOP).
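(For cross-checking, the host-side RSS of a single qemu-kvm process can also be read directly on the hypervisor, e.g. for the 5565 process from the top output above: grep VmRSS /proc/5565/status.)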
Could this be a bug in librbd or in the qemu rbd driver?