Hi,

I'm using a 1 GBit network: 16 OSDs on 8 hosts, with XFS and journals on SSD. I have a read performance problem in a libvirt kvm/qemu/rbd VM on a Ceph client host. All involved hosts run Ubuntu 13.10; Ceph is 0.72.2.

The only VM disk is an rbd volume:

  <disk type='network' device='disk'>
    <driver name='qemu' cache='writeback'/>
    <auth username='openstack-control'>
      <secret type='ceph' uuid='30f43440-127d-4bf6-ae61-3c0ef03fff39'/>
    </auth>
    <source protocol='rbd' name='openstack-control/multius'>
      <host name='10.37.124.11' port='6789'/>
      <host name='10.37.124.12' port='6789'/>
      <host name='10.37.124.13' port='6789'/>
    </source>
    <target dev='hdb' bus='ide'/>
    <alias name='ide0-0-1'/>
    <address type='drive' controller='0' bus='0' target='0' unit='1'/>
  </disk>

The VM uses an ext4 filesystem. Write tests inside the VM show good performance:

  dd if=/dev/zero of=zerofile-2 bs=4M count=2048
  2048+0 records in
  2048+0 records out
  8589934592 bytes (8.6 GB) copied, 78.7491 s, 109 MB/s

But reading the same file is much slower:

  dd if=zerofile-2 of=/dev/null bs=4M count=2048
  2048+0 records in
  2048+0 records out
  8589934592 bytes (8.6 GB) copied, 483.018 s, 17.8 MB/s

bonnie++ shows the following:

  Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
  Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
  Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
  multius          8G  1242  95 105787  27 12331   3  3812  66 23231   3  1561  79
  Latency             11801us     851ms    2578ms     239ms     368ms   24062us
  Version  1.97       ------Sequential Create------ --------Random Create--------
  multius             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
                files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                   16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
  Latency             16155us     397us     367us      57us      29us     133us

  1.97,1.97,multius,1,1389334826,8G,,1242,95,105787,27,12331,3,3812,66,23231,3,1561,79,16,,,,,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,11801us,851ms,2578ms,239ms,368ms,24062us,16155us,397us,367us,57us,29us,133us

But on the host, rados bench against another rados pool shows good write performance:

  rados bench -p u124-13 --id u124-13 --keyring=/etc/ceph/client.u124-13.keyring 30 write --no-cleanup
  Total writes made:      850
  Write size:             4194304
  Bandwidth (MB/sec):     111.813
  Stddev Bandwidth:       21.4101
  Max bandwidth (MB/sec): 124
  Min bandwidth (MB/sec): 0
  Average Latency:        0.571974
  Stddev Latency:         0.260423
  Max latency:            2.63825
  Min latency:            0.144089

and reading is not bad either:

  rados bench -p u124-13 --id u124-13 --keyring=/etc/ceph/client.u124-13.keyring 30 seq
  Total reads made:     819
  Read size:            4194304
  Bandwidth (MB/sec):   106.656
  Average Latency:      0.598815
  Max latency:          2.25945
  Min latency:          0.190397

In my ceph.conf the rbd cache should be enabled:

  [client.openstack-control]
      rbd cache = true
      rbd cache size = 1073741824
      rbd cache max dirty = 536870912
      rbd default format = 2
      admin socket = /var/run/ceph/rbd-$pid.asok
      rbd cache writethrough until flush = true

I guess I have misunderstood some configuration options. Has anybody seen similar performance problems?

Regards,
Steffen Thorhauer

--
Steffen Thorhauer
email: sth@xxxxxxxxxxxxxxxxxxxxxxx
url: http://wwwiti.cs.uni-magdeburg.de/~thorhaue
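
P.S. To double-check that these rbd cache settings actually reach the qemu process, I would query the admin socket I configured above. This is only a sketch: the pid in the socket name belongs to whichever qemu process librbd created it for, and it assumes the socket really gets created (the qemu user needs write access to /var/run/ceph).

  # list the sockets librbd has created (the pid in the name varies)
  ls /var/run/ceph/rbd-*.asok
  # show the effective rbd cache settings for that client
  ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok config show | grep rbd_cache
  # dump the perf counters to see whether the cache is used at all
  ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok perf dump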
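
The dd read number may also be skewed by the guest page cache right after the big write, so a re-run that bypasses the guest cache would show whether the rbd read path itself is slow. This is plain Linux/dd, nothing Ceph-specific, and the file name is just the one from my test above:

  # drop the guest page cache so dd really has to read from the rbd volume
  sync
  echo 3 > /proc/sys/vm/drop_caches
  # read the same file again, bypassing the page cache
  dd if=zerofile-2 of=/dev/null bs=4M count=2048 iflag=direct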
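
Since sequential reads inside a VM also depend on the guest read-ahead, I would compare the read-ahead setting of the rbd-backed disk in the guest and repeat the dd test with a larger value. The device name inside the guest may differ from the hdb in the libvirt XML, and 4096 sectors (2 MB) is only an example value:

  # current read-ahead in 512-byte sectors
  blockdev --getra /dev/hdb
  # try a larger read-ahead, e.g. 2 MB, then repeat the read test
  blockdev --setra 4096 /dev/hdb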