Re: rbd + qemu osd performance

Guys, sorry for the confusion. I meant to say that I am using XFS, _NOT_ ZFS. The rest of the information is correct.

Cheers
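Since the OSDs are on XFS, the 8% fragmentation and the overnight defrag mentioned further down the thread could be checked and handled roughly as sketched below. The partition /dev/sdb1 and the mount point /var/lib/ceph/osd/ceph-0 are assumptions used only for illustration, not details from the thread:

  # Report the fragmentation factor of an (assumed) OSD data partition, read-only.
  xfs_db -c frag -r /dev/sdb1

  # Defragment files on the (assumed) mounted OSD filesystem; -v prints per-file progress.
  xfs_fsr -v /var/lib/ceph/osd/ceph-0

xfs_fsr can run against a live, mounted filesystem, which is why an overnight run is workable.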
----- Original Message -----
From: "Mark Nelson" <mark.nelson@xxxxxxxxxxx>
To: "Andrei Mikhailovsky" <andrei@xxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Wednesday, 26 March, 2014 1:39:37 AM
Subject: Re:  rbd + qemu osd performance

On 03/25/2014 06:30 PM, Andrei Mikhailovsky wrote:
> Mark, thanks for your reply.
>
> I've tried increasing the readahead to 4M (8192 sectors) on all OSD disks as well as all VM block devices. This did not change performance much; per-OSD reads are still in the region of 30 MB/s.
>
> All VMs are using virtio drivers, and the OSD servers run the 3.8.0 kernel from the Ubuntu repo.
>
> The OSD fragmentation level of ZFS is at 8% at the moment; I'm not sure whether that alone should hurt performance this much. I will defrag it overnight and check tomorrow to see if it makes a difference.

Oh!  We haven't done a ton of performance testing on ZFS yet.  Are you
using the FUSE driver or the kernel one?  If you are using the kernel
driver, make sure you are setting xattr=sa and have atime=off.  Other
than that, you may want to do some tests with blktrace running and see
what kind of IO behaviour you are seeing on the disks.  This is still
somewhat wild-west territory.
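A minimal sketch of both suggestions, assuming a hypothetical ZFS dataset name (tank/ceph-osd0) and OSD disk (/dev/sdb). As clarified at the top of this message the OSDs are actually on XFS, so the zfs lines would only apply if ZFS were in use:

  # Store xattrs in inodes and stop atime updates on the (assumed) OSD dataset.
  zfs set xattr=sa tank/ceph-osd0
  zfs set atime=off tank/ceph-osd0

  # Trace block-level IO on the (assumed) OSD disk while the dd tests run,
  # piping the raw trace through blkparse to see request sizes and seek patterns.
  blktrace -d /dev/sdb -o - | blkparse -i -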

>
> Cheers
>
> Andrei
>
> ----- Original Message -----
> From: "Mark Nelson" <mark.nelson@xxxxxxxxxxx>
> To: ceph-users@xxxxxxxxxxxxxx
> Sent: Tuesday, 25 March, 2014 12:27:21 PM
> Subject: Re:  rbd + qemu osd performance
>
> On 03/25/2014 07:19 AM, Andrei Mikhailovsky wrote:
>> Hello guys,
>>
>> I was hoping someone could help me with strange read performance problems
>> on OSDs. I have a test setup of 4 KVM host servers which are running
>> about 20 test Linux VMs between them. The VMs' images are stored in a Ceph
>> cluster and accessed via RBD. I also have 2 OSD servers with a replication
>> factor of 2. The physical specs are at the end of the email.
>>
>> I've been running some tests to check concurrent read/write performance
>> from all the VMs.
>>
>> I simply fired the following tests concurrently on each VM:
>>
>> "dd if=/dev/vda of=/dev/null bs=1M count=1000 iflag=direct" and after that
>> "dd if=/dev/vda of=/dev/null bs=4M count=1000 iflag=direct"
>>
>>
>> While the tests were running I fired up iostat to monitor OSD
>> performance and load. I noticed that during the read tests each OSD
>> was reading only about 25-30 MB/s, which seems very poor to me. The OSD
>> devices are capable of reading around 150-160 MB/s when I run dd on
>> them directly. I am aware that OSDs "lose" about 40-50% of raw throughput, so I was
>> expecting to see each OSD doing around 70-80 MB/s during my tests. So,
>> what could be the problem here?
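For reference, a rough sketch of how the per-OSD numbers and the raw-disk baseline above might be gathered; the device names /dev/sd[b-i] are assumptions, not details from the thread:

  # Extended per-device stats in MB/s, refreshed every 2 seconds, for the (assumed) OSD disks.
  iostat -xm /dev/sd[b-i] 2

  # Raw sequential-read baseline on one (assumed) OSD disk, bypassing the page cache.
  dd if=/dev/sdb of=/dev/null bs=4M count=1000 iflag=direct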
>
> I'd first try increasing read ahead on the OSD disks from 128KB to
> something much larger.  Anywhere from 512KB to 4MB may be appropriate.
> Setting it too large can potentially hurt small random reads, but you'll
> just have to try it out.  You may also want to play with readahead on
> the RBD devices.
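A minimal sketch of what those readahead changes could look like, assuming hypothetical device names (/dev/sdb for an OSD disk on the host, /dev/vda for the virtio disk inside a guest):

  # On an OSD host: set readahead to 4 MB (8192 x 512-byte sectors) on an (assumed) OSD disk.
  blockdev --setra 8192 /dev/sdb

  # The same thing via sysfs, expressed in KB.
  echo 4096 > /sys/block/sdb/queue/read_ahead_kb

  # Inside a guest: raise readahead on the (assumed) virtio device backed by RBD.
  echo 4096 > /sys/block/vda/queue/read_ahead_kb

These values do not survive a reboot, so once a good setting is found it would normally go into a udev rule or an init script.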
>
> Also, make sure you are on kernel 3.8+, and that you are using a virtio
> driver.  Since these are sequential read tests, another thing to watch
> out for is underlying filesystem fragmentation.  That can really wreak
> havoc.  We've especially seen bad behaviour with btrfs when used with
> RBD for sequential reads.
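A quick sketch of checking both points, assuming a libvirt-managed guest with a hypothetical domain name vm01:

  # Kernel version on the OSD hosts / hypervisors (should be 3.8 or newer here).
  uname -r

  # Confirm the guest's disks are attached with the virtio bus.
  virsh dumpxml vm01 | grep "bus="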
>
>>
>> I've checked the networking for errors and I can't see any packet drops
>> or other problems with connectivity. I've also run various networking
>> tests for several days, hammering the servers with a high volume of traffic, and
>> I've not had any problems/packet drops/disconnects.
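For reference, a minimal sketch of the kind of sustained-throughput test that could back this up over the IPoIB link; iperf and the host name osd2 are assumptions, not details given in the thread:

  # On one OSD server, start a listener.
  iperf -s

  # From the other server, push 4 parallel streams for 60 seconds.
  iperf -c osd2 -P 4 -t 60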
>>
>> My infrastructure setup is as follows:
>>
>> 1. Software: Ubuntu 12.04.4 with Ceph version 0.72.2, Qemu 1.5 from the
>> Ubuntu Cloud repo
>> 2. OSD servers: 2x Intel E5-2620 (12 cores in total), 24GB RAM and 8
>> SAS 7k rpm disks each
>> 3. Networking: QDR 40Gbit/s InfiniBand using IP over InfiniBand (IPoIB).
>>
>> Many thanks
>>
>>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



