On 03/25/2014 06:30 PM, Andrei Mikhailovsky wrote:
Mark, thanks for your reply.
I've tried increasing the readahead values to 4M (8192 sectors) for all osds as well as all vm block devices. This did not change performance much; the osd read rates are still in the region of 30 MB/s.
All vms are using virtio drivers and I've got the 3.8.0 kernel on the osd servers from the Ubuntu repo.
The osd fragmentation level of zfs is at 8% at the moment; I'm not sure if this should impact the performance by this much. I will defrag it overnight and check tomorrow to see if it makes a difference.
Oh! We haven't done a ton of performance testing on zfs yet. Are you
using the fuse driver or the kernel one? If you are using the kernel
driver, make sure you are setting xattr=sa and have atime=off. Other
than that, you may want to do some tests with blktrace running and see
what kind of IO behaviour you are seeing on the disks. This is slightly
wild west territory still.
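For reference, the two ZFS properties and the blktrace run suggested above could be applied roughly as follows. This is only a sketch: the dataset name tank/osd0 and the disk /dev/sdb are placeholders, not names from this thread, and the script only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Placeholder names (assumptions): substitute the real OSD dataset and disk.
DATASET="tank/osd0"
DISK="/dev/sdb"

# Print the ZFS commands instead of running them, so nothing changes by accident.
zfs_tuning_cmds() {
    dataset=$1
    # Store xattrs as system attributes rather than in hidden directories.
    echo "zfs set xattr=sa $dataset"
    # Stop updating access times on every read.
    echo "zfs set atime=off $dataset"
    # Read both properties back to confirm the settings.
    echo "zfs get xattr,atime $dataset"
}

# Print a blktrace pipeline that traces one disk and decodes it live.
blktrace_cmd() {
    disk=$1
    echo "blktrace -d $disk -o - | blkparse -i -"
}

zfs_tuning_cmds "$DATASET"
blktrace_cmd "$DISK"
```

Running the blktrace pipeline while one of the dd read tests is in progress should show whether the disks are seeing large sequential requests or many small ones.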
Cheers
Andrei
----- Original Message -----
From: "Mark Nelson" <mark.nelson@xxxxxxxxxxx>
To: ceph-users@xxxxxxxxxxxxxx
Sent: Tuesday, 25 March, 2014 12:27:21 PM
Subject: Re: rbd + qemu osd performance
On 03/25/2014 07:19 AM, Andrei Mikhailovsky wrote:
Hello guys,
Was hoping someone could help me with strange read performance problems
on osds. I have a test setup of 4 kvm host servers which are running
about 20 test Linux vms between them. The vms' images are stored in a ceph
cluster and accessed via rbd. I also have 2 osd servers with a replica count of
2. The physical specs are at the end of the email.
I've been running some tests to check concurrent read/write performance
from all vms.
I simply fired the following tests concurrently on each vm:
"dd if=/dev/vda of=/dev/null bs=1M count=1000 iflag=direct" and after that
"dd if=/dev/vda of=/dev/null bs=4M count=1000 iflag=direct"
While the tests were running I fired up iostat to monitor osd
performance and load. I noticed that during the read tests each osd
was reading only about 25-30MB/s, which seems very poor to me. The osd
devices are capable of reading around 150-160MB/s when I run dd directly on
them. I am aware that osds lose about 40-50% of their throughput, so I was
expecting to see each osd doing around 70-80MB/s during my tests. So,
what could be the problem here?
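As a quick sanity check on the numbers above, the expected range can be worked out from the quoted figures. This is just integer arithmetic on values from the message (150-160MB/s raw, 40-50% overhead, 25-30MB/s observed); the helper names are made up for illustration.

```shell
#!/bin/sh
# Keep a given percentage of a throughput figure (integer MB/s).
pct_of() {
    # $1 = value in MB/s, $2 = percent retained
    echo $(( $1 * $2 / 100 ))
}

# Express an observed rate as a whole percent of the raw rate.
share() {
    # $1 = observed MB/s, $2 = raw MB/s
    echo $(( $1 * 100 / $2 ))
}

echo "expected low:  $(pct_of 150 50) MB/s"   # 150 MB/s raw, 50% lost
echo "expected high: $(pct_of 160 60) MB/s"   # 160 MB/s raw, 40% lost
echo "observed share: $(share 25 160)% to $(share 30 150)% of raw"
```

That works out to roughly 75 to 96 MB/s expected per osd, while the observed 25-30MB/s is only about 15-20% of what the disks deliver to a direct dd.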
I'd first try increasing read ahead on the OSD disks from 128KB to
something much larger. Anywhere from 512KB to 4MB may be appropriate.
Setting it too large can potentially hurt small random reads, but you'll
just have to try it out. You may also want to play with readahead on
the RBD devices.
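A rough sketch of how the readahead sizes above translate into settings: blockdev --setra takes its value in 512-byte sectors, while /sys/block/<dev>/queue/read_ahead_kb takes KiB. The device names sdb and sdc are hypothetical, and the script only prints the blockdev commands rather than running them.

```shell
#!/bin/sh
# Convert a readahead size in KiB to 512-byte sectors (the unit --setra expects).
kb_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

# Print one blockdev command per device for the requested readahead in KiB.
setra_cmds() {
    kb=$1
    shift
    for dev in "$@"; do
        echo "blockdev --setra $(kb_to_sectors "$kb") /dev/$dev"
    done
}

# The 128KB default, then the 512KB and 4MB ends of the suggested range.
for kb in 128 512 4096; do
    echo "$kb KiB = $(kb_to_sectors "$kb") sectors"
done

# Hypothetical device names (assumption); replace with the real OSD disks.
setra_cmds 4096 sdb sdc
```

A 4MB readahead comes out as 8192 sectors, which matches the 4M (8192) value reported elsewhere in this thread. The current setting can be read back with blockdev --getra, and a value set this way does not survive a reboot, so it needs reapplying at boot if it turns out to help.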
Also, make sure you are on kernel 3.8+, and that you are using a virtio
driver. Since these are sequential read tests, another thing to watch
out for is underlying filesystem fragmentation. That can really wreak
havoc. We've especially seen bad behaviour with btrfs when used with
RBD for sequential reads.
I've checked the networking for errors and I can't see any packet drops
or other problems with connectivity. I've also run various networking
tests for several days, bashing the servers with a high volume of traffic, and
I've not had any problems/packet drops/disconnects.
My infrastructure setup is as follows:
1. Software: Ubuntu 12.04.4 with Ceph version 0.72.2, Qemu 1.5 from
Ubuntu Cloud repo
2. OSD servers: 2xIntel E5-2620 with 12 cores in total, 24GB RAM and 8
SAS 7k rpm disks each
3. Networking: QDR 40 Gbit/s InfiniBand using IP over InfiniBand.
Many thanks
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com