Apart from what Christian said...

On Thu, Aug 18, 2016 at 12:03 PM, lewis.george@xxxxxxxxxxxxx
<lewis.george@xxxxxxxxxxxxx> wrote:
> Hi,
> So, I have really been trying to find information about this without
> annoying the list, but I just can't seem to get any clear picture of it.
> I was going to try to search the mailing list archive, but it seems there
> is an error when trying to search it right now (posting below, and sending
> to the listed address in error).
>
> I have been working for a couple of months now (slowly) on testing out
> Ceph. I only have a small PoC setup. I have 6 hosts, but I am only using
> 3 of them in the cluster at the moment. They each have 6 SSDs (only 5
> usable by Ceph), but the networks (1 public, 1 cluster) are only 1Gbps.
> I have the MONs running on the same 3 hosts, and I have an OSD process
> running for each of the 5 disks per host. The cluster shows in good
> health, with 15 OSDs. I have one pool there, the default rbd, which I
> set up with 512 PGs.
>
> I have created an rbd image on the pool, and I have it mapped and mounted
> on another client host. When doing write tests, like with 'dd', I am
> getting rather spotty performance. Not only is it up and down, but even
> when it is up, the performance isn't that great. On largeish (4GB
> sequential) writes, it averages about 65MB/s, and on repeated smaller
> (40MB) sequential writes, it jumps around between 20MB/s and 80MB/s.

Unless you changed the defaults, you are getting 3 copies of your data.
The client's network can do 125MB/s to each individual OSD, but the OSD
needs to write out twice as much data as it takes in for writes. 125MB/s
(on the outgoing cluster network) divided by 2 is 62.5MB/s, and I rather
imagine this is a big chunk of what you're seeing. Especially since your
cluster may be doing scrubs and other work that uses up network bandwidth,
which will be more noticeable when it collides with a smaller write.
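The replication arithmetic above can be sketched quickly. This is a minimal, illustrative snippet (assuming the default size=3 replication and ~125MB/s of usable bandwidth on a 1Gbps link; the function name is made up for this example):

```python
def expected_write_throughput(link_mb_s: float, pool_size: int) -> float:
    """Rough ceiling on client write throughput when the primary OSD
    must forward (pool_size - 1) replica copies over a cluster network
    of the same speed as the public network."""
    # The primary receives each write once but sends it out
    # (pool_size - 1) times, so the outgoing cluster link is the cap.
    replica_copies = pool_size - 1
    return link_mb_s / replica_copies

if __name__ == "__main__":
    # 1Gbps link (~125 MB/s), default 3-way replication.
    print(expected_write_throughput(125.0, 3))  # 62.5
```

That 62.5MB/s ceiling lines up with the ~65MB/s average reported for the large sequential writes, before accounting for scrub and other background traffic.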
-Greg

> However, with read tests, I am able to completely max out the network
> there, easily reaching 125MB/s. Tests on the disks directly are able to
> get up to 550MB/s reads and 350MB/s writes. So, I know it isn't a problem
> with the disks.
>
> I guess my question is, are there any additional optimizations or tuning
> I should review here? I have read over all the docs, but I don't know
> which, if any, of the values would need tweaking. Also, I am not sure if
> this is just how it is with Ceph, given the need to write multiple copies
> of each object. Is the slower write performance (averaging ~1/2 of the
> network throughput) to be expected? I haven't seen any clear answer on
> that in the docs or in articles I have found around. So, I am not sure
> if my expectation is just wrong.
>
> Anyway, some basic ideas on those concepts or some pointers to some good
> docs or articles would be wonderful. Thank you!
>
> Lewis George
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com