We might also be able to help you improve or better understand your results if you can tell us exactly what tests you're conducting that are giving you these numbers.
-Greg

On Wed, Jul 22, 2015 at 4:44 AM, Florent MONTHEL <fmonthel@xxxxxxxxxxxxx> wrote:
> Hi Frederic,
>
> When you have a Ceph cluster with 1 node, you don't experience the network and communication overhead of the distributed model.
> With 2 nodes and EC 4+1 you will have communication between the 2 nodes, but you will keep some internal communication (2 chunks on the first node and 3 chunks on the second node).
> On your configuration the EC pool is set up with 4+1, so every write carries overhead because it is spread over 5 nodes (for 1 client IO, you will see 5 Ceph IOs due to EC 4+1).
> That's why I think you're reaching performance stability with 5 nodes or more in your cluster.
>
>
> On Jul 20, 2015, at 10:35 AM, SCHAER Frederic <frederic.schaer@xxxxxx> wrote:
>
> Hi,
>
> As I explained in various previous threads, I'm having a hard time getting the most out of my test Ceph cluster.
> I'm benching things with rados bench.
> All Ceph hosts are on the same 10Gbit/s switch.
>
> Basically, I know I can get about 1GB/s of disk write performance per host when I bench with dd (hundreds of dd threads) plus iperf 10Gbit inbound plus iperf 10Gbit outbound.
> I can also get 2GB/s or even more if I don't bench the network at the same time, so yes, there is a bottleneck between disks and network, but I can't identify which one, and it's not relevant for what follows anyway
> (Dell R510 + MD1200 + PERC H700 + PERC H800 here, if anyone has hints about this strange bottleneck though…).
>
> My hosts are each connected through a single 10Gbit/s link for now.
>
> My problem is the following. Please note I see the same kind of poor performance with replicated pools...
> When testing EC pools, I ended up putting a 4+1 pool on a single node in order to track down the Ceph bottleneck.
> On that node, I can get approximately 420MB/s write performance using rados bench, but that's fair enough since the dstat output shows that the real data throughput on the disks is about 800+MB/s (that's the Ceph journal effect, I presume).
>
> I tested Ceph on my other standalone nodes: I also get around 420MB/s, since they're identical.
> I'm testing things with 5 10Gbit/s clients, each running rados bench.
>
> But what I really don't get is the following:
>
> - With 1 host: throughput is 420MB/s.
> - With 2 hosts: I get 640MB/s. That's surely not 2 x 420MB/s.
> - With 5 hosts: I get around 1375MB/s. That's far from the expected 2GB/s.
>
> The network is never maxed out, nor are the disks or CPUs.
> The host throughput I see with rados bench seems to match the dstat throughput.
> It's as if each additional host were only capable of adding 220MB/s of throughput. Compare this to the 1GB/s they are capable of (420MB/s with journals)…
>
> I'm therefore wondering: what could possibly be so wrong with my setup?
> Why would adding hosts impact performance so much?
>
> On the hardware side, I have Broadcom BCM57711 10-Gigabit PCIe cards.
> I know, not perfect, but not THAT bad either…?
>
> Any hint would be greatly appreciated!
>
> Thanks
> Frederic Schaer
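
As a rough sanity check of the overhead Florent describes, the sketch below (Python) models the write amplification for an EC k+m pool with a co-located filestore journal: each client byte becomes (k+m)/k bytes of chunk data, and the journal roughly doubles what hits the disks. The journal factor of 2, the ~1000 MB/s per-host disk figure and the function names are assumptions for illustration; only the observed throughputs are Frederic's numbers from the thread.

def raw_bytes_per_client_byte(k=4, m=1, journal_factor=2.0):
    # On-disk bytes written per byte of client data: EC spreads each object
    # over k data + m coding chunks, and a co-located filestore journal
    # (assumed here) writes each chunk twice, journal then data partition.
    return (k + m) / k * journal_factor          # 4+1 with journal -> 2.5x

def expected_aggregate_mb_s(n_hosts, disk_bw_mb=1000, k=4, m=1):
    # Ideal rados bench throughput if per-host raw disk bandwidth
    # (disk_bw_mb, MB/s) were the only limit and scaling were linear.
    return n_hosts * disk_bw_mb / raw_bytes_per_client_byte(k, m)

observed = {1: 420, 2: 640, 5: 1375}             # MB/s, figures from the thread
for n, got in sorted(observed.items()):
    want = expected_aggregate_mb_s(n)
    print(f"{n} host(s): ~{want:.0f} MB/s expected vs {got} MB/s observed "
          f"({got / want:.0%} of ideal)")

Under these assumptions a single host tops out around 400 MB/s of client writes, close to the 420 MB/s Frederic measures; the shortfall only appears once writes fan out across hosts (about 80% of ideal at 2 hosts and roughly 69% at 5), which is exactly the scaling gap the thread is trying to explain.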