Answering the first RDMA question myself...

On 18.02.2018 at 16:45, Oliver Freyermuth wrote:
> This leaves me with two questions:
> - Is it safe to use RDMA with 12.2.2 already? Reading through this mail
>   archive, I grasped it may lead to memory exhaustion and in any case needs
>   some hacks to the systemd service files.

I tried that on our cluster (the kind of systemd override involved is sketched at the end of this mail). While the cluster ran for a few minutes, I then hit many random disconnects: mons and mgrs disconnecting, OSDs vanishing, and no client able to connect. These are the very same issues described here:
https://tracker.ceph.com/issues/22944
I am also on CentOS 7.4 with ConnectX-3 cards, but I was not using a recent Mellanox OFED, just the InfiniBand stack that ships with CentOS 7.4. Hence, I reverted to IPoIB.

However, I got a significant performance improvement (> 2x) by switching from mode "datagram" with MTU 2044 to mode "connected" with MTU 65520, as outlined e.g. here:
https://wiki.gentoo.org/wiki/InfiniBand#Performance_tuning
(The exact commands are also at the end of this mail.) Total throughput in iperf (send + recv) is now about 30 Gbit/s. Even though this is not "perfect" (the hard drives are a bit bored...), it is sufficient for our use case and runs very stably. I'll try some sysctl tuning in the next days; a first sketch of what I have in mind is below as well.

> - Is it already clear whether RDMA will be part of 12.2.3?
>
> Also, of course, the final question from the last mail:
> "Why is data moved in a k=4 m=2 EC-pool with 6 hosts and failure domain
> "host" after failure of one host?"
> is still open.
>
> Many thanks already, this helped a lot to understand things better!
>
> Cheers,
> Oliver
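P.S. Regarding the "hacks to the systemd service files" from the quoted question: as far as I understand, this is about the memlock limit. RDMA needs to pin memory, and the default LimitMEMLOCK of the Ceph units is far too small for that. A minimal sketch, assuming the OSD units (mons and mgrs would need an equivalent drop-in):

    # /etc/systemd/system/ceph-osd@.service.d/rdma.conf
    # Drop-in override: allow the daemon to lock enough memory for RDMA.
    [Service]
    LimitMEMLOCK=infinity

After that, a "systemctl daemon-reload" and a restart of the daemons is needed.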
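For anyone who wants to reproduce the IPoIB change on CentOS 7, a sketch assuming the interface is named ib0 (interface names, and whether you manage them via ifcfg files, are of course site-specific):

    # Runtime switch (lost on reboot):
    echo connected > /sys/class/net/ib0/mode
    ip link set ib0 mtu 65520

    # Persistent, in /etc/sysconfig/network-scripts/ifcfg-ib0:
    CONNECTED_MODE=yes
    MTU=65520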
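As for the sysctl tuning I still want to try: what I have in mind are the usual TCP buffer knobs, roughly along the lines of the Gentoo wiki linked above. This is untested on our cluster so far, so take the values only as a starting point:

    # /etc/sysctl.d/90-ipoib.conf -- larger TCP buffers for the IPoIB link
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216

Apply with "sysctl --system".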