On 07/23/2014 03:54 AM, Andrei Mikhailovsky wrote:
> Riccardo,
>
> Thought to share my testing results.
>
> I've been using IPoIB with ceph for quite some time now. I've got QDR
> osd/mon/client servers to serve rbd images to kvm hypervisors. I've done
> some performance testing using both rados and guest vm benchmarks while
> running the last three stable versions of ceph.
>
> My conclusion was that ceph itself needs to mature and/or be optimised
> in order to utilise the capabilities of the infiniband link. In my
> experience, I was not able to reach the limits of the network speeds
> reported by the network performance monitoring tools. I was struggling
> to push data throughput beyond 1.5GB/s while using between 2 and 64
> concurrent tests. This was the case even when the benchmark was reading
> the same data over and over again, so the data was cached on the osd
> servers and came directly from the servers' ram without any access to
> the osds themselves.
>
> My ipoib network performance tests were showing on average 2.5-3GB/s,
> with peaks reaching 3.3GB/s over ipoib. It would be nice to see how
> ceph performs over rdma ))).
>
> Having said this, perhaps my test gear is somewhat limited or my ceph
> optimisation was not done correctly. I had 2 osd servers with 8 osds
> each and three clients running guest vms and rados benchmarks. None of
> the benchmarks were able to fully utilise the server resources; my osd
> servers were running at about 50% utilisation during the tests.
>
> So, I had to conclude that unless you are running a large cluster with
> some specific data sets that utilise multithreading, you will probably
> not need an infiniband link. Single-thread performance for cold data
> will be limited to about half the speed of a single osd device. So, if
> your osds are doing 150MB/s, do not expect a single thread to be faster
> than 70-80MB/s.
>
> On the other hand, if you utilise high performance gear, like cache
> cards capable of achieving speeds of over a gigabyte per second,
> perhaps an infiniband link might be of use. I am not sure the ceph-osd
> process is capable of "spitting" out that much data, though; you might
> be hitting a CPU bottleneck.

FWIW, when we were testing Ceph with QDR IB at ORNL, we topped out at
around 2GB/s per server node with IPoIB.  This was with a rather
unconventional setup though, with a DDN SFA10K and RAID5 LUNs with lots
of disks per OSD.  On my (more conventional) high performance test box,
I can hit 2GB/s with 24 disks, 8 ssds, and 4 SAS2308 controllers, at
least when streaming 4MB objects in and out of rados.  I suspect for
most people 10GbE will be fast enough for many workloads (though QDR IB
might be cheaper if you know how to implement it!)
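If anyone wants to repeat that kind of concurrency sweep, the rough
sketch below is the sort of thing I mean: it just drives "rados bench"
writes at a few different thread counts and pulls out the bandwidth
line from the summary.  The pool name, runtime, and object size are
placeholders, so adjust them for your own cluster (and don't point it
at a pool you care about).

    #!/usr/bin/env python
    # Rough sketch: run "rados bench" writes at increasing concurrency and
    # print the aggregate bandwidth it reports.  POOL, SECONDS and OBJ_SIZE
    # are placeholders -- adjust them for your own cluster.
    import re
    import subprocess

    POOL = "bench"                    # assumed dedicated test pool
    SECONDS = "60"                    # runtime of each test
    OBJ_SIZE = str(4 * 1024 * 1024)   # 4MB objects, as in the streaming tests above

    def run_write_bench(threads):
        """Run one write benchmark and return the reported bandwidth in MB/s."""
        out = subprocess.check_output(
            ["rados", "bench", "-p", POOL, SECONDS, "write",
             "-b", OBJ_SIZE, "-t", str(threads)])
        match = re.search(r"Bandwidth \(MB/sec\):\s+([0-9.]+)",
                          out.decode("utf-8", "replace"))
        return float(match.group(1)) if match else None

    if __name__ == "__main__":
        for t in (2, 4, 8, 16, 32, 64):   # the same 2-64 range mentioned above
            print("%2d concurrent ops: %s MB/s" % (t, run_write_bench(t)))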
>
> Andrei
>
>
> ------------------------------------------------------------------------
> *From: *"Sage Weil" <sweil at redhat.com>
> *To: *"Riccardo Murri" <riccardo.murri at uzh.ch>
> *Cc: *ceph-users at lists.ceph.com
> *Sent: *Tuesday, 22 July, 2014 9:42:56 PM
> *Subject: *Re: [ceph-users] Ceph and Infiniband
>
> On Tue, 22 Jul 2014, Riccardo Murri wrote:
> > Hello,
> >
> > a few questions on Ceph's current support for Infiniband:
> >
> > (A) Can Ceph use Infiniband's native protocol stack, or must it use
> > IP-over-IB?  Google finds a couple of entries in the Ceph wiki related
> > to native IB support (see [1], [2]), but neither of them seems finished
> > and there is no timeline.
> >
> > [1]: https://wiki.ceph.com/Planning/Blueprints/Emperor/msgr%3A_implement_infiniband_support_via_rsockets
> > [2]: http://wiki.ceph.com/Planning/Blueprints/Giant/Accelio_RDMA_Messenger
>
> This is work in progress.  We hope to get basic support into the tree
> in the next couple of months.
>
> > (B) Can we connect to the same Ceph cluster from Infiniband *and*
> > Ethernet?  Some clients only have Ethernet and will not be upgraded,
> > while others would have QDR Infiniband -- we would like both sets to
> > access the same storage cluster.
>
> This is further out.  Very early refactoring to make this work is in
> wip-addr.
>
> > (C) I found this old thread about Ceph's performance on 10GbE and
> > Infiniband: are the issues reported there still current?
> >
> > http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/6816
>
> No idea! :)
>
> sage
>
> > Thanks for any hint!
> >
> > Riccardo
> >
> > --
> > Riccardo Murri
> > http://www.s3it.uzh.ch/about/team/
> >
> > S3IT: Services and Support for Science IT
> > University of Zurich
> > Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
> > Tel: +41 44 635 4222
> > Fax: +41 44 635 6888
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com