Good day!

On Tue, Oct 15, 2019 at 02:29:58PM +0300, vitalif wrote:
> Wow, does it really work?
>
> And why is it not supported by RBD?

I haven't dived into the sources, but that is what the docs state.

> Can you show us the latency graphs before and after and tell the I/O pattern
> to which the latency applies? Previous common knowledge was that RDMA almost
> doesn't affect latency with Ceph, because most of the latency is in Ceph
> itself.

There is a graph here; it shows pure Nautilus before 10-05 and Nautilus+RDMA
after:

https://nc.avalon.org.ua/s/LptPTEaTeTTyKtD

The link expires on Nov 1.

Most of my clients are OpenStack instances with RBD volumes. The cluster
consists of 30 SSD and 10 HDD OSDs; the RBD volumes lie on the SSDs. This
started as an experiment with RDMA, but the results were good enough to keep
testing it for a longer time.

> >> I was wondering what user experience was with using Ceph over RDMA?
> >>  - How did you set it up?
> >
> > We used RoCE LAG with Mellanox ConnectX-4 Lx NICs.
> >
> >>  - Documentation used to set it up?
> >
> > Generally, the Mellanox community docs and the Ceph docs:
> > https://community.mellanox.com/s/article/bring-up-ceph-rdma---developer-s-guide
> >
> >>  - Known issues when using it?
> >
> > Ceph's distribution does not ship systemd units with the
> > LimitMEMLOCK=infinity setting. We also had to start Ceph as root to
> > work around some limits.
> >
> > Ceph RBD clients, as well as mgr daemons, do not support RDMA, so we
> > had to set:
> >
> > ms_cluster_type = async+rdma
> > ms_type = async+rdma
> > ms_public_type = async+posix
> >
> > [mgr]
> > ms_type = async+posix
> >
> > And we needed to disable any jumbo frame support in order to work with
> > RDMA.
> >
> >>  - Do you still use it?
> >
> > As I can see on my graphs, there is a latency drop with Nautilus+RDMA.
> > As of now, the cluster has been up and running for 2 weeks without any
> > issues under our production load (rbd, radosgw, cephfs).
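
A note for anyone reproducing the RoCE LAG part: it rides on a regular Linux
802.3ad bond, and with LAG active the ConnectX-4 Lx exposes the bond as a
single RDMA device. Below is a minimal netplan sketch; the interface names,
addressing, and the use of netplan itself are assumptions on my side, not a
verified recipe:

    # /etc/netplan/01-roce-bond.yaml -- hypothetical sketch; interface
    # names and addresses are placeholders, adjust for your hosts.
    network:
      version: 2
      ethernets:
        enp1s0f0: {}
        enp1s0f1: {}
      bonds:
        bond0:
          interfaces: [enp1s0f0, enp1s0f1]
          parameters:
            mode: 802.3ad
            transmit-hash-policy: layer3+4
          mtu: 1500   # jumbo frames disabled, as mentioned above
          addresses: [10.0.0.10/24]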
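
The LimitMEMLOCK workaround does not require patching the shipped units; a
standard systemd drop-in is enough. Something like (ceph-osd@ shown, the same
drop-in applies to the mon/mds/radosgw units):

    # systemctl edit ceph-osd@.service
    # writes /etc/systemd/system/ceph-osd@.service.d/override.conf:
    [Service]
    LimitMEMLOCK=infinity

followed by systemctl daemon-reload and a restart of the daemons. RDMA has to
register (pin) large regions of memory, which the default MEMLOCK rlimit
forbids; presumably the same class of limits is what pushed us to run Ceph as
root.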
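
And for completeness, the messenger settings collected into one ceph.conf
fragment, assuming the unsectioned lines above live in [global]:

    [global]
    ms_cluster_type = async+rdma    # cluster (OSD-to-OSD) traffic over RDMA
    ms_type         = async+rdma    # default messenger type
    ms_public_type  = async+posix   # public side stays TCP: rbd clients lack RDMA

    [mgr]
    ms_type = async+posix           # mgr daemons do not support RDMA either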