Re: Infiniband support


 



We've used RDMA via RoCEv2 on 100GbE.  It ran in production that way for at least six months, until I had to turn it off while doing some migrations on hardware that didn't support it.  We noticed no performance change in our environment, so once we were done I just never turned it back on.  I'm not even sure we could right now, given how our network topology and bond interfaces are set up.

The biggest annoyance was making sure the device name and GID were correct.  This was before the centralized ceph config store existed, so it may be easier to roll out now.

Example config section from one of my nodes (in [global], right under the public and cluster network settings):

ms_cluster_type = async+rdma
ms_async_rdma_device_name = mlx5_1
ms_async_rdma_polling_us = 0
ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:c1b8:4fa0
ms_async_rdma_roce_ver = 1
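
If you need to work out which device and GID index to use, everything is exposed under /sys/class/infiniband.  A rough sketch (only lightly tested, and the gid_attrs/types path depends on your kernel exposing RoCE GID types) that lists each GID on port 1 of mlx5_1 along with its RoCE version:

# list the RDMA devices present on the node
ls /sys/class/infiniband/

# print every GID on port 1 of mlx5_1 together with its type (RoCE v1 vs v2)
for g in /sys/class/infiniband/mlx5_1/ports/1/gids/*; do
    idx=${g##*/}
    type=$(cat /sys/class/infiniband/mlx5_1/ports/1/gid_attrs/types/$idx 2>/dev/null)
    echo "$idx: $(cat $g) ${type:+($type)}"
done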

We pulled the GID in ansible with:

- name: "Insert RDMA GID into ceph.conf"
  shell: sed -i "s/GIDGOESHERE/$(cat /sys/class/infiniband/mlx5_1/ports/1/gids/5)/g" /etc/ceph/ceph.conf
  args:
    warn: no

The stub config file we pushed had "GIDGOESHERE" in it.
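
If I were setting this up again today I'd probably skip the sed and let Ansible do the substitution itself.  An untested sketch, assuming the same GIDGOESHERE placeholder and the same GID index 5:

- name: "Read RDMA GID from sysfs"
  command: cat /sys/class/infiniband/mlx5_1/ports/1/gids/5
  register: rdma_gid
  changed_when: false

- name: "Insert RDMA GID into ceph.conf"
  replace:
    path: /etc/ceph/ceph.conf
    regexp: "GIDGOESHERE"
    replace: "{{ rdma_gid.stdout }}"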

I hope that helps someone out there.  Not all of the settings were obvious, and it took some trial and error.  Now that we have a pure NVMe tier, I'll probably try turning it back on to see if we notice any changes.

Netdata also proved to be a valuable tool for making sure we had traffic on both TCP and RDMA:
https://www.netdata.cloud/
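
If you don't have Netdata handy, the standard IB port counters under sysfs tell the same story; the data counters should keep climbing while RoCE traffic flows (if I remember right they count 4-byte words).  Something like:

watch -n1 'grep . /sys/class/infiniband/mlx5_1/ports/1/counters/port_*_data'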


--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfmeec@xxxxxxx

CONFIDENTIALITY NOTE: The information transmitted, including attachments, is
intended only for the person(s) or entity to which it is addressed and may
contain confidential and/or privileged material. Any review, retransmission,
dissemination or other use of, or taking of any action in reliance upon this
information by persons or entities other than the intended recipient is
prohibited. If you received this in error, please contact the sender and
destroy any copies of this information.
------------------------

________________________________________
From: Andrei Mikhailovsky <andrei@xxxxxxxxxx>
Sent: Wednesday, August 26, 2020 5:55 PM
To: Rafael Quaglio
Cc: ceph-users
Subject:  Re: Infiniband support

Rafael, we've been using Ceph with IPoIB for over 7 years and it's been supported. However, I am not too sure about native RDMA support. There have been discussions on and off for a while now, but I've not seen much. Perhaps others know.

Cheers

> From: "Rafael Quaglio" <quaglio@xxxxxxxxxx>
> To: "ceph-users" <ceph-users@xxxxxxx>
> Sent: Wednesday, 26 August, 2020 14:08:57
> Subject:  Infiniband support

> Hi,
> I could not see in the docs whether Ceph has InfiniBand support. Is there
> someone using it?
> Also, is there any RDMA support working natively?

> Can anyone point me to where I can find more information about it?

> Thanks,
> Rafael.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx


