Thanks for the insights, Mohammad and Roman. Interesting read.

My interest in RDMA is purely from a testing perspective. Still, I would be interested if somebody who has RDMA enabled and running could share their ceph.conf.

My RDMA-related entries are taken from the Mellanox blog here:
https://community.mellanox.com/s/article/bring-up-ceph-rdma---developer-s-guide
They used Luminous and built it from source. I'm running the binary distribution of Mimic here.

ms_type = async+rdma
ms_cluster = async+rdma
ms_async_rdma_device_name = mlx5_0
ms_async_rdma_polling_us = 0
ms_async_rdma_local_gid=<node's_gid>

Or, if somebody with knowledge of the code could tell me when this "RDMAConnectedSocketImpl" error is printed, that might also be helpful.

2018-12-19 21:45:32.757 7f52b8548140 0 mon.rio@-1(probing).osd e25981 crush map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140 0 mon.rio@-1(probing).osd e25981 crush map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140 0 mon.rio@-1(probing).osd e25981 crush map has features 1009089991638532096, adjusting msgr requires
2018-12-19 21:45:32.757 7f52b8548140 0 mon.rio@-1(probing).osd e25981 crush map has features 288514051259236352, adjusting msgr requires
2018-12-19 21:45:33.138 7f52b8548140 0 mon.rio@-1(probing) e5 my rank is now 0 (was -1)
2018-12-19 21:45:33.141 7f529f3fe700 -1 RDMAConnectedSocketImpl activate failed to transition to RTR state: (113) No route to host
2018-12-19 21:45:33.142 7f529f3fe700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.2/rpm/el7/BUILD/ceph-13.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc: In function 'void RDMAConnectedSocketImpl::handle_connection()' thread 7f529f3fe700 time 2018-12-19 21:45:33.141972
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.2/rpm/el7/BUILD/ceph-13.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc: 224: FAILED assert(!r)

--
Michael Green