Hi All!
I'm trying to run Ceph over RDMA, using a batch of Mellanox InfiniBand
MT25408 20 Gbit/s (4x DDR) cards.
RDMA is running, rping works between all hosts, and I've configured
10.0.0.x addressing on the ib0 interfaces.
But when enabling RDMA in ceph.conf:
ms_type = async+rdma
ms_async_rdma_device_name = mlx4_0
OSD and MON on all hosts barf:
-5> 2017-08-28 12:03:37.623110 7f2326c9de00 1 Processor -- start
-4> 2017-08-28 12:03:37.624232 7f23205fe700 1 Infiniband binding_port found active port 1
-3> 2017-08-28 12:03:37.624250 7f23205fe700 1 Infiniband init assigning: 1024 receive buffers
-2> 2017-08-28 12:03:37.624260 7f23205fe700 1 Infiniband init assigning: 1024 send buffers
-1> 2017-08-28 12:03:37.624262 7f23205fe700 1 Infiniband init device allow 4194303 completion entries
0> 2017-08-28 12:03:37.628379 7f23205fe700 -1 /build/ceph-12.1.4/src/msg/async/rdma/Infiniband.cc: In function 'int Infiniband::MemoryManager::Cluster::fill(uint32_t)' thread 7f23205fe700 time 2017-08-28 12:03:37.624433
/build/ceph-12.1.4/src/msg/async/rdma/Infiniband.cc: 599: FAILED assert(m)
As suggested in the thread
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018943.html
I also tried lower values for the receive and send buffers, with the same result:
-4> 2017-08-28 12:36:46.997001 7fa28b270700 1 Infiniband binding_port found active port 1
-3> 2017-08-28 12:36:46.997026 7fa28b270700 1 Infiniband init assigning: 256 receive buffers
-2> 2017-08-28 12:36:46.997029 7fa28b270700 1 Infiniband init assigning: 256 send buffers
-1> 2017-08-28 12:36:46.997030 7fa28b270700 1 Infiniband init device allow 4194303 completion entries
0> 2017-08-28 12:36:47.001835 7fa28b270700 -1 /build/ceph-12.1.4/src/msg/async/rdma/Infiniband.cc: In function 'int Infiniband::MemoryManager::Cluster::fill(uint32_t)' thread 7fa28b270700 time 2017-08-28 12:36:46.997231
/build/ceph-12.1.4/src/msg/async/rdma/Infiniband.cc: 599: FAILED assert(m)
$ ceph -v
ceph version 12.1.4 (a5f84b37668fc8e03165aaf5cbb380c78e4deba4) luminous (rc)
In the osd logs I also see some of these:
-236> 2017-08-28 12:17:34.507315 7f7f4815ce00 -1 RDMAStack RDMAStack!!! WARNING !!! For RDMA to work properly user memlock (ulimit -l) must be big enough to allow large amount of registered memory. We recommend setting this parameter to infinity
However, memlock has already been set to unlimited in my shell:
$ ulimit -l
unlimited
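
One caveat on this check: ulimit -l reports the limit of the shell it is run in, which is not necessarily the limit that the running OSD/MON processes have (for example, if they are started by a service manager that applies its own limits). A minimal sketch, assuming Linux, that reads a process's effective limit from /proc/<pid>/limits; the helper names parse_memlock and memlock_of are hypothetical and not part of Ceph:

```python
def parse_memlock(limits_text):
    """Return (soft, hard) for 'Max locked memory' from /proc/<pid>/limits text.

    Returns None if the row is not present.
    """
    label = "Max locked memory"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # The rest of the row is: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


def memlock_of(pid):
    """Read the memlock limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_memlock(f.read())
```

Calling memlock_of() with the PID of a ceph-osd or ceph-mon process would show whether the daemon itself sees "unlimited", or a smaller value than the shell does.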
Any suggestions?
Best regards,
Jeroen Oldenhof
The Netherlands