On Thu, Aug 17, 2017 at 5:50 PM, Jin Cai <caijin.laurence@xxxxxxxxx> wrote:
>
> Hi cephers,
>     I am testing the RDMA module of Ceph.
>     The test environment is as follows:
>     Ceph version: 12.1.0
>     6 hosts, each with 12 OSDs.
>
>     An error is injected into the cluster by hand:
>     1. kill all OSD daemons on one host
>     2. restart the OSD daemons that were just killed.
>
>     The problem is that OSDs on the other hosts cannot get heartbeat
> replies from each other and are wrongly marked down by the monitor.
>     By analysing the log, I found that the OSDs on the other hosts
> send heartbeats to their peers, but the heartbeats could not be sent
> successfully because there were not enough buffers:
>
>     RDMAConnectedSocketImpl operator() no enough buffers in worker
> 0x7fd839c18d00
>
>     The memory buffers in RDMADispatcher are released by the
> RDMADispatcher::polling() function.
>     But when I killed all the OSD daemons on one host and restarted
> them, the rate of memory buffer release became slow and finally the
> number of inflight chunks reached 1023 (max value is 1024):
>
>     2017-08-15 20:15:42.383778 7fd82641b700 30 RDMAStack
> post_tx_buffer release 1 chunks, inflight 1023
>     2017-08-15 20:15:42.384151 7fd82641b700 30 RDMAStack
> post_tx_buffer release 1 chunks, inflight 1023
>     2017-08-15 20:15:42.538885 7fd82641b700 30 RDMAStack
> post_tx_buffer release 1 chunks, inflight 1023
>
>
>     I think the root cause is related to how the memory buffers are
> released when the error is injected.
>     Do you have any ideas about this? Looking forward to your response,
> and thanks in advance.

tx buffer usage is simple: tx buffers are not held by any userspace logic.
I haven't seen this problem in my cluster. Maybe you can check the "*tx*"
counters in your perf dump to see whether anything looks wrong.
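
Something like the rough sketch below could pull the tx-related counters out
of one OSD's perf dump. It assumes the ceph CLI and the OSD admin socket are
available on that host; the exact RDMA counter names and section names can
differ between versions, so it just prints anything with "tx" in its name:

#!/usr/bin/env python
# Rough sketch: print tx-related perf counters from one OSD's admin socket.
# Assumes "ceph daemon osd.<id> perf dump" works on the OSD host; the RDMA
# counters usually sit under the AsyncMessenger sections, but names vary.
import json
import subprocess

def dump_tx_counters(osd_id):
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "perf", "dump"])
    perf = json.loads(out)
    for section, counters in perf.items():
        for name, value in counters.items():
            if "tx" in name.lower():
                # value may be a plain number or a dict (avgcount/sum)
                print("%s / %s = %s" % (section, name, value))

if __name__ == "__main__":
    dump_tx_counters(0)

If one of the tx error/failure counters keeps growing while inflight stays
pinned near 1024, that would point at the send side not getting its chunks
back rather than at normal load.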