On Thu, Aug 17, 2017 at 5:50 PM, Jin Cai <caijin.laurence@xxxxxxxxx> wrote:
>
> Hi cephers,
>     I am testing the RDMA module of Ceph.
>     The test environment is as follows:
>     Ceph version: 12.1.0
>     6 hosts, each with 12 OSDs.
>
>     An error is injected into the cluster by hand:
>     1. kill all OSD daemons on one host
>     2. restart the OSD daemons that were just killed.
>
>     The problem is that OSDs on the other hosts cannot get heartbeat
> replies from each other and are wrongly marked down by the monitor.
>     By analysing the log, I found that the OSDs on the other hosts
> send heartbeats to their peers, but the heartbeats could not be sent
> successfully because there were not enough buffers:
>
>     RDMAConnectedSocketImpl operator() no enough buffers in worker
> 0x7fd839c18d00
>
>     The memory buffers in RDMADispatcher are released by the
> RDMADispatcher::polling() function.
>     But when I killed all the OSD daemons on one host and restarted
> them, the rate of memory buffer release became slow and finally the
> number of inflight chunks reached 1023 (max value is 1024):
>
>     2017-08-15 20:15:42.383778 7fd82641b700 30 RDMAStack
> post_tx_buffer release 1 chunks, inflight 1023
>     2017-08-15 20:15:42.384151 7fd82641b700 30 RDMAStack
> post_tx_buffer release 1 chunks, inflight 1023
>     2017-08-15 20:15:42.538885 7fd82641b700 30 RDMAStack
> post_tx_buffer release 1 chunks, inflight 1023
>
>
>     I think the root cause is related to how the memory buffers are
> released when the error is injected.
>     Do you have any ideas about this? Looking forward to your response,
> and thanks in advance.

tx buffer usage is simple: tx buffers are not held by any userspace logic.
I haven't seen this problem in my cluster. Maybe you can check the "*tx*"
counters in your perf dump to see whether anything looks wrong.
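
Something like the rough sketch below could pull the tx-related counters out
of one OSD's perf dump. It assumes the ceph CLI and the OSD admin socket are
available on that host; the exact RDMA counter names and section names can
differ between versions, so it just prints anything with "tx" in its name:

#!/usr/bin/env python
# Rough sketch: print tx-related perf counters from one OSD's admin socket.
# Assumes "ceph daemon osd.<id> perf dump" works on the OSD host; the RDMA
# counters usually sit under the AsyncMessenger sections, but names vary.
import json
import subprocess

def dump_tx_counters(osd_id):
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "perf", "dump"])
    perf = json.loads(out)
    for section, counters in perf.items():
        for name, value in counters.items():
            if "tx" in name.lower():
                # value may be a plain number or a dict (avgcount/sum)
                print("%s / %s = %s" % (section, name, value))

if __name__ == "__main__":
    dump_tx_counters(0)

If one of the tx error/failure counters keeps growing while inflight stays
pinned near 1024, that would point at the send side not getting its chunks
back rather than at normal load.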