Re: RDMA question for ceph

OK, thank you very much. I will try to contact them and report back on the problem. In the meantime, I will try to debug it by setting up just one mon and one OSD. Thanks again.

On Mon, Jul 23, 2018 at 3:49 PM John Hearns <hearnsj@xxxxxxxxxxxxxx> wrote:
Will, looking at the logs which you sent, the connection cannot be set up.
I did try Googling for these error messages, and I could not find anything definite.
As an aside, QP = Queue Pair, which is the structure set up to transfer information across an IB network.
Think of it like a TCP connection.
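
To make the TCP analogy a little more concrete: a queue pair has to be walked through a fixed sequence of states before it can carry traffic, and the log further down shows exactly those steps ("change queue pair to INIT", "transition to RTR", "transition to RTS"). The following is only a minimal sketch of that setup order. The state names follow the ibverbs IBV_QPS_* constants, the helper names are made up for illustration, and the error and drain states are deliberately left out.

```python
# Simplified model of the connection-setup path of an InfiniBand queue pair.
# Only the happy path is modelled; SQD, SQE and ERR are omitted on purpose.
SETUP_ORDER = ["RESET", "INIT", "RTR", "RTS"]


def next_state(current):
    """Return the state that follows `current` during connection setup."""
    idx = SETUP_ORDER.index(current)
    if idx == len(SETUP_ORDER) - 1:
        raise ValueError(current + " is already the final setup state")
    return SETUP_ORDER[idx + 1]


def can_receive(state):
    """Incoming messages are processed once the QP is Ready To Receive."""
    return state in ("RTR", "RTS")


def can_send(state):
    """Outgoing work requests are processed only in Ready To Send."""
    return state == "RTS"


if __name__ == "__main__":
    state = "RESET"
    while state != "RTS":
        state = next_state(state)
        print(state, "receive:", can_receive(state), "send:", can_send(state))
```

Note that reaching RTS locally says nothing about the peer: each side programs its own QP with whatever the other side reported during the handshake.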

I think you should contact Mellanox support over this one. They are really good guys.



On 23 July 2018 at 08:14, Will Zhao <zhao6305@xxxxxxxxx> wrote:
Hi John:
   This is the information ibv_devinfo gives:

hca_id: mlx4_0
    transport: InfiniBand (0)
    fw_ver: 2.35.5100
    node_guid: e41d:2d03:0072:ed70
    sys_image_guid: e41d:2d03:0072:ed73
    vendor_id: 0x02c9
    vendor_part_id: 4099
    hw_ver: 0x1
    board_id: MT_1090110019
    phys_port_cnt: 2
    Device ports:
        port: 1
            state: PORT_DOWN (1)
            max_mtu: 4096 (5)
            active_mtu: 4096 (5)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: InfiniBand

        port: 2
            state: PORT_ACTIVE (4)
            max_mtu: 4096 (5)
            active_mtu: 4096 (5)
            sm_lid: 2
            port_lid: 11
            port_lmc: 0x00
            link_layer: InfiniBand
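
The point of this output is that only port 2 is active, which matches the "binding_port found active port 2" line in the log further down. If the same check has to be repeated on several hosts, it can be scripted. The following is a rough sketch that assumes the "port:" / "state:" layout shown above; the function name and the shortened sample text are illustrative only.

```python
import re
from typing import List


def active_ports(devinfo_text: str) -> List[int]:
    """Return the port numbers whose state line reports PORT_ACTIVE."""
    ports = []
    current = None
    for raw in devinfo_text.splitlines():
        line = raw.strip()
        # Matches "port: 1" but not "port_lid: 0" or "phys_port_cnt: 2".
        match = re.match(r"port:\s*(\d+)$", line)
        if match:
            current = int(match.group(1))
            continue
        if current is not None and line.startswith("state:"):
            if "PORT_ACTIVE" in line:
                ports.append(current)
    return ports


# Shortened copy of the output above, used only as sample input.
SAMPLE = """\
hca_id: mlx4_0
phys_port_cnt: 2
port: 1
state: PORT_DOWN (1)
port_lid: 0
port: 2
state: PORT_ACTIVE (4)
port_lid: 11
"""

if __name__ == "__main__":
    print(active_ports(SAMPLE))  # prints [2]
```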


On Fri, Jul 20, 2018 at 7:09 PM John Hearns <hearnsj@xxxxxxxxxxxxxx> wrote:
What does ibv_devinfo  give you?


On 20 July 2018 at 12:13, Will Zhao <zhao6305@xxxxxxxxx> wrote:
I have now added the option "debug ms = 20/20" to the global section of ceph.conf to see more details about the errors. This time "ceph -s" prints thousands of lines; here are some of the log lines from the output:
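
With this much output, it can help to look at the low-numbered (more important) messages first. The following is a minimal sketch that assumes the line layout seen in this paste (date, time, thread id, debug level, message); the function name and the default threshold are arbitrary choices, not anything Ceph provides.

```python
import re

# date time thread-id level message, e.g.
# 2018-07-20 16:12:54.409780 7f3a255d5700  1 RDMAStack handle_tx_event ...
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} [\d:.]+) ([0-9a-f]+)\s+(-?\d+) (.*)$"
)


def important_lines(lines, max_level=1):
    """Return (timestamp, level, message) for lines at or below max_level."""
    selected = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # blank lines and anything that is not a log record
        timestamp, _thread, level, message = match.groups()
        if int(level) <= max_level:
            selected.append((timestamp, int(level), message.strip()))
    return selected


if __name__ == "__main__":
    sample = [
        "2018-07-20 16:12:54.409773 7f3a255d5700 20 RDMAStack polling tx completion queue got 1 responses.",
        "",
        "2018-07-20 16:12:54.409780 7f3a255d5700  1 RDMAStack handle_tx_event connection between server and client not working. Disconnect this now",
    ]
    for timestamp, level, message in important_lines(sample):
        print(timestamp, level, message)
```

To run it against a saved copy of the output, pass an open file object instead of the sample list.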


2018-07-20 16:12:49.994715 7f3a3be8e700 20 Infiniband verify_prereq ms_async_rdma_enable_hugepage value is: 0

2018-07-20 16:12:49.994723 7f3a3be8e700 20 Infiniband Infiniband constructing Infiniband...

2018-07-20 16:12:49.994748 7f3a3be8e700 20 RDMAStack RDMAStack constructing RDMAStack...

2018-07-20 16:12:49.994750 7f3a3be8e700 20 RDMAStack  creating RDMAStack:0x7f3a340b5448 with dispatcher:0x7f3a340b5558

2018-07-20 16:12:49.994924 7f3a3970e700  2 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).set_owner idx=1 owner=139888048531200

2018-07-20 16:12:49.994990 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event started fd=7 mask=1 original mask is 0

2018-07-20 16:12:49.994990 7f3a38f0d700  2 Event(0x7f3a34110850 nevent=5000 time_id=1).set_owner idx=2 owner=139888040138496

2018-07-20 16:12:49.994999 7f3a3970e700 20 EpollDriver.add_event add event fd=7 cur_mask=0 add_mask=1 to 6

2018-07-20 16:12:49.994991 7f3a39f0f700  2 Event(0x7f3a340b5770 nevent=5000 time_id=1).set_owner idx=0 owner=139888056923904

2018-07-20 16:12:49.995009 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event end fd=7 mask=1 original mask is 1

2018-07-20 16:12:49.995011 7f3a38f0d700 20 Event(0x7f3a34110850 nevent=5000 time_id=1).create_file_event create event started fd=11 mask=1 original mask is 0

2018-07-20 16:12:49.995013 7f3a39f0f700 20 Event(0x7f3a340b5770 nevent=5000 time_id=1).create_file_event create event started fd=4 mask=1 original mask is 0

2018-07-20 16:12:49.995016 7f3a38f0d700 20 EpollDriver.add_event add event fd=11 cur_mask=0 add_mask=1 to 10

2018-07-20 16:12:49.995017 7f3a39f0f700 20 EpollDriver.add_event add event fd=4 cur_mask=0 add_mask=1 to 3

2018-07-20 16:12:49.995018 7f3a3970e700 10 stack operator() starting

2018-07-20 16:12:49.995022 7f3a38f0d700 20 Event(0x7f3a34110850 nevent=5000 time_id=1).create_file_event create event end fd=11 mask=1 original mask is 1

2018-07-20 16:12:49.995022 7f3a39f0f700 20 Event(0x7f3a340b5770 nevent=5000 time_id=1).create_file_event create event end fd=4 mask=1 original mask is 1

2018-07-20 16:12:49.995026 7f3a38f0d700 10 stack operator() starting

2018-07-20 16:12:49.995027 7f3a39f0f700 10 stack operator() starting

2018-07-20 16:12:49.995938 7f3a3be8e700 10 -- - ready -

2018-07-20 16:12:49.995946 7f3a3be8e700  1  Processor -- start

2018-07-20 16:12:49.995996 7f3a3be8e700  1 -- - start start

2018-07-20 16:12:49.996535 7f3a3be8e700 10 -- - create_connect 10.10.121.25:6789/0, creating connection and registering

2018-07-20 16:12:49.996574 7f3a3be8e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_NONE pgs=0 cs=0 l=1)._connect csq=0

2018-07-20 16:12:49.996594 7f3a3be8e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).wakeup

2018-07-20 16:12:49.996608 7f3a3be8e700 10 -- - get_connection mon.0 10.10.121.25:6789/0 new 0x7f3a34150270

2018-07-20 16:12:49.996666 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:12:49.996693 7f3a3be8e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).send_keepalive

2018-07-20 16:12:49.996700 7f3a3be8e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).wakeup

2018-07-20 16:12:49.996721 7f3a3be8e700  1 -- - --> 10.10.121.25:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- 0x7f3a3414e720 con 0

2018-07-20 16:12:49.996739 7f3a3be8e700 15 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).send_message inline write is denied, reschedule m=0x7f3a3414e720

2018-07-20 16:12:50.016836 7f3a3970e700  1 Infiniband Port using experimental verbs for gid

2018-07-20 16:12:50.017216 7f3a3970e700  1 Infiniband Port looking for local GID  of type 1

2018-07-20 16:12:50.017224 7f3a3970e700  1 Infiniband Port malformed or no GID supplied, using GID index 0

2018-07-20 16:12:50.017293 7f3a3970e700 10 Infiniband binding_port port 1 is not what we want. state: 1)

2018-07-20 16:12:50.017300 7f3a3970e700  1 Infiniband Port using experimental verbs for gid

2018-07-20 16:12:50.017611 7f3a3970e700  1 Infiniband Port looking for local GID  of type 1

2018-07-20 16:12:50.017617 7f3a3970e700  1 Infiniband Port malformed or no GID supplied, using GID index 0

2018-07-20 16:12:50.017664 7f3a3970e700  1 Infiniband binding_port found active port 2

2018-07-20 16:12:50.017680 7f3a3970e700  1 Infiniband init receive queue length is 4096 receive buffers

2018-07-20 16:12:50.017683 7f3a3970e700  1 Infiniband init assigning: 1024 send buffers

2018-07-20 16:12:50.017687 7f3a3970e700  1 Infiniband init device allow 4194303 completion entries

2018-07-20 16:12:50.332936 7f3a3970e700 20 Infiniband init started.

2018-07-20 16:12:50.332966 7f3a3970e700 20 Infiniband init started.

2018-07-20 16:12:50.334175 7f3a3970e700 20 Infiniband init successfully create cq=0x7f3a30008650

2018-07-20 16:12:50.335282 7f3a3970e700 20 Infiniband init successfully create cq=0x7f3a30008a00

2018-07-20 16:12:50.335327 7f3a3970e700 20 Infiniband init started.

2018-07-20 16:12:50.335467 7f3a255d5700 20 RDMAStack polling going to poll tx cq: 0x7f3a30008620 rx cq: 0x7f3a300089d0

2018-07-20 16:12:50.335496 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:12:50.335501 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:12:50.335502 7f3a3970e700 20 Infiniband init successfully create queue pair: qp=0x7f3a300092a0

2018-07-20 16:12:50.336033 7f3a3970e700 20 Infiniband init successfully change queue pair to INIT: qp=0x7f3a300092a0

2018-07-20 16:12:50.336049 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect nonblock:1, nodelay:1, rbuf_size: 0

2018-07-20 16:12:50.336172 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect tcp_fd: 18

2018-07-20 16:12:50.336186 7f3a3970e700 10 Infiniband send_msg sending: 11, 276404, 0, 0, fe80000000000000e41d2d030072ed72

2018-07-20 16:12:50.336212 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event started fd=18 mask=1 original mask is 0

2018-07-20 16:12:50.336227 7f3a3970e700 20 EpollDriver.add_event add event fd=18 cur_mask=0 add_mask=1 to 6

2018-07-20 16:12:50.336234 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event end fd=18 mask=1 original mask is 1

2018-07-20 16:12:50.336240 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event started fd=17 mask=1 original mask is 0

2018-07-20 16:12:50.336243 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=0 add_mask=1 to 6

2018-07-20 16:12:50.336246 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event end fd=17 mask=1 original mask is 1

2018-07-20 16:12:50.336249 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:12:50.336272 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection nonblock connect inprogress

2018-07-20 16:12:50.336306 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).handle_write

2018-07-20 16:12:50.336314 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).handle_write

2018-07-20 16:12:50.337500 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection QP: 276404 tcp_fd: 18 notify_fd: 17

2018-07-20 16:12:50.337524 7f3a3970e700  5 Infiniband recv_msg recevd: 0, 0, 0, 0, ??

2018-07-20 16:12:50.337528 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection peer msg :  < 0, 0, 0, 0>

2018-07-20 16:12:50.337533 7f3a3970e700 20  RDMAConnectedSocketImpl activate Choosing gid_index 0, sl 3

2018-07-20 16:12:50.337597 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTR state successfully.

2018-07-20 16:12:50.337637 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTS state successfully.

2018-07-20 16:12:50.337643 7f3a3970e700 20  RDMAConnectedSocketImpl activate QueuePair: 0x7f3a30009140 with qp:0x7f3a300092a0

2018-07-20 16:12:50.337645 7f3a3970e700 20  RDMAConnectedSocketImpl activate handle fake send, wake it up. QP: 276404

2018-07-20 16:12:50.337649 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 0 bytes. iov size: 0

2018-07-20 16:12:50.337655 7f3a3970e700 10 Infiniband send_msg sending: 11, 276404, 0, 0, fe80000000000000e41d2d030072ed72

2018-07-20 16:12:50.337670 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:12:50.337678 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=1 delmask=2 to 6

2018-07-20 16:12:50.337680 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection connect successfully, ready to send banner

2018-07-20 16:12:50.337695 7f3a3970e700 20  RDMAConnectedSocketImpl send QP: 276404

2018-07-20 16:12:50.337699 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 9 bytes. iov size: 1

2018-07-20 16:12:50.337704 7f3a3970e700 20  RDMAConnectedSocketImpl submit left bytes: 0 in buffers 0 tx chunks 1

2018-07-20 16:12:50.337706 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request QP: 276404 0x7f3a30011890

2018-07-20 16:12:50.337713 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request qp state is IBV_QPS_RTS

2018-07-20 16:12:50.337793 7f3a3970e700 20  RDMAConnectedSocketImpl submit finished sending 9 bytes.

2018-07-20 16:12:50.337796 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._try_send sent bytes 9 remaining bytes 0

2018-07-20 16:12:50.337802 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event started fd=17 mask=2 original mask is 1

2018-07-20 16:12:50.337806 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=1 add_mask=2 to 6

2018-07-20 16:12:50.337807 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=1).create_file_event create event end fd=17 mask=2 original mask is 3

2018-07-20 16:12:50.337810 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._process_connection connect write banner done: 10.10.121.25:6789/0

2018-07-20 16:12:50.337816 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:12:50.337824 7f3a3970e700 20  RDMAConnectedSocketImpl read notify_fd : 1 in 276404 r = 8

2018-07-20 16:12:50.337828 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).handle_write

2018-07-20 16:12:50.337831 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._try_send sent bytes 0 remaining bytes 0

2018-07-20 16:12:54.409747 7f3a255d5700 20 RDMAStack polling got tx cq event.

2018-07-20 16:12:54.409773 7f3a255d5700 20 RDMAStack polling tx completion queue got 1 responses.

2018-07-20 16:12:54.409780 7f3a255d5700  1 RDMAStack handle_tx_event connection between server and client not working. Disconnect this now

2018-07-20 16:12:54.409787 7f3a255d5700  1  RDMAConnectedSocketImpl fault tcp fd 18

2018-07-20 16:12:54.409814 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:12:54.409816 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:12:54.409841 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY

2018-07-20 16:12:54.409876 7f3a3970e700 20  RDMAConnectedSocketImpl read notify_fd : 1 in 276404 r = 8

2018-07-20 16:12:54.409932 7f3a3970e700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).read_bulk reading from fd=17 : Unknown error -104

2018-07-20 16:12:54.409953 7f3a3970e700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).read_until read failed

2018-07-20 16:12:54.409969 7f3a3970e700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._process_connection read banner and identify addresses failed

2018-07-20 16:12:54.409984 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=3 delmask=3 to 6

2018-07-20 16:12:54.409995 7f3a3970e700 20  RDMAConnectedSocketImpl ~RDMAConnectedSocketImpl destruct.

2018-07-20 16:12:54.410025 7f3a3970e700 20 EpollDriver.del_event del event fd=18 cur_mask=1 delmask=1 to 6

2018-07-20 16:12:54.410090 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).fault waiting 0.200000

2018-07-20 16:12:54.610395 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:12:54.610432 7f3a3970e700 20 Infiniband init started.

2018-07-20 16:12:54.610588 7f3a3970e700 20 Infiniband init successfully create queue pair: qp=0x7f3a3002f900

2018-07-20 16:12:54.611676 7f3a3970e700 20 Infiniband init successfully change queue pair to INIT: qp=0x7f3a3002f900

2018-07-20 16:12:54.611698 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect nonblock:1, nodelay:1, rbuf_size: 0

2018-07-20 16:12:54.611785 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect tcp_fd: 18

2018-07-20 16:12:54.611813 7f3a3970e700 10 Infiniband send_msg sending: 11, 276420, 2116118, 0, fe80000000000000e41d2d030072ed72

2018-07-20 16:12:54.611858 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=2).create_file_event create event started fd=18 mask=1 original mask is 0

2018-07-20 16:12:54.611866 7f3a3970e700 20 EpollDriver.add_event add event fd=18 cur_mask=0 add_mask=1 to 6

2018-07-20 16:12:54.611872 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=2).create_file_event create event end fd=18 mask=1 original mask is 1

2018-07-20 16:12:54.611875 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=2).create_file_event create event started fd=17 mask=1 original mask is 0

2018-07-20 16:12:54.611877 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=0 add_mask=1 to 6

2018-07-20 16:12:54.611880 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=2).create_file_event create event end fd=17 mask=1 original mask is 1

2018-07-20 16:12:54.611882 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:12:54.611894 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection nonblock connect inprogress

2018-07-20 16:12:54.612975 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection QP: 276420 tcp_fd: 18 notify_fd: 17

2018-07-20 16:12:54.613001 7f3a3970e700  5 Infiniband recv_msg recevd: 0, 0, 0, 0, ??

2018-07-20 16:12:54.613004 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection peer msg :  < 0, 0, 0, 0>

2018-07-20 16:12:54.613007 7f3a3970e700 20  RDMAConnectedSocketImpl activate Choosing gid_index 0, sl 3

2018-07-20 16:12:54.613104 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTR state successfully.

2018-07-20 16:12:54.613221 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTS state successfully.

2018-07-20 16:12:54.613243 7f3a3970e700 20  RDMAConnectedSocketImpl activate QueuePair: 0x7f3a3002e6f0 with qp:0x7f3a3002f900

2018-07-20 16:12:54.613249 7f3a3970e700 20  RDMAConnectedSocketImpl activate handle fake send, wake it up. QP: 276420

2018-07-20 16:12:54.613252 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 0 bytes. iov size: 0

2018-07-20 16:12:54.613267 7f3a3970e700 10 Infiniband send_msg sending: 11, 276420, 2116118, 0, fe80000000000000e41d2d030072ed72

2018-07-20 16:12:54.613294 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:12:54.613309 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=1 delmask=2 to 6

2018-07-20 16:12:54.613313 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection connect successfully, ready to send banner

2018-07-20 16:12:54.613338 7f3a3970e700 20  RDMAConnectedSocketImpl send QP: 276420

2018-07-20 16:12:54.613342 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 9 bytes. iov size: 1

2018-07-20 16:12:54.613350 7f3a3970e700 20  RDMAConnectedSocketImpl submit left bytes: 0 in buffers 0 tx chunks 1

2018-07-20 16:12:54.613353 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request QP: 276420 0x7f3a30011890

2018-07-20 16:12:54.613358 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request qp state is IBV_QPS_RTS

2018-07-20 16:12:54.613451 7f3a3970e700 20  RDMAConnectedSocketImpl submit finished sending 9 bytes.

2018-07-20 16:12:54.613458 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._try_send sent bytes 9 remaining bytes 0

2018-07-20 16:12:54.613473 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=2).create_file_event create event started fd=17 mask=2 original mask is 1

2018-07-20 16:12:54.613477 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=1 add_mask=2 to 6

2018-07-20 16:12:54.613493 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=2).create_file_event create event end fd=17 mask=2 original mask is 3

2018-07-20 16:12:54.613497 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._process_connection connect write banner done: 10.10.121.25:6789/0

2018-07-20 16:12:54.613511 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:12:54.613519 7f3a3970e700 20  RDMAConnectedSocketImpl read notify_fd : 1 in 276420 r = 8

2018-07-20 16:12:54.613529 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).handle_write

2018-07-20 16:12:54.613537 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._try_send sent bytes 0 remaining bytes 0

2018-07-20 16:12:58.738443 7f3a255d5700 20 RDMAStack polling got tx cq event.

2018-07-20 16:12:58.738459 7f3a255d5700 20 RDMAStack polling tx completion queue got 1 responses.

2018-07-20 16:12:58.738464 7f3a255d5700  1 RDMAStack handle_tx_event connection between server and client not working. Disconnect this now

2018-07-20 16:12:58.738468 7f3a255d5700  1  RDMAConnectedSocketImpl fault tcp fd 18

2018-07-20 16:12:58.738477 7f3a255d5700 10 RDMAStack polling finally delete qp=0x7f3a30009140

2018-07-20 16:12:58.738481 7f3a255d5700 20 Infiniband ~QueuePair destroy qp=0x7f3a300092a0

2018-07-20 16:12:58.738528 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY

2018-07-20 16:12:58.738555 7f3a3970e700 20  RDMAConnectedSocketImpl read notify_fd : 1 in 276420 r = 8

2018-07-20 16:12:58.738560 7f3a3970e700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).read_bulk reading from fd=17 : Unknown error -104

2018-07-20 16:12:58.738608 7f3a3970e700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).read_until read failed

2018-07-20 16:12:58.738614 7f3a3970e700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._process_connection read banner and identify addresses failed

2018-07-20 16:12:58.738622 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=3 delmask=3 to 6

2018-07-20 16:12:58.738631 7f3a3970e700 20  RDMAConnectedSocketImpl ~RDMAConnectedSocketImpl destruct.

2018-07-20 16:12:58.738633 7f3a3970e700 20 EpollDriver.del_event del event fd=18 cur_mask=1 delmask=1 to 6

2018-07-20 16:12:58.739731 7f3a255d5700 10 RDMAStack handle_async_event event associated qp=0x7f3a3002f900 evt: last WQE reached

2018-07-20 16:12:58.739746 7f3a255d5700  1 RDMAStack handle_async_event it's not forwardly stopped by us, reenable=0x7f3a3002f690

2018-07-20 16:12:58.739749 7f3a255d5700  1  RDMAConnectedSocketImpl fault tcp fd 18

2018-07-20 16:12:58.739756 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:12:58.739757 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:12:58.739759 7f3a255d5700 10 RDMAStack polling finally delete qp=0x7f3a3002e6f0

2018-07-20 16:12:58.739762 7f3a255d5700 20 Infiniband ~QueuePair destroy qp=0x7f3a3002f900

2018-07-20 16:12:58.740726 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).fault waiting 0.400000

2018-07-20 16:12:59.141377 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:12:59.141410 7f3a3970e700 20 Infiniband init started.

2018-07-20 16:12:59.141546 7f3a3970e700 20 Infiniband init successfully create queue pair: qp=0x7f3a30027970

2018-07-20 16:12:59.142182 7f3a3970e700 20 Infiniband init successfully change queue pair to INIT: qp=0x7f3a30027970

2018-07-20 16:12:59.142203 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect nonblock:1, nodelay:1, rbuf_size: 0

2018-07-20 16:12:59.142291 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect tcp_fd: 18

2018-07-20 16:12:59.142307 7f3a3970e700 10 Infiniband send_msg sending: 11, 276438, 5515815, 0, fe80000000000000e41d2d030072ed72

2018-07-20 16:12:59.142331 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event started fd=18 mask=1 original mask is 0

2018-07-20 16:12:59.142350 7f3a3970e700 20 EpollDriver.add_event add event fd=18 cur_mask=0 add_mask=1 to 6

2018-07-20 16:12:59.142356 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event end fd=18 mask=1 original mask is 1

2018-07-20 16:12:59.142360 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event started fd=17 mask=1 original mask is 0

2018-07-20 16:12:59.142376 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=0 add_mask=1 to 6

2018-07-20 16:12:59.142380 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event end fd=17 mask=1 original mask is 1

2018-07-20 16:12:59.142383 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:12:59.142395 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection nonblock connect inprogress

2018-07-20 16:12:59.143312 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection QP: 276438 tcp_fd: 18 notify_fd: 17

2018-07-20 16:12:59.143337 7f3a3970e700  5 Infiniband recv_msg recevd: 0, 2134519808, 2315255808, 2134519808, ??

2018-07-20 16:12:59.143341 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection peer msg :  < 2134519808, 2315255808, 0, 2134519808>

2018-07-20 16:12:59.143346 7f3a3970e700 20  RDMAConnectedSocketImpl activate Choosing gid_index 0, sl 3

2018-07-20 16:12:59.143427 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTR state successfully.

2018-07-20 16:12:59.143495 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTS state successfully.

2018-07-20 16:12:59.143504 7f3a3970e700 20  RDMAConnectedSocketImpl activate QueuePair: 0x7f3a3002e6f0 with qp:0x7f3a30027970

2018-07-20 16:12:59.143507 7f3a3970e700 20  RDMAConnectedSocketImpl activate handle fake send, wake it up. QP: 276438

2018-07-20 16:12:59.143510 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 0 bytes. iov size: 0

2018-07-20 16:12:59.143529 7f3a3970e700 10 Infiniband send_msg sending: 11, 276438, 5515815, 2134519808, fe80000000000000e41d2d030072ed72

2018-07-20 16:12:59.143569 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:12:59.143585 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=1 delmask=2 to 6

2018-07-20 16:12:59.143589 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection connect successfully, ready to send banner

2018-07-20 16:12:59.143611 7f3a3970e700 20  RDMAConnectedSocketImpl send QP: 276438

2018-07-20 16:12:59.143613 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 9 bytes. iov size: 1

2018-07-20 16:12:59.143621 7f3a3970e700 20  RDMAConnectedSocketImpl submit left bytes: 0 in buffers 0 tx chunks 1

2018-07-20 16:12:59.143623 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request QP: 276438 0x7f3a30011890

2018-07-20 16:12:59.143628 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request qp state is IBV_QPS_RTS

2018-07-20 16:12:59.143713 7f3a3970e700 20  RDMAConnectedSocketImpl submit finished sending 9 bytes.

2018-07-20 16:12:59.143721 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._try_send sent bytes 9 remaining bytes 0

2018-07-20 16:12:59.143738 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event started fd=17 mask=2 original mask is 1

2018-07-20 16:12:59.143744 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=1 add_mask=2 to 6

2018-07-20 16:12:59.143748 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event end fd=17 mask=2 original mask is 3

2018-07-20 16:12:59.143751 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._process_connection connect write banner done: 10.10.121.25:6789/0

2018-07-20 16:12:59.143764 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:12:59.143772 7f3a3970e700 20  RDMAConnectedSocketImpl read notify_fd : 1 in 276438 r = 8

2018-07-20 16:12:59.143782 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).handle_write

2018-07-20 16:12:59.143794 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._try_send sent bytes 0 remaining bytes 0

2018-07-20 16:12:59.996654 7f3a267fc700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).mark_down

2018-07-20 16:12:59.996675 7f3a267fc700  2 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._stop

2018-07-20 16:12:59.996686 7f3a267fc700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).discard_out_queue started

2018-07-20 16:12:59.996697 7f3a267fc700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f3a34150270 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).discard_out_queue discard 0x7f3a3414e720

2018-07-20 16:12:59.996715 7f3a267fc700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).wakeup

2018-07-20 16:12:59.996752 7f3a267fc700 10 -- - create_connect 10.10.121.25:6789/0, creating connection and registering

2018-07-20 16:12:59.996783 7f3a267fc700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c80013d0 :-1 s=STATE_NONE pgs=0 cs=0 l=1)._connect csq=0

2018-07-20 16:12:59.996814 7f3a267fc700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).wakeup

2018-07-20 16:12:59.996782 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=3 delmask=3 to 6

2018-07-20 16:12:59.996821 7f3a267fc700 10 -- - get_connection mon.0 10.10.121.25:6789/0 new 0x7f39c80013d0

2018-07-20 16:12:59.996839 7f3a3970e700 20  RDMAConnectedSocketImpl ~RDMAConnectedSocketImpl destruct.

2018-07-20 16:12:59.996843 7f3a3970e700 20 EpollDriver.del_event del event fd=18 cur_mask=1 delmask=1 to 6

2018-07-20 16:12:59.996844 7f3a267fc700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c80013d0 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).send_keepalive

2018-07-20 16:12:59.996861 7f3a267fc700  1 -- - --> 10.10.121.25:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- 0x7f39c8006040 con 0

2018-07-20 16:12:59.996871 7f3a267fc700 15 -- - >> 10.10.121.25:6789/0 conn(0x7f39c80013d0 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).send_message inline write is denied, reschedule m=0x7f39c8006040

2018-07-20 16:12:59.996919 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f39c80013d0 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

 ...
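
A note on the "Unknown error -104" in the read_bulk lines above: this is most likely a negated errno value being handed to the error-string lookup as-is, although that is an assumption about how the messenger reports the failure. On Linux, errno 104 is ECONNRESET ("Connection reset by peer"). A small sketch for decoding such values on the machine where the log was produced (the helper name is made up):

```python
import errno
import os


def describe_errno(code):
    """Map a possibly negated errno value to its symbolic name and message."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    # errno numbering is platform specific; 104 is ECONNRESET on Linux.
    print(describe_errno(-104))
```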


2018-07-20 16:17:33.022882 7f3a3970e700 20 Infiniband init started.

2018-07-20 16:17:33.022986 7f3a3970e700 20 Infiniband init successfully create queue pair: qp=0x7f3a30000fb0

2018-07-20 16:17:33.023642 7f3a3970e700 20 Infiniband init successfully change queue pair to INIT: qp=0x7f3a30000fb0

2018-07-20 16:17:33.023656 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect nonblock:1, nodelay:1, rbuf_size: 0

2018-07-20 16:17:33.023732 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect tcp_fd: 18

2018-07-20 16:17:33.023746 7f3a3970e700 10 Infiniband send_msg sending: 11, 277532, 7190722, 0, fe80000000000000e41d2d030072ed72

2018-07-20 16:17:33.023768 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event started fd=18 mask=1 original mask is 0

2018-07-20 16:17:33.023793 7f3a3970e700 20 EpollDriver.add_event add event fd=18 cur_mask=0 add_mask=1 to 6

2018-07-20 16:17:33.023798 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event end fd=18 mask=1 original mask is 1

2018-07-20 16:17:33.023801 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event started fd=17 mask=1 original mask is 0

2018-07-20 16:17:33.023804 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=0 add_mask=1 to 6

2018-07-20 16:17:33.023807 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event end fd=17 mask=1 original mask is 1

2018-07-20 16:17:33.023810 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:17:33.023820 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection nonblock connect inprogress

2018-07-20 16:17:33.023836 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).handle_write

2018-07-20 16:17:33.023842 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).handle_write

2018-07-20 16:17:33.024805 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection QP: 277532 tcp_fd: 18 notify_fd: 17

2018-07-20 16:17:33.024832 7f3a3970e700  5 Infiniband recv_msg recevd: 0, 1684090482, 1986355044, 544501349, ??

2018-07-20 16:17:33.024835 7f3a3970e700 20  RDMAConnectedSocketImpl handle_connection peer msg :  < 1684090482, 1986355044, 0, 544501349>

2018-07-20 16:17:33.024838 7f3a3970e700 20  RDMAConnectedSocketImpl activate Choosing gid_index 0, sl 3

2018-07-20 16:17:33.024932 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTR state successfully.

2018-07-20 16:17:33.024990 7f3a3970e700 20  RDMAConnectedSocketImpl activate transition to RTS state successfully.

2018-07-20 16:17:33.024996 7f3a3970e700 20  RDMAConnectedSocketImpl activate QueuePair: 0x7f3a3002f7a0 with qp:0x7f3a30000fb0

2018-07-20 16:17:33.025010 7f3a3970e700 20  RDMAConnectedSocketImpl activate handle fake send, wake it up. QP: 277532

2018-07-20 16:17:33.025013 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 0 bytes. iov size: 0

2018-07-20 16:17:33.025023 7f3a3970e700 10 Infiniband send_msg sending: 11, 277532, 7190722, 1684090482, fe80000000000000e41d2d030072ed72

2018-07-20 16:17:33.025050 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:17:33.025065 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=1 delmask=2 to 6

2018-07-20 16:17:33.025071 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._process_connection connect successfully, ready to send banner

2018-07-20 16:17:33.025083 7f3a3970e700 20  RDMAConnectedSocketImpl send QP: 277532

2018-07-20 16:17:33.025085 7f3a3970e700 20  RDMAConnectedSocketImpl submit we need 9 bytes. iov size: 1

2018-07-20 16:17:33.025090 7f3a3970e700 20  RDMAConnectedSocketImpl submit left bytes: 0 in buffers 0 tx chunks 1

2018-07-20 16:17:33.025093 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request QP: 277532 0x7f3a30011850

2018-07-20 16:17:33.025097 7f3a3970e700 20  RDMAConnectedSocketImpl post_work_request qp state is IBV_QPS_RTS

2018-07-20 16:17:33.025172 7f3a3970e700 20  RDMAConnectedSocketImpl submit finished sending 9 bytes.

2018-07-20 16:17:33.025178 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1)._try_send sent bytes 9 remaining bytes 0

2018-07-20 16:17:33.025186 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event started fd=17 mask=2 original mask is 1

2018-07-20 16:17:33.025189 7f3a3970e700 20 EpollDriver.add_event add event fd=17 cur_mask=1 add_mask=2 to 6

2018-07-20 16:17:33.025193 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event end fd=17 mask=2 original mask is 3

2018-07-20 16:17:33.025195 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._process_connection connect write banner done: 10.10.121.25:6789/0

2018-07-20 16:17:33.025203 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING_RE

2018-07-20 16:17:33.025210 7f3a3970e700 20  RDMAConnectedSocketImpl read notify_fd : 1 in 277532 r = 8

2018-07-20 16:17:33.025218 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).handle_write

2018-07-20 16:17:33.025223 7f3a3970e700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._try_send sent bytes 0 remaining bytes 0

2018-07-20 16:17:34.140706 7f3a255d5700 20 RDMAStack polling got tx cq event.

2018-07-20 16:17:34.140722 7f3a255d5700 20 RDMAStack polling tx completion queue got 2 responses.

2018-07-20 16:17:34.140725 7f3a255d5700  1 RDMAStack handle_tx_event connection between server and client not working. Disconnect this now

2018-07-20 16:17:34.140728 7f3a255d5700  1 RDMAStack handle_tx_event missing qp_num=277520 discard event

2018-07-20 16:17:34.140731 7f3a255d5700  1 RDMAStack handle_tx_event Work Request Flushed Error: this connection's qp=277520 should be down while this WR=139887890327280 still in flight.

2018-07-20 16:17:34.140733 7f3a255d5700  1 RDMAStack handle_tx_event missing qp_num=277520 discard event

2018-07-20 16:17:34.140734 7f3a255d5700  1 RDMAStack handle_tx_event sending of the disconnect msg completed

2018-07-20 16:17:34.140738 7f3a255d5700 10 RDMAStack polling finally delete qp=0x7f3a3002e6f0

2018-07-20 16:17:34.140740 7f3a255d5700 20 Infiniband ~QueuePair destroy qp=0x7f3a3003ac10

2018-07-20 16:17:34.141496 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:17:34.141508 7f3a255d5700 20 Infiniband rearm_notify started.

2018-07-20 16:17:36.022943 7f3a267fc700  1 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).mark_down

2018-07-20 16:17:36.022955 7f3a267fc700  2 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1)._stop

2018-07-20 16:17:36.022962 7f3a267fc700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).discard_out_queue started

2018-07-20 16:17:36.022967 7f3a267fc700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f39c8006a60 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=1).discard_out_queue discard 0x7f39c80031b0

2018-07-20 16:17:36.022979 7f3a267fc700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).wakeup

2018-07-20 16:17:36.023003 7f3a267fc700 10 -- - create_connect 10.10.121.25:6789/0, creating connection and registering

2018-07-20 16:17:36.023019 7f3a267fc700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c800b750 :-1 s=STATE_NONE pgs=0 cs=0 l=1)._connect csq=0

2018-07-20 16:17:36.023026 7f3a267fc700 10 -- - get_connection mon.0 10.10.121.25:6789/0 new 0x7f39c800b750

2018-07-20 16:17:36.023041 7f3a267fc700 10 -- - >> 10.10.121.25:6789/0 conn(0x7f39c800b750 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).send_keepalive

2018-07-20 16:17:36.023047 7f3a267fc700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).wakeup

2018-07-20 16:17:36.023041 7f3a3970e700 20 EpollDriver.del_event del event fd=17 cur_mask=3 delmask=3 to 6

2018-07-20 16:17:36.023057 7f3a267fc700  1 -- - --> 10.10.121.25:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- 0x7f39c80031b0 con 0

2018-07-20 16:17:36.023060 7f3a3970e700 20  RDMAConnectedSocketImpl ~RDMAConnectedSocketImpl destruct.

2018-07-20 16:17:36.023067 7f3a3970e700 20 EpollDriver.del_event del event fd=18 cur_mask=1 delmask=1 to 6

2018-07-20 16:17:36.023068 7f3a267fc700 15 -- - >> 10.10.121.25:6789/0 conn(0x7f39c800b750 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).send_message inline write is denied, reschedule m=0x7f39c80031b0

2018-07-20 16:17:36.023127 7f3a3970e700 20 -- - >> 10.10.121.25:6789/0 conn(0x7f39c800b750 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=1).process prev state is STATE_CONNECTING

2018-07-20 16:17:36.023152 7f3a3970e700 20 Infiniband init started.

2018-07-20 16:17:36.023272 7f3a3970e700 20 Infiniband init successfully create queue pair: qp=0x7f3a3002d600

2018-07-20 16:17:36.023923 7f3a3970e700 20 Infiniband init successfully change queue pair to INIT: qp=0x7f3a3002d600

2018-07-20 16:17:36.023938 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect nonblock:1, nodelay:1, rbuf_size: 0

2018-07-20 16:17:36.024024 7f3a3970e700 20  RDMAConnectedSocketImpl try_connect tcp_fd: 18

2018-07-20 16:17:36.024037 7f3a3970e700 10 Infiniband send_msg sending: 11, 277544, 8220122, 0, fe80000000000000e41d2d030072ed72

2018-07-20 16:17:36.024060 7f3a3970e700 20 Event(0x7f3a340e2fe0 nevent=5000 time_id=3).create_file_event create event started fd=18 mask=1 original mask is 0
