Re: bug report for rdma_rxe

On Mon, Apr 25, 2022 at 08:40:30PM -0500, Bob Pearson wrote:
> On 4/25/22 17:58, Jason Gunthorpe wrote:

> Imagine a very long RDMA read operation that times out several times before finally
> getting all the data returned to the requester. Now imagine it is followed by some
> small RDMA ops to a different node that use fast reg MRs and are executed by the
> other node after receiving a small control message. E.g.
> 
> 	node1					node2					node3
> 
> 1:	Send: RDMA_READ(mr1 to node2)
> 						RDMA_READ_REPLY(mr1@node1, 1of2)
> 	ib_map_mr_sg(mr2a local)
> 	Send: IB_WR_REG_MR(mr2a local)
> 	Send: Control msg (mr2a to node3)
> 											Send: RDMA_WRITE(mr2a@node1)
> 	Send: IB_WR_LOCAL_INV(mr2a local)
> 	ib_update_fast_reg_key(mr2a->mr2b)
> 	ib_map_mr_sg(mr2b local)
> 	Send: Control msg (mr2b to node3)
> 											Send: RDMA_WRITE(mr2b@node1)
> 	Timeout: replay from 1 (w/o local ops)
> 	Send: RDMA_READ(mr1 to node2)
> 						RDMA_READ_REPLY(mr1@node1, 2of2)
> 	Send: Control msg (mr2a to node3)
> 											Send: RDMA_WRITE(mr2a@node1)
> 											FAILS because mr2a has been
> 											replaced by mr2b.
> On the other hand if we replay the REG_MR local command that won't work either
> because we didn't know to rerun the ib_map_mr_sg() call.

How did you get two destination nodes into an RC send queue? We have
SRQ not SSQ.

In any event, the above is a buggy ULP. The IB_WR_LOCAL_INV cannot be
posted until the CQE for the Send with mr2a is received (or possibly a
strong fence is used).

It follows the general rule that the ULP cannot alter the data memory
under a WQE until it sees the CQE for that WQE and so knows the NIC has
finished with the memory.
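
To make the rule concrete, here is a minimal sketch in plain C. It is a
toy model and not the kernel verbs API: toy_sq, toy_post_send(),
toy_handle_cqe() and toy_can_invalidate() are hypothetical names made up
for illustration. It only shows the bookkeeping a ULP would need so that
it never invalidates an MR while a posted WQE that depends on it is
still waiting for its CQE.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel or in rdma-core. */
#define TOY_SQ_DEPTH 8

enum toy_wqe_state { TOY_WQE_FREE = 0, TOY_WQE_POSTED };

struct toy_wqe {
	enum toy_wqe_state state;
	int mr_key;		/* key of the MR this WQE depends on */
};

struct toy_sq {
	struct toy_wqe wqe[TOY_SQ_DEPTH];
};

/* Post a WQE that depends on mr_key; returns the slot used, or -1 if full. */
static int toy_post_send(struct toy_sq *sq, int mr_key)
{
	for (int i = 0; i < TOY_SQ_DEPTH; i++) {
		if (sq->wqe[i].state == TOY_WQE_FREE) {
			sq->wqe[i].state = TOY_WQE_POSTED;
			sq->wqe[i].mr_key = mr_key;
			return i;
		}
	}
	return -1;
}

/* The ULP has seen the CQE for this slot, so the slot can be reused. */
static void toy_handle_cqe(struct toy_sq *sq, int slot)
{
	if (slot >= 0 && slot < TOY_SQ_DEPTH)
		sq->wqe[slot].state = TOY_WQE_FREE;
}

/* An MR may be invalidated only if no posted WQE using it lacks a CQE. */
static bool toy_can_invalidate(const struct toy_sq *sq, int mr_key)
{
	for (int i = 0; i < TOY_SQ_DEPTH; i++) {
		if (sq->wqe[i].state == TOY_WQE_POSTED &&
		    sq->wqe[i].mr_key == mr_key)
			return false;
	}
	return true;
}
```

In the sequence quoted above, the ULP would check something like
toy_can_invalidate() for mr2a before posting IB_WR_LOCAL_INV, and under
this model the check stays false until the CQE for the control message
Send has been handled.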

Jason


