Hi Sagi,

thanks a lot for the information. We are doing the right thing regarding
the invalidation (your 2f122e4f5107), but we do use unsignalled sends and
need to fix that. Please correct me if I'm wrong: the patches
(4af7f7ff92a4, b4b591c87f2b) fix the problem that if the ack from the
target is lost for some reason, the initiator's HCA will retransmit the
send even after the request has been completed. But doesn't the same
problem also exist the other way around, for lost acks from the client?
I mean: the target did a send for the "read" IOs; the client completed
the request (after invalidation, refcount dropped to 0, etc.), but the ack
was not delivered to the target's HCA, so the target will also retransmit
its send. This seems unfixable, since the client can't possibly know
whether the server received its ack or not. Doesn't the problem go away
if rdma_conn_param.retry_count is just set to 0?

Thanks for your help,

Best,
Danil.

On Tue, Jul 9, 2019 at 11:27 PM Sagi Grimberg <sagi@xxxxxxxxxxx> wrote:
>
>
> >> Thanks Jason for feedback.
> >> Can you be more specific about "the invalidation model for MR was wrong"
> >
> > MR's must be invalidated before data is handed over to the block
> > layer. It can't leave MRs open for access and then touch the memory
> > the MR covers.
>
> Jason is referring to these fixes:
> 2f122e4f5107 ("nvme-rdma: wait for local invalidation before completing
> a request")
> 4af7f7ff92a4 ("nvme-rdma: don't complete requests before a send work
> request has completed")
> b4b591c87f2b ("nvme-rdma: don't suppress send completions")
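
To make the retry_count question concrete, here is a toy model of a single
reliable send whose acks get lost. It is only a sketch under stated
assumptions: reliable_send() and SendResult are hypothetical names, nothing
here is a verbs or rdma_cm API, and real HCAs also involve timeouts and
RNR handling that are left out. It models one thing: the sender
retransmits up to retry_count times when no ack arrives, and gives up once
the retries are exhausted.

```python
from dataclasses import dataclass


@dataclass
class SendResult:
    transmissions: int   # how many times the send was put on the wire
    completed_ok: bool   # True if the sender eventually saw an ack
    retry_exceeded: bool  # True if retries ran out (assumed to model a retry-exceeded error)


def reliable_send(retry_count: int, acks_lost: int) -> SendResult:
    """Toy model: the first `acks_lost` acks are dropped on the way back.

    Hypothetical helper, not a real API. The receiver always gets the
    message; only the ack towards the sender is lost.
    """
    transmissions = 0
    # One initial transmission plus up to `retry_count` retransmissions.
    for attempt in range(retry_count + 1):
        transmissions += 1
        if attempt >= acks_lost:
            # This ack made it back, so the send completes successfully.
            return SendResult(transmissions, True, False)
    # Every ack we were willing to wait for was lost.
    return SendResult(transmissions, False, True)


if __name__ == "__main__":
    print(reliable_send(retry_count=7, acks_lost=1))
    print(reliable_send(retry_count=0, acks_lost=1))
```

In this model retry_count=0 does avoid the retransmission, but a single
lost ack then fails the send outright instead of being recovered, which
is the trade-off the question above has to weigh.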