Sagi Grimberg <sagi@xxxxxxxxxxx> wrote on Fri, Jul 12, 2019 at 9:40 PM:
>
> > Hi Sagi,
> >
> >>>> Another question, from what I understand from the code, the client
> >>>> always rdma_writes data on writes (with imm) from a remote pool of
> >>>> server buffers dedicated to it. Essentially all writes are immediate (no
> >>>> rdma reads ever). How is that different than using send wrs to a set of
> >>>> pre-posted recv buffers (like all others are doing)? Is it faster?
> >>>
> >>> At the very beginning of the project we did some measurements and saw
> >>> that it is faster. I'm not sure if this is still true.
> >>
> >> It's not significantly faster (can't imagine why it would be).
> >> What could make a difference is probably the fact that you never
> >> do rdma reads for I/O writes, which might be better. Also perhaps the
> >> fact that you normally don't wait for send completions before completing
> >> I/O (which is broken), and the fact that you batch recv operations.
> >
> > I don't know how you came to the conclusion that we don't wait for send
> > completion before completing I/O.
> >
> > We do chain WRs on a successful read request from the server; see function
> > rdma_write_sg.
>
> I was referring to the client side.

Hi Sagi,

I checked the three commits you mentioned in the earlier thread again,
and now I get your point. You meant the behavior the following commits
try to fix:

4af7f7ff92a4 ("nvme-rdma: don't complete requests before a send work
request has completed")
b4b591c87f2b ("nvme-rdma: don't suppress send completions")

In this sense, the ibtrs client side does not wait for the completion
of the RDMA WRITE WR before completing the I/O. But we did it right for
local invalidation.

I checked SRP/iser: they don't even wait for local invalidation; no
signal flag is set. If this is a problem, we should fix them too, and
maybe more drivers.

My question is: have you seen this behavior (the HCA retrying a send
because an ack was dropped) in the field? Is it possible to reproduce?

Thanks,
Jack
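
P.S. For anyone else following the thread, here is a minimal sketch of
what "rdma write with imm" means at the verbs level. This is purely
illustrative and not the actual ibtrs code: the function name,
parameters, and surrounding setup are assumptions, only the verbs API
itself is real. The payload is pushed straight into a server-side
buffer dedicated to this client, and the immediate data identifies the
request, so no RDMA read is ever needed for I/O writes:

#include <rdma/ib_verbs.h>

/*
 * Illustrative sketch, NOT ibtrs code: post the payload as an RDMA
 * WRITE with immediate into a server buffer this client owns
 * exclusively. 'qp', 'sge', 'remote_addr', 'rkey', 'imm' and 'cqe'
 * are assumed to be set up elsewhere (registered memory, connected
 * QP, a cqe with its done callback filled in, etc.).
 */
static int sketch_write_w_imm(struct ib_qp *qp, struct ib_sge *sge,
			      u64 remote_addr, u32 rkey, u32 imm,
			      struct ib_cqe *cqe)
{
	struct ib_rdma_wr wr = {
		.wr = {
			.wr_cqe      = cqe,
			.opcode      = IB_WR_RDMA_WRITE_WITH_IMM,
			.send_flags  = IB_SEND_SIGNALED,
			/* imm data tells the server which request this is */
			.ex.imm_data = cpu_to_be32(imm),
			.sg_list     = sge,
			.num_sge     = 1,
		},
		/* server-side buffer from the pool dedicated to us */
		.remote_addr = remote_addr,
		.rkey	     = rkey,
	};

	return ib_post_send(qp, &wr.wr, NULL);
}

The server sees this as an IB_WC_RECV_RDMA_WITH_IMM completion on one
of its pre-posted recvs, which is what makes the scheme comparable to a
plain send/recv protocol in the first place.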
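And a sketch of the completion rule the two nvme-rdma commits enforce,
again illustrative only (struct sketch_req and the refcounting scheme
are my assumptions, not ibtrs code): a WR whose completion matters,
here the local invalidate, is posted signaled, and the request is only
completed once both the WR completion and the server reply have been
observed:

#include <rdma/ib_verbs.h>
#include <linux/refcount.h>

/* Illustrative request struct, not taken from ibtrs */
struct sketch_req {
	struct ib_cqe	inv_cqe;
	struct ib_mr	*mr;
	refcount_t	ref;	/* one ref for the WR, one for the reply */
};

static void sketch_complete(struct sketch_req *req)
{
	/* complete the block-layer request here */
}

static void sketch_inv_done(struct ib_cq *cq, struct ib_wc *wc)
{
	struct sketch_req *req =
		container_of(wc->wr_cqe, struct sketch_req, inv_cqe);

	/* only the last of {WR completion, reply} completes the I/O */
	if (refcount_dec_and_test(&req->ref))
		sketch_complete(req);
}

static int sketch_post_local_inv(struct ib_qp *qp, struct sketch_req *req)
{
	struct ib_send_wr wr = {
		.opcode		    = IB_WR_LOCAL_INV,
		.wr_cqe		    = &req->inv_cqe,
		.ex.invalidate_rkey = req->mr->rkey,
		/* must be signaled, otherwise we never learn it finished */
		.send_flags	    = IB_SEND_SIGNALED,
	};

	req->inv_cqe.done = sketch_inv_done;
	return ib_post_send(qp, &wr, NULL);
}

The same pattern would have to cover the RDMA WRITE WR on the client
side for the dropped-ack retransmit case above to be safe.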