Sagi Grimberg <sagi@xxxxxxxxxxx> wrote on Fri, Jul 12, 2019 at 9:40 PM:
>
> > Hi Sagi,
> >
> >>>> Another question, from what I understand from the code, the client
> >>>> always rdma_writes data on writes (with imm) from a remote pool of
> >>>> server buffers dedicated to it. Essentially all writes are immediate (no
> >>>> rdma reads ever). How is that different than using send wrs to a set of
> >>>> pre-posted recv buffers (like all others are doing)? Is it faster?
> >>>
> >>> At the very beginning of the project we did some measurements and saw
> >>> that it is faster. I'm not sure if this is still true.
> >>
> >> It's not significantly faster (can't imagine why it would be).
> >> What could make a difference is probably the fact that you never
> >> do rdma reads for I/O writes, which might be better. Also perhaps the
> >> fact that you normally don't wait for send completions before completing
> >> I/O (which is broken), and the fact that you batch recv operations.
> >
> > I don't know how you came to the conclusion that we don't wait for send
> > completion before completing I/O.
> >
> > We do chain WRs on a successful read request from the server; see function
> > rdma_write_sg.
>
> I was referring to the client side.

Hi Sagi,

I checked the three commits you mentioned in the earlier thread again,
and now I get your point. You meant the behavior the following commits
try to fix:

4af7f7ff92a4 ("nvme-rdma: don't complete requests before a send work
request has completed")
b4b591c87f2b ("nvme-rdma: don't suppress send completions")

In this sense, the ibtrs client side does not wait for the completion
of the RDMA WRITE WR before completing the I/O. But we did it right for
local invalidation.

I checked SRP/iser: they don't even wait for local invalidation; no
signal flag is set. If this is a problem, we should fix them too, and
maybe more drivers.

My question is: have you seen this behavior (the HCA retrying a send
because an ack was dropped) in the field? Is it possible to reproduce?

Thanks,
Jack
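
P.S. For anyone else following the thread, here is a minimal sketch of
what "rdma write with imm" means at the verbs level. This is purely
illustrative and not the actual ibtrs code: the function name,
parameters, and surrounding setup are assumptions, only the verbs API
itself is real. The payload is pushed straight into a server-side
buffer dedicated to this client, and the immediate data identifies the
request, so no RDMA read is ever needed for I/O writes:

#include <rdma/ib_verbs.h>

/*
 * Illustrative sketch, NOT ibtrs code: post the payload as an RDMA
 * WRITE with immediate into a server buffer this client owns
 * exclusively. 'qp', 'sge', 'remote_addr', 'rkey', 'imm' and 'cqe'
 * are assumed to be set up elsewhere (registered memory, connected
 * QP, a cqe with its done callback filled in, etc.).
 */
static int sketch_write_w_imm(struct ib_qp *qp, struct ib_sge *sge,
			      u64 remote_addr, u32 rkey, u32 imm,
			      struct ib_cqe *cqe)
{
	struct ib_rdma_wr wr = {
		.wr = {
			.wr_cqe      = cqe,
			.opcode      = IB_WR_RDMA_WRITE_WITH_IMM,
			.send_flags  = IB_SEND_SIGNALED,
			/* imm data tells the server which request this is */
			.ex.imm_data = cpu_to_be32(imm),
			.sg_list     = sge,
			.num_sge     = 1,
		},
		/* server-side buffer from the pool dedicated to us */
		.remote_addr = remote_addr,
		.rkey	     = rkey,
	};

	return ib_post_send(qp, &wr.wr, NULL);
}

The server sees this as an IB_WC_RECV_RDMA_WITH_IMM completion on one
of its pre-posted recvs, which is what makes the scheme comparable to a
plain send/recv protocol in the first place.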
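And a sketch of the completion rule the two nvme-rdma commits enforce,
again illustrative only (struct sketch_req and the refcounting scheme
are my assumptions, not ibtrs code): a WR whose completion matters,
here the local invalidate, is posted signaled, and the request is only
completed once both the WR completion and the server reply have been
observed:

#include <rdma/ib_verbs.h>
#include <linux/refcount.h>

/* Illustrative request struct, not taken from ibtrs */
struct sketch_req {
	struct ib_cqe	inv_cqe;
	struct ib_mr	*mr;
	refcount_t	ref;	/* one ref for the WR, one for the reply */
};

static void sketch_complete(struct sketch_req *req)
{
	/* complete the block-layer request here */
}

static void sketch_inv_done(struct ib_cq *cq, struct ib_wc *wc)
{
	struct sketch_req *req =
		container_of(wc->wr_cqe, struct sketch_req, inv_cqe);

	/* only the last of {WR completion, reply} completes the I/O */
	if (refcount_dec_and_test(&req->ref))
		sketch_complete(req);
}

static int sketch_post_local_inv(struct ib_qp *qp, struct sketch_req *req)
{
	struct ib_send_wr wr = {
		.opcode		    = IB_WR_LOCAL_INV,
		.wr_cqe		    = &req->inv_cqe,
		.ex.invalidate_rkey = req->mr->rkey,
		/* must be signaled, otherwise we never learn it finished */
		.send_flags	    = IB_SEND_SIGNALED,
	};

	req->inv_cqe.done = sketch_inv_done;
	return ib_post_send(qp, &wr, NULL);
}

The same pattern would have to cover the RDMA WRITE WR on the client
side for the dropped-ack retransmit case above to be safe.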