On 02/02/2023 11:45, Bob Pearson wrote:
> On 2/1/23 09:38, Tom Talpey wrote:
>> On 2/1/2023 6:06 AM, Daisuke Matsuda (Fujitsu) wrote:
>>> On Sat, Jan 28, 2023 6:10 AM Bob Pearson wrote:
>>>>
>>>> Currently the rxe driver does not handle all cases of zero length
>>>> rdma operations correctly. The client does not have to provide an
>>>> rkey for zero length RDMA operations so the rkey provided may be
>>>> invalid and should not be used to lookup an mr.
>>>>
>>>> This patch corrects the driver to ignore the provided rkey if the
>>>> reth length is zero and make sure to set the mr to NULL. In read_reply()
>>>> if length is zero rxe_recheck_mr() is not called. Warnings are added in
>>>> the routines in rxe_mr.c to catch NULL MRs when the length is non-zero.
>>>>
>>>> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
>>>> ---
>>>
>>> When I applied this change, a testcase in rdma-core failed as shown below:
>>> ======================================================================
>>> ERROR: test_qp_ex_rc_flush (tests.test_qpex.QpExTestCase)
>>> ----------------------------------------------------------------------
>>> Traceback (most recent call last):
>>>   File "/root/rdma-core/tests/test_qpex.py", line 258, in test_qp_ex_rc_flush
>>>     raise PyverbsError(f'Unexpected {wc_status_to_str(wcs[0].status)}')
>>> pyverbs.pyverbs_error.PyverbsError: Unexpected Remote access error
>>>
>>> ----------------------------------------------------------------------
>>>
>>> In my opinion, your change makes sense within the range of traditional
>>> RDMA operations, but conflicts with the new RDMA FLUSH operation.
>>> Responder cannot access the target MR because of invalid rkey. The
>>> root cause is written in IBA Annex A19, especially 'oA19-2'.
>>> We thus cannot set qp->resp.rkey to 0 in qp_resp_from_reth().
>>>
>>> Do you have anything to say about this?
> Li Zhijian
>>>
>>> Thanks,
>>> Daisuke Matsuda
>>
>> I'm confused too, Bob can you point to the section of the spec
>> that allows the rkey to be zero? It's my understanding that
>> a zero-length RDMA Read must always check for access, even
>> though no data is actually fetched. That would not be possible
>> without an rkey.
>>
>> Tom.
>>
> Tom, Daisuke,
>
> C9-88: For an HCA responder using Reliable Connection service, for
> each zero-length RDMA READ or WRITE request, the R_Key shall not be
> validated, even if the request includes Immediate data.
>
> Further I have seen the pyverbs test suite sending a totally bogus rkey on a zero length rdma read. That was the impetus for me looking at this.
>
> Daisuke has a different issue since flush is a different operation than read or write.
> I need to look into what a zero length flush means.
>

I just took a look at the FLUSH problem above. It also exposed another bug in my flush code in rdma-core, PR:
https://github.com/linux-rdma/rdma-core/pull/1307

When 'Selectivity Level (SEL)' is 'Memory Region', a length of 0 is set in the FETH. In this case the rkey should still be valid and the length should be ignored.

Thanks
Zhijian

> Bob
>
>
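
To make the two cases in this thread concrete, here is a minimal sketch of the decision as I understand it, not the rxe code. The enums and the function name `must_validate_rkey` are hypothetical and only illustrate the rule: C9-88 exempts zero-length RDMA READ/WRITE from R_Key validation, while a FLUSH with MR selectivity carries length 0 but still needs a valid rkey. What a zero-length range-level FLUSH should do is not settled above, so the sketch just assumes the rkey is validated there.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operation kinds for illustration; not the rxe opcode enum. */
enum op_kind { OP_RDMA_READ, OP_RDMA_WRITE, OP_FLUSH };

/* Hypothetical names for the FLUSH selectivity level (SEL). */
enum flush_sel { SEL_RANGE, SEL_MR };

/*
 * Return true if the responder should look up and validate the rkey.
 * 'sel' is only meaningful when op == OP_FLUSH.
 */
static bool must_validate_rkey(enum op_kind op, uint32_t length,
			       enum flush_sel sel)
{
	switch (op) {
	case OP_RDMA_READ:
	case OP_RDMA_WRITE:
		/* C9-88: zero-length READ/WRITE, R_Key shall not be validated. */
		return length != 0;
	case OP_FLUSH:
		/* MR selectivity: length is 0 and ignored, rkey must be valid. */
		if (sel == SEL_MR)
			return true;
		/*
		 * Range selectivity: the zero-length case is an open question
		 * in this thread. ASSUMPTION: validate the rkey anyway.
		 */
		return true;
	}
	return true; /* unknown operation: be conservative */
}
```

With this, a zero-length read with a bogus rkey (the pyverbs case) skips the lookup, while test_qp_ex_rc_flush with MR selectivity still goes through the normal rkey check.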