On further reflection, I realize I did not correctly understand the user/kernel API issue. I was assuming that the user application should continue to run but that we could require recompiling rdma-core. If we require that old rdma-core binaries run on newer kernels, then the 40 bytes is an issue. I always recompiled rdma-core and didn't test running with old binaries.

Fortunately there is an easy fix. The flags field in the earlier rxe mw version had one bit in it, but the new version dropped that bit and I never went back and removed the field. Dropping the flags field doesn't break anything, and it lets the mw struct fit in the wr union without extending it.

I will fix, retest and resubmit.

Bob

-----Original Message-----
From: Zhu Yanjun <zyjzyj2000@xxxxxxxxx>
Sent: Tuesday, May 25, 2021 10:00 AM
To: Pearson, Robert B <robert.pearson2@xxxxxxx>
Cc: Pearson, Robert B <rpearsonhpe@xxxxxxxxx>; Jason Gunthorpe <jgg@xxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows

On Tue, May 25, 2021 at 1:27 PM Pearson, Robert B <robert.pearson2@xxxxxxx> wrote:
>
> There's nothing to change. There is no problem. Just get the headers sync'ed.
> If that doesn't fix your issues your tree has gotten corrupted somehow. But, I don't think that is the issue. I saw the same type of errors you reported when rdma_core is built with the old header file. That definitely will cause problems. The size of the send queue WQEs changed because new fields were added. Then user space and the kernel immediately get off from each other.
>
> Good luck,

About rdma-core, the root cause is clear. I am fine with this patch series. Thanks, Bob.
Zhu Yanjun

>
> Bob
>
> -----Original Message-----
> From: Zhu Yanjun <zyjzyj2000@xxxxxxxxx>
> Sent: Tuesday, May 25, 2021 12:18 AM
> To: Pearson, Robert B <robert.pearson2@xxxxxxx>
> Cc: Pearson, Robert B <rpearsonhpe@xxxxxxxxx>; Jason Gunthorpe <jgg@xxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
> Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows
>
> On Tue, May 25, 2021 at 12:57 PM Pearson, Robert B <robert.pearson2@xxxxxxx> wrote:
> >
> > Zhu,
> >
> > I'm not sure about the script. Starting from where you were I copied
> > <LINUX>/include/uapi/rdma/rdma_user_rxe.h to
> > <RDMA_CORE>/kernel-headers/rdma/rdma_user_rxe.h. After running the
> > script you should be able to just diff these two files to make sure
> > they are the same. If they aren't copy the header file over. After
> > the shift to 5.13 rc1+ I re-pulled both trees and applied the kernel
> > patches and then built everything. The python test cases look like
> >
> > .............sssssssss.............sssssssssssssssssssssssssssssssssss
> > ssssssssssssssssssssssssssssssssssss.ssssssssssssssssssssssssss....sss
> > s.............s.....s.......ssssssssss..ss
> > ----------------------------------------------------------------------
> > Ran 182 tests in 0.380s
>
> Thanks. Please submit a new patch for this problem.
>
> >
> > OK (skipped=124)
> >
> > There are a lot of skips but no errors. The skips are from features that rxe does not support.
> >
> > Adding the MW rdma_core patch picks up a small number of additional test cases involving memory windows.
>
> Thanks a lot. Look forward to these additional test cases involving memory windows.
>
> Zhu Yanjun
>
> >
> > Regards,
> >
> > Bob
> >
> > -----Original Message-----
> > From: Zhu Yanjun <zyjzyj2000@xxxxxxxxx>
> > Sent: Monday, May 24, 2021 9:09 PM
> > To: Pearson, Robert B <rpearsonhpe@xxxxxxxxx>
> > Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
> > Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows
> >
> > On Tue, May 25, 2021 at 12:04 AM Pearson, Robert B <rpearsonhpe@xxxxxxxxx> wrote:
> > >
> > > On 5/23/2021 10:14 PM, Zhu Yanjun wrote:
> > > > On Sat, May 22, 2021 at 4:19 AM Bob Pearson <rpearsonhpe@xxxxxxxxx> wrote:
> > > >> This series of patches implement memory windows for the
> > > >> rdma_rxe driver. This is a shorter reimplementation of an earlier patch set.
> > > >> They apply to and depend on the current for-next linux rdma tree.
> > > >>
> > > >> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> > > >> ---
> > > >> v7:
> > > >>   Fixed a duplicate INIT_RDMA_OBJ_SIZE(ib_mw, ...) in rxe_verbs.c.
> > > > With this patch series, there are about 17 errors and 1 failure in rdma-core.
> > >
> > > Zhu,
> > >
> > > You have to sync the kernel-header file with the kernel.
> >
> > From the link
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/kbuild/headers_install.rst?h=v5.13-rc3
> > you mean "make headers_install"?
> >
> > In fact, after "make headers_install", these patches still cause errors and failures in rdma-core.
> >
> > I will delve into these errors of rdma-core. Too many errors.
> >
> > Zhu Yanjun
> >
> > >
> > > Bob
> > >
> > > > "
> > > > ----------------------------------------------------------------------
> > > > Ran 183 tests in 2.130s
> > > >
> > > > FAILED (failures=1, errors=17, skipped=124) "
> > > >
> > > > After these patches, not sure if rxe can communicate with the
> > > > physical NICs correctly because of the above errors and failure.
> > > >
> > > > Zhu Yanjun
> > > >
> > > >> v6:
> > > >>   Added rxe_ prefix to subroutine names in lines that changed
> > > >>   from Zhu's review of v5.
> > > >> v5:
> > > >>   Fixed a typo in 10th patch.
> > > >> v4:
> > > >>   Added a 10th patch to check when MRs have bound MWs
> > > >>   and disallow dereg and invalidate operations.
> > > >> v3:
> > > >>   cleaned up void return and lower case enums from
> > > >>   Zhu's review.
> > > >> v2:
> > > >>   cleaned up an issue in rdma_user_rxe.h
> > > >>   cleaned up a collision in rxe_resp.c
> > > >>
> > > >> Bob Pearson (9):
> > > >>   RDMA/rxe: Add bind MW fields to rxe_send_wr
> > > >>   RDMA/rxe: Return errors for add index and key
> > > >>   RDMA/rxe: Enable MW object pool
> > > >>   RDMA/rxe: Add ib_alloc_mw and ib_dealloc_mw verbs
> > > >>   RDMA/rxe: Replace WR_REG_MASK by WR_LOCAL_OP_MASK
> > > >>   RDMA/rxe: Move local ops to subroutine
> > > >>   RDMA/rxe: Add support for bind MW work requests
> > > >>   RDMA/rxe: Implement invalidate MW operations
> > > >>   RDMA/rxe: Implement memory access through MWs
> > > >>
> > > >>  drivers/infiniband/sw/rxe/Makefile     |   1 +
> > > >>  drivers/infiniband/sw/rxe/rxe.c        |   1 +
> > > >>  drivers/infiniband/sw/rxe/rxe_comp.c   |   1 +
> > > >>  drivers/infiniband/sw/rxe/rxe_loc.h    |  29 +-
> > > >>  drivers/infiniband/sw/rxe/rxe_mr.c     |  79 ++++--
> > > >>  drivers/infiniband/sw/rxe/rxe_mw.c     | 356 +++++++++++++++++++++++++
> > > >>  drivers/infiniband/sw/rxe/rxe_opcode.c |  11 +-
> > > >>  drivers/infiniband/sw/rxe/rxe_opcode.h |   3 +-
> > > >>  drivers/infiniband/sw/rxe/rxe_param.h  |  19 +-
> > > >>  drivers/infiniband/sw/rxe/rxe_pool.c   |  45 ++--
> > > >>  drivers/infiniband/sw/rxe/rxe_pool.h   |   8 +-
> > > >>  drivers/infiniband/sw/rxe/rxe_req.c    | 102 ++++---
> > > >>  drivers/infiniband/sw/rxe/rxe_resp.c   | 110 +++++---
> > > >>  drivers/infiniband/sw/rxe/rxe_verbs.c  |   5 +-
> > > >>  drivers/infiniband/sw/rxe/rxe_verbs.h  |  38 ++-
> > > >>  include/uapi/rdma/rdma_user_rxe.h      |  34 ++-
> > > >>  16 files changed, 691 insertions(+), 151 deletions(-)
> > > >>  create mode 100644 drivers/infiniband/sw/rxe/rxe_mw.c
> > > >> --
> > > >> 2.27.0
> > > >>
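
For readers following the size discussion at the top of this thread, here is a minimal C sketch of why one leftover field can grow a union and therefore the WQE that embeds it. Every struct, union, and function name below is a hypothetical stand-in chosen for illustration; none of it is copied from include/uapi/rdma/rdma_user_rxe.h, and the real field layout may differ. The sketch assumes the largest pre-existing union member is 32 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
 * definitions from include/uapi/rdma/rdma_user_rxe.h. */

/* Assumed largest member of the union before MW support: 32 bytes. */
struct demo_atomic_wr {
	uint64_t remote_addr;
	uint64_t compare_add;
	uint64_t swap;
	uint32_t rkey;
	uint32_t reserved;
};

/* MW member carrying a leftover flags word: 36 bytes of fields, which
 * pads to 40 on ABIs where uint64_t is 8-byte aligned. */
struct demo_mw_wr_flags {
	uint64_t addr;
	uint64_t length;
	uint32_t mr_lkey;
	uint32_t mw_rkey;
	uint32_t rkey;
	uint32_t access;
	uint32_t flags;
};

/* Same member with the unused flags word dropped: 32 bytes. */
struct demo_mw_wr {
	uint64_t addr;
	uint64_t length;
	uint32_t mr_lkey;
	uint32_t mw_rkey;
	uint32_t rkey;
	uint32_t access;
};

union demo_wr_old   { struct demo_atomic_wr atomic; };
union demo_wr_flags { struct demo_atomic_wr atomic; struct demo_mw_wr_flags mw; };
union demo_wr_fixed { struct demo_atomic_wr atomic; struct demo_mw_wr mw; };

/* Returns 1 when a new union layout keeps the old size, so binaries
 * built against the old header still agree with it on WQE size. */
int demo_same_size(size_t old_size, size_t new_size)
{
	return old_size == new_size;
}

/* Compile-time guards: the build fails if the layout drifts. */
_Static_assert(sizeof(union demo_wr_fixed) == sizeof(union demo_wr_old),
	       "fixed MW member must not extend the union");
_Static_assert(sizeof(union demo_wr_flags) > sizeof(union demo_wr_old),
	       "the flags word extends the union in this sketch");
```

A compile-time size check of this kind in a shared header is one way to catch such a change before it reaches users, since it fails the build rather than surfacing later as test errors from mismatched user space and kernel layouts.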