Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows

On Tue, May 25, 2021 at 1:27 PM Pearson, Robert B
<robert.pearson2@xxxxxxx> wrote:
>
> There's nothing to change. There is no problem. Just get the headers sync'ed.

I delved into the errors. I found that the following would fix these
errors in rdma-core.

diff --git a/kernel-headers/rdma/rdma_user_rxe.h b/kernel-headers/rdma/rdma_user_rxe.h
index 068433e2..90ea477f 100644
--- a/kernel-headers/rdma/rdma_user_rxe.h
+++ b/kernel-headers/rdma/rdma_user_rxe.h
@@ -99,7 +99,17 @@ struct rxe_send_wr {
                        __u32   remote_qkey;
                        __u16   pkey_index;
                } ud;
+               struct {
+                       __aligned_u64   addr;
+                       __aligned_u64   length;
+                       __u32           mr_lkey;
+                       __u32           mw_rkey;
+                       __u32           rkey;
+                       __u32           access;
+                       __u32           flags;
+               } mw;
                /* reg is only used by the kernel and is not part of the uapi */
+#ifdef __KERNEL__
                struct {
                        union {
                                struct ib_mr *mr;
@@ -108,6 +118,7 @@ struct rxe_send_wr {
                        __u32        key;
                        __u32        access;
                } reg;
+#endif
        } wr;
 };
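
As a rough illustration of why a stale header causes this kind of
breakage, here is a minimal sketch. The demo_* names and the trimmed
struct layouts are made up for the example and are not the real
struct rxe_send_wr. It only shows that adding a larger member to the
union grows the whole entry, so two sides built against different
headers would disagree about where each queue entry starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for illustration only.
 * These are NOT the real struct rxe_send_wr definitions. */
struct demo_wr_old {
	uint64_t wr_id;
	union {
		struct {
			uint32_t remote_qpn;
			uint32_t remote_qkey;
			uint16_t pkey_index;
		} ud;
	} wr;
};

/* Same entry after a larger "mw" member is added to the union. */
struct demo_wr_new {
	uint64_t wr_id;
	union {
		struct {
			uint32_t remote_qpn;
			uint32_t remote_qkey;
			uint16_t pkey_index;
		} ud;
		struct {
			uint64_t addr;
			uint64_t length;
			uint32_t mr_lkey;
			uint32_t mw_rkey;
			uint32_t rkey;
			uint32_t access;
			uint32_t flags;
		} mw;
	} wr;
};

/* Byte offset of entry `index` in a queue of fixed-size entries. */
size_t demo_slot_offset(size_t index, size_t entry_size)
{
	return index * entry_size;
}

/* Print where each header version believes entry 1 begins. */
void demo_report(void)
{
	size_t old_sz = sizeof(struct demo_wr_old);
	size_t new_sz = sizeof(struct demo_wr_new);

	/* The larger union member grows the whole entry. */
	assert(new_sz > old_sz);

	printf("old entry size: %zu, new entry size: %zu\n", old_sz, new_sz);
	printf("entry 1 offset: old header %zu, new header %zu\n",
	       demo_slot_offset(1, old_sz), demo_slot_offset(1, new_sz));
}
```

Calling demo_report() from a small main() prints the two entry sizes
and the two offsets for entry 1. The exact numbers depend on the ABI,
so none are quoted here.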

Zhu Yanjun

> If that doesn't fix your issues, your tree has gotten corrupted somehow. But I don't think that is the issue. I saw the same type of errors you reported when rdma_core is built with the old header file. That definitely will cause problems. The size of the send queue WQEs changed because new fields were added, so user space and the kernel immediately get out of sync with each other.
>
> Good luck,
>
> Bob
>
> -----Original Message-----
> From: Zhu Yanjun <zyjzyj2000@xxxxxxxxx>
> Sent: Tuesday, May 25, 2021 12:18 AM
> To: Pearson, Robert B <robert.pearson2@xxxxxxx>
> Cc: Pearson, Robert B <rpearsonhpe@xxxxxxxxx>; Jason Gunthorpe <jgg@xxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
> Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory windows
>
> On Tue, May 25, 2021 at 12:57 PM Pearson, Robert B <robert.pearson2@xxxxxxx> wrote:
> >
> > Zhu,
> >
> > I'm not sure about the script. Starting from where you were I copied
> > <LINUX>/include/uapi/rdma/rdma_user_rxe.h to
> > <RDMA_CORE>/kernel-headers/rdma/rdma_user_rxe.h. After running the
> > script you should be able to just diff these two files to make sure
> > they are the same. If they aren't, copy the header file over. After the
> > shift to 5.13 rc1+ I re-pulled both trees and applied the kernel patches
> > and then built everything. The python test cases look like
> >
> > .............sssssssss.............sssssssssssssssssssssssssssssssssss
> > ssssssssssssssssssssssssssssssssssss.ssssssssssssssssssssssssss....sss
> > s.............s.....s.......ssssssssss..ss
> > ----------------------------------------------------------------------
> > Ran 182 tests in 0.380s
>
> Thanks. Please submit a new patch for this problem.
>
> >
> > OK (skipped=124)
> >
> > There are a lot of skips but no errors. The skips are from features that rxe does not support.
> >
> > Adding the MW rdma_core patch picks up a small number of additional test cases involving memory windows.
>
> Thanks a lot. Look forward to these additional test cases involving memory windows.
>
> Zhu Yanjun
>
> >
> > Regards,
> >
> > Bob
> >
> > -----Original Message-----
> > From: Zhu Yanjun <zyjzyj2000@xxxxxxxxx>
> > Sent: Monday, May 24, 2021 9:09 PM
> > To: Pearson, Robert B <rpearsonhpe@xxxxxxxxx>
> > Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>; RDMA mailing list
> > <linux-rdma@xxxxxxxxxxxxxxx>
> > Subject: Re: [PATCH for-next v7 00/10] RDMA/rxe: Implement memory
> > windows
> >
> > On Tue, May 25, 2021 at 12:04 AM Pearson, Robert B <rpearsonhpe@xxxxxxxxx> wrote:
> > >
> > > On 5/23/2021 10:14 PM, Zhu Yanjun wrote:
> > > > On Sat, May 22, 2021 at 4:19 AM Bob Pearson <rpearsonhpe@xxxxxxxxx> wrote:
> > > > >> This series of patches implements memory windows for the rdma_rxe
> > > >> driver. This is a shorter reimplementation of an earlier patch set.
> > > >> They apply to and depend on the current for-next linux rdma tree.
> > > >>
> > > >> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> > > >> ---
> > > >> v7:
> > > >>    Fixed a duplicate INIT_RDMA_OBJ_SIZE(ib_mw, ...) in rxe_verbs.c.
> > > > With this patch series, there are about 17 errors and 1 failure in rdma-core.
> > >
> > > Zhu,
> > >
> > > You have to sync the kernel-header file with the kernel.
> >
> > From the link
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/kbuild/headers_install.rst?h=v5.13-rc3
> > you mean "make headers_install"?
> >
> > In fact, after "make headers_install", these patches still cause errors and failures in rdma-core.
> >
> > I will dig into these rdma-core errors. There are too many of them.
> >
> > Zhu Yanjun
> >
> > >
> > > Bob
> > >
> > > > "
> > > > ------------------------------------------------------------------
> > > > --
> > > > --
> > > > Ran 183 tests in 2.130s
> > > >
> > > > FAILED (failures=1, errors=17, skipped=124) "
> > > >
> > > > After these patches, not sure if rxe can communicate with the
> > > > physical NICs correctly because of the above errors and failure.
> > > >
> > > > Zhu Yanjun
> > > >
> > > >> v6:
> > > >>    Added rxe_ prefix to subroutine names in lines that changed
> > > >>    from Zhu's review of v5.
> > > >> v5:
> > > >>    Fixed a typo in 10th patch.
> > > >> v4:
> > > >>    Added a 10th patch to check when MRs have bound MWs
> > > >>    and disallow dereg and invalidate operations.
> > > >> v3:
> > > >>    cleaned up void return and lower case enums from
> > > >>    Zhu's review.
> > > >> v2:
> > > >>    cleaned up an issue in rdma_user_rxe.h
> > > >>    cleaned up a collision in rxe_resp.c
> > > >>
> > > >> Bob Pearson (9):
> > > >>    RDMA/rxe: Add bind MW fields to rxe_send_wr
> > > >>    RDMA/rxe: Return errors for add index and key
> > > >>    RDMA/rxe: Enable MW object pool
> > > >>    RDMA/rxe: Add ib_alloc_mw and ib_dealloc_mw verbs
> > > >>    RDMA/rxe: Replace WR_REG_MASK by WR_LOCAL_OP_MASK
> > > >>    RDMA/rxe: Move local ops to subroutine
> > > >>    RDMA/rxe: Add support for bind MW work requests
> > > >>    RDMA/rxe: Implement invalidate MW operations
> > > >>    RDMA/rxe: Implement memory access through MWs
> > > >>
> > > >>   drivers/infiniband/sw/rxe/Makefile     |   1 +
> > > >>   drivers/infiniband/sw/rxe/rxe.c        |   1 +
> > > >>   drivers/infiniband/sw/rxe/rxe_comp.c   |   1 +
> > > >>   drivers/infiniband/sw/rxe/rxe_loc.h    |  29 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_mr.c     |  79 ++++--
> > > >>   drivers/infiniband/sw/rxe/rxe_mw.c     | 356 +++++++++++++++++++++++++
> > > >>   drivers/infiniband/sw/rxe/rxe_opcode.c |  11 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_opcode.h |   3 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_param.h  |  19 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_pool.c   |  45 ++--
> > > >>   drivers/infiniband/sw/rxe/rxe_pool.h   |   8 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_req.c    | 102 ++++---
> > > >>   drivers/infiniband/sw/rxe/rxe_resp.c   | 110 +++++---
> > > >>   drivers/infiniband/sw/rxe/rxe_verbs.c  |   5 +-
> > > >>   drivers/infiniband/sw/rxe/rxe_verbs.h  |  38 ++-
> > > >>   include/uapi/rdma/rdma_user_rxe.h      |  34 ++-
> > > >>   16 files changed, 691 insertions(+), 151 deletions(-)
> > > >>   create mode 100644 drivers/infiniband/sw/rxe/rxe_mw.c
> > > >> --
> > > >> 2.27.0
> > > >>



