Re: [PATCH for-rc v4 0/5] RDMA/rxe: Various bug fixes.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Bob
With this patch serious, the blktests nvme-rdma still can be failed
with the below error. and the test can be pass with siw.

[ 1702.140090] loop0: detected capacity change from 0 to 2097152
[ 1702.150729] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 1702.151425] nvmet_rdma: enabling port 0 (10.16.221.116:4420)
[ 1702.158810] nvmet: creating controller 1 for subsystem
blktests-subsystem-1 for NQN
nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0035-4b10-8044-b9c04f463333.
[ 1702.159037] nvme nvme0: creating 32 I/O queues.
[ 1702.171671] nvme nvme0: failed to initialize MR pool sized 128 for QID 32
[ 1702.178482] nvme nvme0: rdma connection establishment failed (-12)
[ 1702.292261] eno2 speed is unknown, defaulting to 1000
[ 1702.297325] eno3 speed is unknown, defaulting to 1000
[ 1702.302389] eno4 speed is unknown, defaulting to 1000
[ 1702.317991] rdma_rxe: unloaded

Failure from:
        /*
         * Currently we don't use SG_GAPS MR's so if the first entry is
         * misaligned we'll end up using two entries for a single data page,
         * so one additional entry is required.
         */
        pages_per_mr = nvme_rdma_get_max_fr_pages(ibdev, queue->pi_support) + 1;
        ret = ib_mr_pool_init(queue->qp, &queue->qp->rdma_mrs,
                              queue->queue_size,
                              IB_MR_TYPE_MEM_REG,
                              pages_per_mr, 0);
        if (ret) {
                dev_err(queue->ctrl->ctrl.device,
                        "failed to initialize MR pool sized %d for QID %d\n",
                        queue->queue_size, nvme_rdma_queue_idx(queue));
                goto out_destroy_ring;
        }


On Wed, Sep 15, 2021 at 12:43 AM Bob Pearson <rpearsonhpe@xxxxxxxxx> wrote:
>
> This series of patches implements several bug fixes and minor
> cleanups of the rxe driver. Specifically these fix a bug exposed
> by blktest.
>
> They apply cleanly to both
> commit 1b789bd4dbd48a92f5427d9c37a72a8f6ca17754 (origin/for-rc)
> commit 6a217437f9f5482a3f6f2dc5fcd27cf0f62409ac (origin/for-next)
>
> The first patch is a rewrite of an earlier patch.
> It adds memory barriers to kernel to kernel queues. The logic for this
> is the same as an earlier patch that only treated user to kernel queues.
> Without this patch kernel to kernel queues are expected to intermittently
> fail at low frequency as was seen for the other queues.
>
> The second patch cleans up the state and type enums used by MRs.
>
> The third patch separates the keys in rxe_mr and ib_mr. This allows
> the following sequence seen in the srp driver to work correctly.
>
>         do {
>                 ib_post_send( IB_WR_LOCAL_INV )
>                 ib_update_fast_reg_key()
>                 ib_map_mr_sg()
>                 ib_post_send( IB_WR_REG_MR )
>         } while ( !done )
>
> The fourth patch creates duplicate mapping tables for fast MRs. This
> prevents rkeys referencing fast MRs from accessing data from an updated
> map after the call to ib_map_mr_sg() call by keeping the new and old
> mappings separate and atomically swapping them when a reg mr WR is
> executed.
>
> The fifth patch checks the type of MRs which receive local or remote
> invalidate operations to prevent invalidating user MRs.
>
> v3->v4:
> Two of the patches in v3 were accepted in v5.15 so have been dropped
> here.
>
> The first patch was rewritten to correctly deal with queue operations
> in rxe_verbs.c where the code is the client and not the server.
>
> v2->v3:
> The v2 version had a typo which broke clean application to for-next.
> Additionally in v3 the order of the patches was changed to make
> it a little cleaner.
>
> Bob Pearson (5):
>   RDMA/rxe: Add memory barriers to kernel queues
>   RDMA/rxe: Cleanup MR status and type enums
>   RDMA/rxe: Separate HW and SW l/rkeys
>   RDMA/rxe: Create duplicate mapping tables for FMRs
>   RDMA/rxe: Only allow invalidate for appropriate MRs
>
>  drivers/infiniband/sw/rxe/rxe_comp.c  |  12 +-
>  drivers/infiniband/sw/rxe/rxe_cq.c    |  25 +--
>  drivers/infiniband/sw/rxe/rxe_loc.h   |   2 +
>  drivers/infiniband/sw/rxe/rxe_mr.c    | 267 ++++++++++++++++-------
>  drivers/infiniband/sw/rxe/rxe_mw.c    |  36 ++--
>  drivers/infiniband/sw/rxe/rxe_qp.c    |  12 +-
>  drivers/infiniband/sw/rxe/rxe_queue.c |  30 ++-
>  drivers/infiniband/sw/rxe/rxe_queue.h | 292 +++++++++++---------------
>  drivers/infiniband/sw/rxe/rxe_req.c   |  51 ++---
>  drivers/infiniband/sw/rxe/rxe_resp.c  |  40 +---
>  drivers/infiniband/sw/rxe/rxe_srq.c   |   2 +-
>  drivers/infiniband/sw/rxe/rxe_verbs.c |  92 ++------
>  drivers/infiniband/sw/rxe/rxe_verbs.h |  48 ++---
>  13 files changed, 438 insertions(+), 471 deletions(-)
>
> --
> 2.30.2
>


-- 
Best Regards,
  Yi Zhang




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux