Hi Sagi, I think we need to add a fence to the UMR WQE, so let's try this one:

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index ad8a263..c38c4fa 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -3737,8 +3737,7 @@ static void dump_wqe(struct mlx5_ib_qp *qp, int idx, int size_16)
 static u8 get_fence(u8 fence, struct ib_send_wr *wr)
 {
-	if (unlikely(wr->opcode == IB_WR_LOCAL_INV &&
-		     wr->send_flags & IB_SEND_FENCE))
+	if (wr->opcode == IB_WR_LOCAL_INV || wr->opcode == IB_WR_REG_MR)
 		return MLX5_FENCE_MODE_STRONG_ORDERING;
 
 	if (unlikely(fence)) {
This will kill performance. Isn't there another fix that can be applied just for the retransmission flow?
I couldn't reproduce that case, but I ran some initial tests in my lab (with my patch above) on non-performance servers:

Initiator: 24 CPUs (2 threads/core, 6 cores/socket, 2 sockets), Connect-IB (same driver, mlx5_ib), kernel 4.10.0, fio test with 24 jobs and iodepth 128, register_always=N.
Target: 1 subsystem with 1 ns (null_blk).

bs    read (without/with patch)    write (without/with patch)
---   --------------------------   ---------------------------
512   1019k / 1008k                1004k / 992k
1k    1021k / 1013k                1002k / 991k
4k    1030k / 1022k                 978k / 969k

CPU usage is 100% in both cases on the initiator side. I haven't seen a difference with bs = 16k. Not as big a drop as we would expect.
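For reference, the run described above roughly corresponds to a fio job file like the sketch below. Only the job count, iodepth, and block sizes are stated in the thread; the device path, I/O engine, and runtime are assumptions for illustration:

```ini
; Hypothetical fio job approximating the test above.
; /dev/nvme0n1, libaio, and the 60s runtime are assumptions,
; not taken from the thread.
[global]
ioengine=libaio
direct=1
iodepth=128
numjobs=24
runtime=60
time_based
group_reporting
filename=/dev/nvme0n1

[randread-4k]
rw=randread
bs=4k
```

The read/write columns would come from separate runs with rw=randread and rw=randwrite at each block size.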
Obviously you won't see a drop without registering memory for small I/O (register_always=N), since that bypasses registration altogether... Please retest with register_always=Y.