Re: [PATCH 19/20] RDMA/mlx5: Use PA mapping for PI handover

On 6/6/2019 1:59 AM, Sagi Grimberg wrote:


On 5/30/19 6:25 AM, Max Gurtovoy wrote:
If possible, avoid doing a UMR operation to register data and protection
buffers (via MTT/KLM mkeys). Instead, use the local DMA key and map the
SG lists using PA access. This is safe, since the internal key for data
and protection is never exposed to the remote server (only the signature
key might be exposed). If PA mappings are not possible, perform mapping
using MTT/KLM descriptors.

The setup of the tested benchmark (using iSER ULP):
  - 2 servers with 24 cores (1 initiator and 1 target)
  - ConnectX-4/ConnectX-5 adapters
  - 24 target sessions with 1 LUN each
  - ramdisk backstore
  - PI active

Performance results running fio (24 jobs, 128 iodepth) using
write_generate=1 and read_verify=1 (with/without patch):

bs      IOPS(read)        IOPS(write)
----    ----------        ----------
512   1266.4K/1262.4K    1720.1K/1732.1K
4k    793139/570902      1129.6K/773982
32k   72660/72086        97229/96164

Using write_generate=0 and read_verify=0 (with/without patch):

bs      IOPS(read)        IOPS(write)
----    ----------        ----------
512   1590.2K/1600.1K    1828.2K/1830.3K
4k    1078.1K/937272     1142.1K/815304
32k   77012/77369        98125/97435

Signed-off-by: Max Gurtovoy <maxg@xxxxxxxxxxxx>
Signed-off-by: Israel Rukshin <israelr@xxxxxxxxxxxx>
Suggested-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
---
  drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
  drivers/infiniband/hw/mlx5/mr.c      | 63 ++++++++++++++++++++++++++--
  drivers/infiniband/hw/mlx5/qp.c      | 80 ++++++++++++++++++++++++------------
  3 files changed, 114 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 6039a1fc80a1..97c8534c5802 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -609,6 +609,7 @@ struct mlx5_ib_mr {
      struct mlx5_ib_mr      *pi_mr;
      struct mlx5_ib_mr      *klm_mr;
      struct mlx5_ib_mr      *mtt_mr;
+    u64            data_iova;
      u64            pi_iova;
        atomic_t        num_leaf_free;
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 74cec8af158a..9025b477d065 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -2001,6 +2001,40 @@ int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask,
      return ret;
  }
+static int
+mlx5_ib_map_pa_mr_sg_pi(struct ib_mr *ibmr, struct scatterlist *data_sg,
+            int data_sg_nents, unsigned int *data_sg_offset,
+            struct scatterlist *meta_sg, int meta_sg_nents,
+            unsigned int *meta_sg_offset)
+{
+    struct mlx5_ib_mr *mr = to_mmr(ibmr);
+    unsigned int sg_offset = 0;
+    int n = 0;
+
+    mr->meta_length = 0;
+    if (data_sg_nents == 1) {
+        n++;
+        mr->ndescs = 1;
+        if (data_sg_offset)
+            sg_offset = *data_sg_offset;
+        mr->data_length = sg_dma_len(data_sg) - sg_offset;
+        mr->data_iova = sg_dma_address(data_sg) + sg_offset;
+        if (meta_sg_nents == 1) {
+            n++;
+            mr->meta_ndescs = 1;
+            if (meta_sg_offset)
+                sg_offset = *meta_sg_offset;
+            else
+                sg_offset = 0;
+            mr->meta_length = sg_dma_len(meta_sg) - sg_offset;
+            mr->pi_iova = sg_dma_address(meta_sg) + sg_offset;
+        }
+        ibmr->length = mr->data_length + mr->meta_length;

If I'm reading this correctly, this is assuming that if data_sg_nents is
1 then meta_sg_nents is either 1 or 0.

Is that really always the case?
No. I've added a counter for returning the number of mapped elements.

What if my I/O was merged and my data pages happen to coalesce (because
they are contiguous) but my meta buffers did not?

Fall back to MTT.

We use PA mapping iff data_nents == 1 and meta_nents is 0 or 1.
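
For illustration only, the rule above can be modelled in a small userspace sketch. This is not the driver code: struct sg_entry, struct pa_mapping, map_pa() and choose_mapping() are hypothetical names invented for this example, and the sketch assumes the caller treats "number of mapped elements equals data_nents + meta_nents" as the success condition for the PA path and otherwise falls back to a descriptor-based (MTT/KLM) mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct sg_entry {
    uint64_t dma_addr;
    uint32_t dma_len;
};

/* Hypothetical result of a PA mapping attempt. */
struct pa_mapping {
    uint64_t data_iova;
    uint64_t pi_iova;
    uint32_t data_length;
    uint32_t meta_length;
};

enum map_mode { MAP_PA, MAP_FALLBACK };

/*
 * Try to describe the buffers by physical address only.
 * Returns the number of SG elements that were mapped (0, 1 or 2),
 * mirroring the counter mentioned in the reply above.
 */
int map_pa(struct pa_mapping *m,
           const struct sg_entry *data, int data_nents, uint32_t data_off,
           const struct sg_entry *meta, int meta_nents, uint32_t meta_off)
{
    int n = 0;

    /* Assumption of this sketch: there is always at least one data entry. */
    assert(data_nents >= 1 && meta_nents >= 0);

    m->data_iova = 0;
    m->pi_iova = 0;
    m->data_length = 0;
    m->meta_length = 0;

    if (data_nents == 1) {
        n++;
        m->data_length = data[0].dma_len - data_off;
        m->data_iova = data[0].dma_addr + data_off;
        if (meta_nents == 1) {
            n++;
            m->meta_length = meta[0].dma_len - meta_off;
            m->pi_iova = meta[0].dma_addr + meta_off;
        }
    }
    return n;
}

/* PA mapping is usable only if every element was covered. */
enum map_mode choose_mapping(struct pa_mapping *m,
                             const struct sg_entry *data, int data_nents,
                             uint32_t data_off,
                             const struct sg_entry *meta, int meta_nents,
                             uint32_t meta_off)
{
    int n = map_pa(m, data, data_nents, data_off, meta, meta_nents, meta_off);

    return (n == data_nents + meta_nents) ? MAP_PA : MAP_FALLBACK;
}
```

With one data entry and three meta entries (the merged-I/O case raised above), map_pa() returns 1, which differs from 1 + 3, so choose_mapping() reports MAP_FALLBACK rather than silently ignoring the extra meta entries.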
