Re: [PATCH] vfio/mlx5: Align the page tracking max message size with the device capability

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03/12/2024 17:07, Cédric Le Goater wrote:
Hello Yishai,

On 11/25/24 12:32, Yishai Hadas wrote:
Align the page tracking maximum message size with the device's
capability instead of relying on PAGE_SIZE.

This adjustment resolves a mismatch on systems where PAGE_SIZE is 64K,
but the firmware only supports a maximum message size of 4K.

Now that we rely on the device's capability for max_message_size, we
must account for potential future increases in its value.

Key considerations include:
- Supporting message sizes that exceed a single system page (e.g., an 8K
   message on a 4K system).
- Ensuring the RQ size is adjusted to accommodate at least 4
   WQEs/messages, in line with the device specification.

Perhaps theses changes deserve two separate patches ?

I had considered that as well at some point.

However, once we transitioned to use the firmware maximum message size capability, the code needed to be prepared for any future change to that value. Failing to do so, could result in a broken patch.

Since this is a Fixes patch, I believe it’s better to provide a single functional patch rather than splitting it into two.



The above has been addressed as part of the patch.

Fixes: 79c3cf279926 ("vfio/mlx5: Init QP based resources for dirty tracking")
Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxx>
---
  drivers/vfio/pci/mlx5/cmd.c | 47 +++++++++++++++++++++++++++----------
  1 file changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index 7527e277c898..a61d303d9b6a 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -1517,7 +1517,8 @@ int mlx5vf_start_page_tracker(struct vfio_device *vdev,
      struct mlx5_vhca_qp *host_qp;
      struct mlx5_vhca_qp *fw_qp;
      struct mlx5_core_dev *mdev;
-    u32 max_msg_size = PAGE_SIZE;
+    u32 log_max_msg_size;
+    u32 max_msg_size;
      u64 rq_size = SZ_2M;
      u32 max_recv_wr;
      int err;
@@ -1534,6 +1535,12 @@ int mlx5vf_start_page_tracker(struct vfio_device *vdev,
      }
      mdev = mvdev->mdev;
+    log_max_msg_size = MLX5_CAP_ADV_VIRTUALIZATION(mdev, pg_track_log_max_msg_size);
+    max_msg_size = (1ULL << log_max_msg_size);
+    /* The RQ must hold at least 4 WQEs/messages for successful QP creation */
+    if (rq_size < 4 * max_msg_size)
+        rq_size = 4 * max_msg_size;
+
      memset(tracker, 0, sizeof(*tracker));
      tracker->uar = mlx5_get_uars_page(mdev);
      if (IS_ERR(tracker->uar)) {
@@ -1623,25 +1630,41 @@ set_report_output(u32 size, int index, struct mlx5_vhca_qp *qp,
  {
      u32 entry_size = MLX5_ST_SZ_BYTES(page_track_report_entry);
      u32 nent = size / entry_size;
+    u32 nent_in_page;
+    u32 nent_to_set;
      struct page *page;
+    void *page_start;

A variable name of 'kaddr' would reflect better that this is a mapping.

OK, I'll rename as part of V1.


+    u32 page_offset;
+    u32 page_index;
+    u32 buf_offset;

I would move these declarations below under the 'do {} while' loop

We could consider moving most of the variables inside the do { } while block. However, since the majority of the function's body is already within the do { } while, it seems reasonable and cleaner to declare all the variables together at the start of the function.

A part from that, it looks good.

Thanks for your review.


I haven't seen any issues on x86 and I have asked QE to test with
a 64k kernel on ARM

Could you please update once the test is completed successfully on the 64K system ?

After that, I'll send out V1 and include the tested-by clause with the name you'll provide.

Thanks,
Yishai


Thanks,
C.


      u64 addr;
      u64 *buf;
      int i;
-    if (WARN_ON(index >= qp->recv_buf.npages ||
+    buf_offset = index * qp->max_msg_size;
+    if (WARN_ON(buf_offset + size >= qp->recv_buf.npages * PAGE_SIZE ||
              (nent > qp->max_msg_size / entry_size)))
          return;
-    page = qp->recv_buf.page_list[index];
-    buf = kmap_local_page(page);
-    for (i = 0; i < nent; i++) {
-        addr = MLX5_GET(page_track_report_entry, buf + i,
-                dirty_address_low);
-        addr |= (u64)MLX5_GET(page_track_report_entry, buf + i,
-                      dirty_address_high) << 32;
-        iova_bitmap_set(dirty, addr, qp->tracked_page_size);
-    }
-    kunmap_local(buf);
+    do {
+        page_index = buf_offset / PAGE_SIZE;
+        page_offset = buf_offset % PAGE_SIZE;
+        nent_in_page = (PAGE_SIZE - page_offset) / entry_size;
+        page = qp->recv_buf.page_list[page_index];
+        page_start = kmap_local_page(page);
+        buf = page_start + page_offset;
+        nent_to_set = min(nent, nent_in_page);
+        for (i = 0; i < nent_to_set; i++) {
+            addr = MLX5_GET(page_track_report_entry, buf + i,
+                    dirty_address_low);
+            addr |= (u64)MLX5_GET(page_track_report_entry, buf + i,
+                          dirty_address_high) << 32;
+            iova_bitmap_set(dirty, addr, qp->tracked_page_size);
+        }
+        kunmap_local(page_start);
+        buf_offset += (nent_to_set * entry_size);
+        nent -= nent_to_set;
+    } while (nent);
  }
  static void






[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux