Re: [PATCH] drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode

True. It is a bug too. I am looking into it.

Yong

On 2020-01-30 5:51 p.m., Felix Kuehling wrote:
On 2020-01-30 17:29, Yong Zhao wrote:
The sdma_queue_count increment should be done before
execute_queues_cpsch(), which calls pm_calc_rlib_size() where
sdma_queue_count is used to calculate whether over_subscription is
triggered.

With the previous code, when an SDMA queue is created,
compute_queue_count in pm_calc_rlib_size() is one more than the
actual compute queue number, because the queue_count has been
incremented while sdma_queue_count has not. This patch fixes that.
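
The ordering problem described above can be sketched in isolation. This is a minimal model, not the driver code: struct counters and both create_sdma_* helpers are hypothetical, and compute_queue_count() only assumes that the runlist sizing derives the compute count by subtracting sdma_queue_count from queue_count, as the text states.

```c
#include <assert.h>

/* Hypothetical stand-in for the two DQM counters discussed above. */
struct counters {
    unsigned int queue_count;      /* queues counted by the DQM */
    unsigned int sdma_queue_count; /* SDMA queues */
};

/* Assumed derivation: compute queues = all queues - SDMA queues. */
static unsigned int compute_queue_count(const struct counters *c)
{
    assert(c->queue_count >= c->sdma_queue_count);
    return c->queue_count - c->sdma_queue_count;
}

/*
 * Old ordering: the value is sampled (standing in for the call to
 * execute_queues_cpsch()) before sdma_queue_count is incremented.
 */
static unsigned int create_sdma_old_order(struct counters *c)
{
    unsigned int seen;

    c->queue_count++;
    seen = compute_queue_count(c);
    c->sdma_queue_count++;
    return seen;
}

/* New ordering: sdma_queue_count is incremented before the sample. */
static unsigned int create_sdma_new_order(struct counters *c)
{
    c->sdma_queue_count++;
    c->queue_count++;
    return compute_queue_count(c);
}
```

Starting from two compute queues and no SDMA queues, the old ordering samples a compute count of 3 while the new ordering samples 2, which matches the off-by-one described in this commit message.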

Change-Id: I20353e657efd505353d0dd9f7eb2fab5085e7202
Signed-off-by: Yong Zhao <Yong.Zhao@xxxxxxx>

Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>

But I took a look at pm_calc_rlib_size. I don't think subtracting dqm->sdma_queue_count from dqm->queue_count is quite correct, because sdma_queue_count counts all SDMA queues, while queue_count only counts active queues. So an application that creates inactive SDMA queues will also cause errors here. We probably need to count active compute and active SDMA queues separately in DQM to fix this properly.
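
A rough sketch of the mismatch and of one possible shape for separate counters follows. It is only an illustration under assumptions: struct dqm_counters, add_queue(), and the active_compute_count and active_sdma_count fields are hypothetical names, XGMI SDMA queues are left out for brevity, and none of this is taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

enum queue_type { QUEUE_COMPUTE, QUEUE_SDMA };

struct dqm_counters {
    unsigned int queue_count;          /* active queues of any type */
    unsigned int sdma_queue_count;     /* all SDMA queues, active or not */
    unsigned int active_compute_count; /* hypothetical separate counter */
    unsigned int active_sdma_count;    /* hypothetical separate counter */
};

/* Update every counter when a queue is created. */
static void add_queue(struct dqm_counters *c, enum queue_type type,
                      bool is_active)
{
    if (type == QUEUE_SDMA)
        c->sdma_queue_count++;
    if (!is_active)
        return;

    c->queue_count++;
    if (type == QUEUE_SDMA)
        c->active_sdma_count++;
    else
        c->active_compute_count++;

    /* The active total always equals the sum of the per-type counts. */
    assert(c->queue_count ==
           c->active_compute_count + c->active_sdma_count);
}

/* The subtraction under discussion: mixes an active-only count
 * with an all-queues count. */
static unsigned int compute_by_subtraction(const struct dqm_counters *c)
{
    return c->queue_count - c->sdma_queue_count;
}
```

With one active compute queue and one inactive SDMA queue, compute_by_subtraction() yields 0 even though one compute queue is active, while active_compute_count is 1. Because the fields are unsigned, more inactive SDMA queues than active queues would make the subtraction wrap around.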

Regards,
  Felix


---
  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 10 ++++++----
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 2870553a2ce0..80d22bf702e8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1237,16 +1237,18 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
     list_add(&q->list, &qpd->queues_list);
     qpd->queue_count++;
+
+    if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
+        dqm->sdma_queue_count++;
+    else if (q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI)
+        dqm->xgmi_sdma_queue_count++;
+
     if (q->properties.is_active) {
         dqm->queue_count++;
         retval = execute_queues_cpsch(dqm,
                 KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
     }
 
-    if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
-        dqm->sdma_queue_count++;
-    else if (q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI)
-        dqm->xgmi_sdma_queue_count++;
     /*
      * Unconditionally increment this counter, regardless of the queue's
      * type or whether the queue is active.