Re: [PATCH 2/2] drm/amdkfd: change svm range evict

On 2022-06-29 18:20, Felix Kuehling wrote:
On 2022-06-28 17:43, Eric Huang wrote:
Two changes:
1. Avoid unnecessary evict/unmap when the range is not mapped to the GPU.
2. Always evict when the KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag is set.

Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx>
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 ++++++++--
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 4bf2f75f853b..76e817687ef9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1767,12 +1767,16 @@ svm_range_evict(struct svm_range *prange, struct mm_struct *mm,
      struct kfd_process *p;
      int r = 0;
+    if (!prange->mapped_to_gpu)
+        return 0;

This feels like an unrelated optimization that should be in a separate patch.

But I'm not sure this is correct, because it doesn't consider child ranges. svm_range_unmap_from_gpus already contains this check, so ranges should not be unmapped unnecessarily either way. Is there any other benefit to this change that I'm missing?
I will send a separate patch that takes child ranges into account. The benefit is fewer unnecessary queue evictions when allocating context save memory, which is not mapped to the GPU, so this is an efficiency improvement. In addition, without this optimization KFDCWSRTest.InterruptRestore fails with a queue preemption error. I think the extra queue evictions make the HWS too busy to preempt existing queues. The current code sends one unmap_queue packet to the HWS; with unified memory allocation it would send three. With this optimization only one unmap_queue packet is sent, as before.
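
To make the child-range concern concrete, here is a minimal sketch of an early-exit check that also looks at child ranges. It is not the kernel code: struct range_sketch and range_or_child_mapped() are hypothetical names, and a plain array stands in for the kernel's child_list. It only assumes that each child range carries its own mapped_to_gpu state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct svm_range (not the kernel type). */
struct range_sketch {
    bool mapped_to_gpu;              /* this range has GPU mappings */
    struct range_sketch *children;   /* stand-in for prange->child_list */
    size_t n_children;
};

/*
 * Return true if the parent range or any of its child ranges is mapped
 * to a GPU. Eviction could be skipped only when this returns false.
 */
bool range_or_child_mapped(const struct range_sketch *r)
{
    if (r->mapped_to_gpu)
        return true;

    for (size_t i = 0; i < r->n_children; i++) {
        if (r->children[i].mapped_to_gpu)
            return true;
    }
    return false;
}
```

Under this sketch, a parent with mapped_to_gpu == false but one mapped child would still be evicted, which is the case a check on the parent alone would miss.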

Regards,
Eric

Regards,
  Felix


+
      p = container_of(svms, struct kfd_process, svms);
        pr_debug("invalidate svms 0x%p prange [0x%lx 0x%lx] [0x%lx 0x%lx]\n",
           svms, prange->start, prange->last, start, last);
-    if (!p->xnack_enabled) {
+    if (!p->xnack_enabled ||
+        (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) {
          int evicted_ranges;
            list_for_each_entry(pchild, &prange->child_list, child_list) {
@@ -3321,7 +3325,9 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
          if (r)
              goto out_unlock_range;
-        if (migrated && !p->xnack_enabled) {
+        if (migrated && (!p->xnack_enabled ||
+            (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
+            prange->mapped_to_gpu) {
              pr_debug("restore_work will update mappings of GPUs\n");
              mutex_unlock(&prange->migrate_mutex);
              continue;
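
The new condition in svm_range_set_attr can be read as a standalone predicate. The sketch below is only an illustration of the boolean logic in the hunk above: defer_to_restore_work() is a hypothetical helper name, and SKETCH_FLAG_GPU_ALWAYS_MAPPED is a placeholder bit standing in for KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Placeholder bit for this sketch only. The real flag is
 * KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED from the KFD uapi header.
 */
#define SKETCH_FLAG_GPU_ALWAYS_MAPPED (1u << 0)

/*
 * Mirrors the patched condition: after a migration, leave the GPU mapping
 * update to restore_work when the range was evicted (XNACK off, or the
 * always-mapped flag set) and the range is actually mapped to a GPU.
 */
bool defer_to_restore_work(bool migrated, bool xnack_enabled,
                           uint32_t flags, bool mapped_to_gpu)
{
    return migrated &&
           (!xnack_enabled || (flags & SKETCH_FLAG_GPU_ALWAYS_MAPPED)) &&
           mapped_to_gpu;
}
```

Compared with the old condition (migrated && !xnack_enabled), the predicate now also holds with XNACK enabled when the always-mapped flag is set, and no longer holds for ranges that are not mapped to any GPU.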



