Re: [PATCH 2/2] drm/amdkfd: change svm range evict

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2022-06-29 18:53, Eric Huang wrote:


On 2022-06-29 18:20, Felix Kuehling wrote:
On 2022-06-28 17:43, Eric Huang wrote:
Two changes:
1. reducing unnecessary evict/unmap when range is not mapped to gpu.
2. adding always evict when flags is set to always_mapped.

Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx>
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 ++++++++--
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 4bf2f75f853b..76e817687ef9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1767,12 +1767,16 @@ svm_range_evict(struct svm_range *prange, struct mm_struct *mm,
      struct kfd_process *p;
      int r = 0;
  +    if (!prange->mapped_to_gpu)
+        return 0;

This feels like an unrelated optimization that should be in a separate patch.

But I'm not sure this is correct, because it doesn't consider child ranges. svm_range_unmap_from_gpus already contains this check, so ranges should not be unmapped unnecessarily either way. Is there any other benefit to this change that I'm missing?
I will send another patch separately that considers child ranges.

I think this should only be done in the XNACK-off case. For XNACK-on it's already handled in the svm_range_unmap_from_gpus.


The benefit is it will reduce unnecessary queue evicts when allocating context save memory, which is unmapped to gpu.

That sounds wrong. The context save area should never be unmapped from GPU. That's the whole point of setting the KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag. I guess this is happening while migrating the context save area to VRAM for the first time, even before it's mapped to GPU?

I think there may be another eviction, when the CWSR header is initialized by the CPU. That would also migrate it back to system memory. To avoid that, you should probably register the context save area only after the header has been initialized.

I think avoiding an eviction when memory is migrated when it is first registered is worthwhile, not just for CWSR.


It is for efficiency reason. On the other hand, without this optimization KFDCWSRTest.InterruptRestore fails with queue preemption error.

What do you mean by "queue preemption error"? Does HWS hang?


I think the reason is the extra queue evicts make HWS too busy to preempt existing queues. There is one unmap_queue packet sent to HWS in current code, and will be three unmap_queue packets with unified memory allocation.

When queues of a process are already evicted, they should not get evicted again. That's handled by the qpd->evicted counter. There should never be multiple unmap_queues packets in flight at the same time. If you're seeing three unmap_queues, you should also see queues restored three times.

HWS should never be too busy to evict queues. If you're seeing preemptions fail, you may have found a bug.

Regards,
  Felix


So this optimization will keep only one unmap_queue as before.

Regards,
Eric

Regards,
  Felix


+
      p = container_of(svms, struct kfd_process, svms);
        pr_debug("invalidate svms 0x%p prange [0x%lx 0x%lx] [0x%lx 0x%lx]\n",
           svms, prange->start, prange->last, start, last);
  -    if (!p->xnack_enabled) {
+    if (!p->xnack_enabled ||
+        (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) {
          int evicted_ranges;
            list_for_each_entry(pchild, &prange->child_list, child_list) { @@ -3321,7 +3325,9 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
          if (r)
              goto out_unlock_range;
  -        if (migrated && !p->xnack_enabled) {
+        if (migrated && (!p->xnack_enabled ||
+            (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
+            prange->mapped_to_gpu) {
              pr_debug("restore_work will update mappings of GPUs\n");
              mutex_unlock(&prange->migrate_mutex);
              continue;




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux