Re: [PATCH] drm/amdkfd: enable heavy-weight TLB flush on Arcturus

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I understand Alex's concern. I think usually we only check the version when a feature is only available in a specific version, and other version or newer version doesn't have.

In case of FW fix, we assume the driver and FWs have to be synchronous. If we have driver backward compatibility for FWs, there must be a lot of redundant codes for FW version check. So this patch and SDMA fix will be pushed into ROCm 5.1 release branch at the same time.

Regards,
Eric

On 2022-01-18 14:35, Alex Deucher wrote:
On Tue, Jan 18, 2022 at 2:27 PM Russell, Kent <Kent.Russell@xxxxxxx> wrote:
[AMD Official Use Only]

I think what he means is that if we are using SDMA v17, this will cause issues, won't it? Should we check that SDMA version is >=18 before enabling it? Or am I misunderstanding the fix?
Yes, that was my concern.

Alex

  Kent

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Eric Huang
Sent: Tuesday, January 18, 2022 2:09 PM
To: Alex Deucher <alexdeucher@xxxxxxxxx>
Cc: amd-gfx list <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [PATCH] drm/amdkfd: enable heavy-weight TLB flush on Arcturus

The SDMA fix is generic and not in a specific version of FW, so we don't
have to check.

Thanks,
Eric

On 2022-01-18 11:35, Alex Deucher wrote:
On Tue, Jan 18, 2022 at 11:16 AM Eric Huang <jinhuieric.huang@xxxxxxx> wrote:
SDMA FW fixes the hang issue for adding heavy-weight TLB
flush on Arcturus, so we can enable it.
Do we need to check for a specific fw version?

Alex

Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx>
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 8 +++++---
   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c         | 3 ++-
   2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index a64cbbd943ba..7b24a920c12e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1892,10 +1892,12 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
                                  true);
          ret = unreserve_bo_and_vms(&ctx, false, false);

-       /* Only apply no TLB flush on Aldebaran to
-        * workaround regressions on other Asics.
+       /* Only apply no TLB flush on Aldebaran and Arcturus
+        * to workaround regressions on other Asics.
           */
-       if (table_freed && (adev->asic_type != CHIP_ALDEBARAN))
+       if (table_freed &&
+           (adev->asic_type != CHIP_ALDEBARAN) &&
+           (adev->asic_type != CHIP_ARCTURUS))
                  *table_freed = true;

          goto out;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index b570c0454ce9..ef4d676cc71c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1806,7 +1806,8 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file
*filep,
          }
          mutex_unlock(&p->mutex);

-       if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2)) {
+       if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2) ||
+           KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1)) {
                  err = amdgpu_amdkfd_gpuvm_sync_memory(dev->adev,
                                  (struct kgd_mem *) mem, true);
                  if (err) {
--
2.25.1





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux