On Tue, Jan 18, 2022 at 3:13 PM Eric Huang <jinhuieric.huang@xxxxxxx> wrote: > > I understand Alex's concern. I think usually we only check the version > when a feature is only available in a specific version, and other > version or newer version doesn't have. > > In case of FW fix, we assume the driver and FWs have to be synchronous. > If we have driver backward compatibility for FWs, there must be a lot of > redundant codes for FW version check. So this patch and SDMA fix will be > pushed into ROCm 5.1 release branch at the same time. We need to check. The firmwares are not distributed in lock step with the driver. Alex > > Regards, > Eric > > On 2022-01-18 14:35, Alex Deucher wrote: > > On Tue, Jan 18, 2022 at 2:27 PM Russell, Kent <Kent.Russell@xxxxxxx> wrote: > >> [AMD Official Use Only] > >> > >> I think what he means is that if we are using SDMA v17, this will cause issues, won't it? Should we check that SDMA version is >=18 before enabling it? Or am I misunderstanding the fix? > > Yes, that was my concern. > > > > Alex > > > >> Kent > >> > >>> -----Original Message----- > >>> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Eric Huang > >>> Sent: Tuesday, January 18, 2022 2:09 PM > >>> To: Alex Deucher <alexdeucher@xxxxxxxxx> > >>> Cc: amd-gfx list <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> > >>> Subject: Re: [PATCH] drm/amdkfd: enable heavy-weight TLB flush on Arcturus > >>> > >>> The SDMA fix is generic and not in a specific version of FW, so we don't > >>> have to check. > >>> > >>> Thanks, > >>> Eric > >>> > >>> On 2022-01-18 11:35, Alex Deucher wrote: > >>>> On Tue, Jan 18, 2022 at 11:16 AM Eric Huang <jinhuieric.huang@xxxxxxx> wrote: > >>>>> SDMA FW fixes the hang issue for adding heavy-weight TLB > >>>>> flush on Arcturus, so we can enable it. > >>>> Do we need to check for a specific fw version? > >>>> > >>>> Alex > >>>> > >>>>> Signed-off-by: Eric Huang <jinhuieric.huang@xxxxxxx> > >>>>> --- > >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 8 +++++--- > >>>>> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++- > >>>>> 2 files changed, 7 insertions(+), 4 deletions(-) > >>>>> > >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > >>>>> index a64cbbd943ba..7b24a920c12e 100644 > >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > >>>>> @@ -1892,10 +1892,12 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu( > >>>>> true); > >>>>> ret = unreserve_bo_and_vms(&ctx, false, false); > >>>>> > >>>>> - /* Only apply no TLB flush on Aldebaran to > >>>>> - * workaround regressions on other Asics. > >>>>> + /* Only apply no TLB flush on Aldebaran and Arcturus > >>>>> + * to workaround regressions on other Asics. > >>>>> */ > >>>>> - if (table_freed && (adev->asic_type != CHIP_ALDEBARAN)) > >>>>> + if (table_freed && > >>>>> + (adev->asic_type != CHIP_ALDEBARAN) && > >>>>> + (adev->asic_type != CHIP_ARCTURUS)) > >>>>> *table_freed = true; > >>>>> > >>>>> goto out; > >>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > >>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > >>>>> index b570c0454ce9..ef4d676cc71c 100644 > >>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > >>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > >>>>> @@ -1806,7 +1806,8 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file > >>> *filep, > >>>>> } > >>>>> mutex_unlock(&p->mutex); > >>>>> > >>>>> - if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2)) { > >>>>> + if (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2) || > >>>>> + KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1)) { > >>>>> err = amdgpu_amdkfd_gpuvm_sync_memory(dev->adev, > >>>>> (struct kgd_mem *) mem, true); > >>>>> if (err) { > >>>>> -- > >>>>> 2.25.1 > >>>>> >