Hi Felix,
After sync w/ James, we agree that this patch series could fix both our problems, and he verified this patch series will not cause regression of his previous issue. Do you have more comments regarding this patch series ? Thanks.
BRs,
Yifan
From: Zhu, James <James.Zhu@xxxxxxx>
Sent: Wednesday, September 29, 2021 9:19 PM
To: Kuehling, Felix <Felix.Kuehling@xxxxxxx>; Zhang, Yifan <Yifan1.Zhang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init
[AMD Official Use Only]
Since the previous patch can help on PCO suspend/resume hung issue. Let me work with YiFan to see if
there is proper way to cover both cases.
Thanks & Best Regards!
James Zhu
[+James]
This basically undoes James's change "drm/amdgpu: move iommu_resume
before ip init/resume". I assume James made his change for a reason. Can
you please discuss the issue with him and determine a solution that
solves both your problem and his?
If James' patch series was a mistake, I'd prefer to revert his patches,
because his patches complicated the initialization sequence and exposed
the iommu init sequence in amdgpu.
Thanks,
Felix
Am 2021-09-28 um 4:28 a.m. schrieb Yifan Zhang:
> This patch is to fix clinfo failure in Raven/Picasso:
>
> Number of platforms: 1
> Platform Profile: FULL_PROFILE
> Platform Version: OpenCL 2.2 AMD-APP (3364.0)
> Platform Name: AMD Accelerated Parallel Processing
> Platform Vendor: Advanced Micro Devices, Inc.
> Platform Extensions: cl_khr_icd cl_amd_event_callback
>
> Platform Name: AMD Accelerated Parallel Processing Number of devices: 0
>
> Signed-off-by: Yifan Zhang <yifan1.zhang@xxxxxxx>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 4c8f2f4647c0..89ed9b091386 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2393,10 +2393,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
> if (r)
> goto init_failed;
>
> - r = amdgpu_amdkfd_resume_iommu(adev);
> - if (r)
> - goto init_failed;
> -
> r = amdgpu_device_ip_hw_init_phase1(adev);
> if (r)
> goto init_failed;
> @@ -2435,6 +2431,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
> if (!adev->gmc.xgmi.pending_reset)
> amdgpu_amdkfd_device_init(adev);
>
> + r = amdgpu_amdkfd_resume_iommu(adev);
> + if (r)
> + goto init_failed;
> +
> amdgpu_fru_get_product_info(adev);
>
> init_failed:
|