RE: [PATCH 2/2] drm/amdgpu: Enable per-queue reset support

[AMD Official Use Only - AMD Internal Distribution Only]

-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Friday, February 14, 2025 2:54 PM
To: Zhang, Jesse(Jie) <Jesse.Zhang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Kim, Jonathan <Jonathan.Kim@xxxxxxx>; Zhu, Jiadong <Jiadong.Zhu@xxxxxxx>; Prosyak, Vitaly <Vitaly.Prosyak@xxxxxxx>
Subject: Re: [PATCH 2/2] drm/amdgpu: Enable per-queue reset support



On 2/14/2025 12:14 PM, Zhang, Jesse(Jie) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Lijo,
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
> Sent: Friday, February 14, 2025 2:10 PM
> To: Zhang, Jesse(Jie) <Jesse.Zhang@xxxxxxx>;
> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Kim, Jonathan
> <Jonathan.Kim@xxxxxxx>; Zhu, Jiadong <Jiadong.Zhu@xxxxxxx>; Prosyak,
> Vitaly <Vitaly.Prosyak@xxxxxxx>
> Subject: Re: [PATCH 2/2] drm/amdgpu: Enable per-queue reset support
>
>
>
> On 2/14/2025 11:25 AM, jesse.zhang@xxxxxxx wrote:
>> From: "Jesse.zhang@xxxxxxx" <Jesse.zhang@xxxxxxx>
>>
>> This patch updates the SDMA v4.4.2 software initialization to enable
>> per-queue reset support when the MEC firmware version is 0xb0 or
>> higher and the PMFW supports SDMA reset.
>>
>> The following changes are included:
>> - Added a condition to check if the MEC firmware version is at least 0xb0 and if
>>   the PMFW supports SDMA reset using `amdgpu_dpm_reset_sdma_is_supported`.
>> - If both conditions are met, the `AMDGPU_RESET_TYPE_PER_QUEUE` flag is set in
>>   `adev->sdma.supported_reset`.
>>
>> Suggested-by: Jonathan Kim <Jonathan.Kim@xxxxxxx>
>> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@xxxxxxx>
>> Signed-off-by: Jesse Zhang <jesse.zhang@xxxxxxx>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> index b24a1ff5d743..e01d97b96655 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> @@ -1481,9 +1481,10 @@ static int sdma_v4_4_2_sw_init(struct amdgpu_ip_block *ip_block)
>>               }
>>       }
>>
>> -     /* TODO: Add queue reset mask when FW fully supports it */
>>       adev->sdma.supported_reset =
>>               amdgpu_get_soft_full_reset_mask(&adev->sdma.instance[0].ring);
>> +     if (adev->gfx.mec_fw_version >= 0xb0 && amdgpu_dpm_reset_sdma_is_supported(adev))
>> +             adev->sdma.supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;
>
> This function is reused across multiple IP versions. MEC fw versions aren't the same across those IP versions.
>
> In fact, the user queue relies on both the MEC FW and the PMFW when an SDMA queue is reset.
> So we need to check both of them here to skip old MEC FW and PMFW versions.
>

To make it clear -
MEC FW >= 0xb0 has reset support for, say, GC 9.4.3. With GC 9.5.0, MEC FW 0x20 may have the same support.
 Thanks Lijo. Will double check with the MEC FW team about GC 9.5.0.
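
To illustrate the point about per-IP thresholds, here is a minimal standalone sketch, not driver code. It assumes the minimum MEC FW version is looked up per GC IP version instead of being hard-coded to 0xb0. The flag values, the GC_VERSION() macro, and both helper names are hypothetical stand-ins. The 0xb0 threshold for GC 9.4.3 comes from the patch; the 0x20 value for GC 9.5.0 is only the unconfirmed example mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's reset-type flags (values are illustrative). */
#define RESET_TYPE_FULL      (1u << 0)
#define RESET_TYPE_PER_QUEUE (1u << 1)

/* Hypothetical helper: pack major.minor.rev into one comparable value. */
#define GC_VERSION(maj, min, rev) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(rev))

/*
 * Return true if the MEC FW is assumed new enough for SDMA queue reset
 * on the given GC IP version. Unknown IP versions are treated as
 * unsupported so that old firmware is never opted in by accident.
 */
static bool mec_fw_supports_queue_reset(uint32_t gc_version, uint32_t mec_fw_version)
{
	switch (gc_version) {
	case GC_VERSION(9, 4, 3):
		return mec_fw_version >= 0xb0; /* threshold taken from the patch */
	case GC_VERSION(9, 5, 0):
		return mec_fw_version >= 0x20; /* ASSUMPTION: unconfirmed placeholder */
	default:
		return false;
	}
}

/* Build the supported-reset mask: per-queue reset needs both MEC FW and PMFW support. */
static uint32_t sdma_supported_reset(uint32_t base_mask, uint32_t gc_version,
				     uint32_t mec_fw_version, bool pmfw_supports_sdma_reset)
{
	uint32_t mask = base_mask;

	if (pmfw_supports_sdma_reset &&
	    mec_fw_supports_queue_reset(gc_version, mec_fw_version))
		mask |= RESET_TYPE_PER_QUEUE;

	return mask;
}
```

With this shape, adding another IP version only means adding one case with its own threshold, and the PMFW check stays common to all of them.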

Thanks,
Lijo

> Thanks
> Jesse
>
> Thanks,
> Lijo
>
>>
>>       if (amdgpu_sdma_ras_sw_init(adev)) {
>>               dev_err(adev->dev, "fail to initialize sdma ras block\n");
>




