On 2/14/2025 12:14 PM, Zhang, Jesse(Jie) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Lijo,
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
> Sent: Friday, February 14, 2025 2:10 PM
> To: Zhang, Jesse(Jie) <Jesse.Zhang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Kim, Jonathan <Jonathan.Kim@xxxxxxx>; Zhu, Jiadong <Jiadong.Zhu@xxxxxxx>; Prosyak, Vitaly <Vitaly.Prosyak@xxxxxxx>
> Subject: Re: [PATCH 2/2] drm/amdgpu: Enable per-queue reset support
>
>
>
> On 2/14/2025 11:25 AM, jesse.zhang@xxxxxxx wrote:
>> From: "Jesse.zhang@xxxxxxx" <Jesse.zhang@xxxxxxx>
>>
>> This patch updates the SDMA v4.4.2 software initialization to enable
>> per-queue reset support when the MEC firmware version is 0xb0 or
>> higher and the PMFW supports SDMA reset.
>>
>> The following changes are included:
>> - Added a condition to check if the MEC firmware version is at least 0xb0 and if
>>   the PMFW supports SDMA reset using `amdgpu_dpm_reset_sdma_is_supported`.
>> - If both conditions are met, the `AMDGPU_RESET_TYPE_PER_QUEUE` flag is set in
>>   `adev->sdma.supported_reset`.
>>
>> Suggested-by: Jonathan Kim <Jonathan.Kim@xxxxxxx>
>> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@xxxxxxx>
>> Signed-off-by: Jesse Zhang <jesse.zhang@xxxxxxx>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> index b24a1ff5d743..e01d97b96655 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> @@ -1481,9 +1481,10 @@ static int sdma_v4_4_2_sw_init(struct amdgpu_ip_block *ip_block)
>>  		}
>>  	}
>>
>> -	/* TODO: Add queue reset mask when FW fully supports it */
>>  	adev->sdma.supported_reset =
>>  		amdgpu_get_soft_full_reset_mask(&adev->sdma.instance[0].ring);
>> +	if (adev->gfx.mec_fw_version >= 0xb0 && amdgpu_dpm_reset_sdma_is_supported(adev))
>> +		adev->sdma.supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;
>
> This function is reused across multiple IP versions. MEC fw versions aren't the same across those IP versions.
>
> In fact, the user queue relies on both the MEC fw and the pmfw when the sdma queue is reset.
> So we need to check both of them here to skip old MEC fw and pmfw.
>

To make it clear - MEC FW >= 0xb0 has reset support for, say, GC 9.4.3.
With GC 9.5.0, MEC FW 0x20 may have the same support.

Thanks,
Lijo

> Thanks
> Jesse
>
> Thanks,
> Lijo
>
>>
>>  	if (amdgpu_sdma_ras_sw_init(adev)) {
>>  		dev_err(adev->dev, "fail to initialize sdma ras block\n");
>
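
For illustration only, the point about IP versions can be sketched as a standalone C program in which the minimum MEC firmware version is selected per GC IP version rather than hard-coded. Everything below is a hypothetical stand-in (`struct mock_dev`, `GC_VERSION`, `RESET_TYPE_PER_QUEUE`, both function names), not the real amdgpu definitions, and the 0x20 threshold for GC 9.5.0 is just the example value mentioned in the thread, not a confirmed number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real amdgpu types or macros. */
#define GC_VERSION(maj, min, rev) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(rev))
#define RESET_TYPE_PER_QUEUE (1u << 2) /* illustrative bit value only */

struct mock_dev {
	uint32_t gc_version;      /* GC IP version of the device */
	uint32_t mec_fw_version;  /* loaded MEC firmware version */
	bool pmfw_sdma_reset;     /* whether the PMFW reports SDMA reset support */
	uint32_t supported_reset; /* mask of supported reset types */
};

/* The minimum MEC firmware version depends on the GC IP version. */
static bool mec_fw_supports_queue_reset(const struct mock_dev *dev)
{
	assert(dev != NULL);

	switch (dev->gc_version) {
	case GC_VERSION(9, 4, 3):
		return dev->mec_fw_version >= 0xb0; /* threshold from the patch */
	case GC_VERSION(9, 5, 0):
		return dev->mec_fw_version >= 0x20; /* example value from the thread */
	default:
		return false; /* unknown IP version: do not advertise per-queue reset */
	}
}

/* Advertise per-queue reset only when both MEC firmware and PMFW support it. */
static void update_sdma_reset_mask(struct mock_dev *dev)
{
	if (mec_fw_supports_queue_reset(dev) && dev->pmfw_sdma_reset)
		dev->supported_reset |= RESET_TYPE_PER_QUEUE;
}
```

With a single hard-coded `>= 0xb0` check, a GC 9.5.0 part running MEC FW 0x20 would be treated as lacking support; keying the threshold on the IP version avoids that, assuming the per-version minimums are known.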