Hi Hawking,

Please check the v2 patch that was just sent out. As discussed, we decided to keep using the current reset_method() function, balancing functionality, change scale, and code maintainability.

Thanks.

Regards,
Ma Le

-----Original Message-----
From: Zhang, Hawking <Hawking.Zhang@xxxxxxx>
Sent: Wednesday, November 27, 2019 7:39 PM
To: Ma, Le <Le.Ma@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Chen, Guchun <Guchun.Chen@xxxxxxx>; Zhou1, Tao <Tao.Zhou1@xxxxxxx>; Li, Dennis <Dennis.Li@xxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Ma, Le <Le.Ma@xxxxxxx>
Subject: RE: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

[AMD Public Use]

It is still necessary to put all the condition checks in one function. I mean a function that decides whether to go with RAS recovery or legacy fatal_error handling. The PMFW version that supports RAS recovery will differ among ASICs. The current version check only works for VG20. In fact, once ras->supported is set and a proper PMFW is detected, RAS recovery will be the best choice, whether it is sGPU or mGPU.

Regards,
Hawking

-----Original Message-----
From: Le Ma <le.ma@xxxxxxx>
Sent: November 27, 2019 17:15
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Chen, Guchun <Guchun.Chen@xxxxxxx>; Zhou1, Tao <Tao.Zhou1@xxxxxxx>; Li, Dennis <Dennis.Li@xxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Ma, Le <Le.Ma@xxxxxxx>
Subject: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

Avoid changing the default reset behavior for production cards by checking that amdgpu_ras_enable equals 2. Also, only a new enough SMU ucode can support BACO for the xgmi/ras case.

Change-Id: I07c3e6862be03e068745c73db8ea71f428ecba6b
Signed-off-by: Le Ma <le.ma@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 951327f..6202333 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -577,7 +577,9 @@ soc15_asic_reset_method(struct amdgpu_device *adev)
 			struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, 0);
 			struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 
-			if (hive || (ras && ras->supported))
+			if ((hive || (ras && ras->supported)) &&
+			    (amdgpu_ras_enable != 2 ||
+			    adev->pm.fw_version <= 0x283400))
 				baco_reset = false;
 		}
 		break;
-- 
2.7.4
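
For readers following the thread, the condition in this hunk can be read as a veto on BACO: when the device is in an XGMI hive or has RAS support, BACO is turned off unless amdgpu_ras_enable is 2 and the PM firmware version is above 0x283400. The standalone sketch below only models that boolean logic and is not driver code. The struct reset_inputs type, its field names, the macro name, and the helper baco_reset_vetoed() are hypothetical names introduced for illustration; only the threshold 0x283400 and the value 2 come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state the patch inspects. */
struct reset_inputs {
	bool in_xgmi_hive;      /* models: hive != NULL */
	bool ras_supported;     /* models: ras && ras->supported */
	int ras_enable;         /* models: the amdgpu_ras_enable parameter */
	uint32_t pm_fw_version; /* models: adev->pm.fw_version */
};

/* Versions at or below this value are treated as too old by the patch. */
#define BACO_XGMI_RAS_FW_THRESHOLD 0x283400u

/*
 * Returns true when the patched condition would set baco_reset = false.
 * Returning false only means "not vetoed here"; whether BACO is used at
 * all still depends on checks outside this hunk.
 */
static bool baco_reset_vetoed(const struct reset_inputs *in)
{
	bool xgmi_or_ras = in->in_xgmi_hive || in->ras_supported;
	bool opted_in = (in->ras_enable == 2);
	bool fw_new_enough = (in->pm_fw_version > BACO_XGMI_RAS_FW_THRESHOLD);

	return xgmi_or_ras && (!opted_in || !fw_new_enough);
}
```

Keeping the decision in a single predicate like this is one possible shape for the helper requested in the review, since a per-ASIC firmware threshold could later replace the single constant without touching the callers.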