[AMD Official Use Only] Thanks! Reviewed-by: Hawking Zhang <Hawking.Zhang@xxxxxxx> Regards, Hawking -----Original Message----- From: Stanley.Yang <Stanley.Yang@xxxxxxx> Sent: Friday, November 26, 2021 21:26 To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Clements, John <John.Clements@xxxxxxx>; Zhou1, Tao <Tao.Zhou1@xxxxxxx>; Li, Candice <Candice.Li@xxxxxxx>; Chai, Thomas <YiPeng.Chai@xxxxxxx> Cc: Yang, Stanley <Stanley.Yang@xxxxxxx> Subject: [PATCH Review 1/1] drm/amdgpu: fix disable ras feature failed when unload drvier v2 v2: still need call ras_disable_all_featrures to handle ras initilization failure case. Function amdgpu_device_fini_hw is called before amdgpu_device_fini_sw, so ras ta will unload before send ras disable command, ras dsiable operation must before hw fini. Signed-off-by: Stanley.Yang <Stanley.Yang@xxxxxxx> --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 1 - 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 73ec46140d68..d5e642e90010 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2838,8 +2838,6 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev) if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done) amdgpu_virt_release_ras_err_handler_data(adev); - amdgpu_ras_pre_fini(adev); - if (adev->gmc.xgmi.num_physical_nodes > 1) amdgpu_xgmi_remove_device(adev); @@ -3959,6 +3957,9 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev) amdgpu_fbdev_fini(adev); + /* disable ras feature must before hw fini */ + amdgpu_ras_pre_fini(adev); + amdgpu_device_ip_fini_early(adev); amdgpu_irq_fini_hw(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 39dfd4d59881..9400029ac3bf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -2484,7 +2484,6 @@ void amdgpu_ras_late_fini(struct amdgpu_device *adev, amdgpu_ras_sysfs_remove(adev, ras_block); if (ih_info->cb) amdgpu_ras_interrupt_remove_handler(adev, ih_info); - amdgpu_ras_feature_enable(adev, ras_block, 0); } /* do some init work after IP late init as dependence. -- 2.17.1