Re: [PATCH v2] drm/amdgpu: Fix error handling in amdgpu_ras_add_bad_pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ping!?

On 12/17/2024 3:08 PM, Srinivasan Shanmugam wrote:
It ensures that appropriate error codes are returned when an error
condition is detected

Fixes the below;
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2849 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_umc_pages_in_a_row()' failed.
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2884 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_ras_mca2pa()' failed.

Fixes: 9fe61c21405a ("drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modes")
Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Cc: YiPeng Chai <yipeng.chai@xxxxxxx>
Cc: Tao Zhou <tao.zhou1@xxxxxxx>
Cc: Hawking Zhang <Hawking.Zhang@xxxxxxx>
Cc: Christian König <christian.koenig@xxxxxxx>
Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@xxxxxxx>
---
v2:
  - s/-EIO/-EINVAL, retained the use of -EINVAL from
    amdgpu_umc_pages_in_a_row & and amdgpu_ras_mca2pa_by_idx, when the
    RAS context is not initialized or the convert_ras_err_addr function is
    unavailable. (Thomas)

  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 21 ++++++++++++++++-----
  1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 01c947066a2e..f1371d1f8421 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2832,8 +2832,10 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
mutex_lock(&con->recovery_lock);
  	data = con->eh_data;
-	if (!data)
+	if (!data) {
+		ret = -EINVAL;
  		goto free;
+	}
for (i = 0; i < pages; i++) {
  		if (from_rom &&
@@ -2845,26 +2847,34 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
  						 * one row
  						 */
  						if (amdgpu_umc_pages_in_a_row(adev, &err_data,
-								bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
+									      bps[i].retired_page <<
+									      AMDGPU_GPU_PAGE_SHIFT)) {
+							ret = -EINVAL;
  							goto free;
-						else
+						} else {
  							find_pages_per_pa = true;
+						}
  					} else {
  						/* unsupported cases */
+						ret = -EOPNOTSUPP;
  						goto free;
  					}
  				}
  			} else {
  				if (amdgpu_umc_pages_in_a_row(adev, &err_data,
-						bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
+						bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT)) {
+					ret = -EINVAL;
  					goto free;
+				}
  			}
  		} else {
  			if (from_rom && !find_pages_per_pa) {
  				if (bps[i].retired_page & UMC_CHANNEL_IDX_V2) {
  					/* bad page in any NPS mode in eeprom */
-					if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data))
+					if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data)) {
+						ret = -EINVAL;
  						goto free;
+					}
  				} else {
  					/* legacy bad page in eeprom, generated only in
  					 * NPS1 mode
@@ -2881,6 +2891,7 @@ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev,
  							/* non-nps1 mode, old RAS TA
  							 * can't support it
  							 */
+							ret = -EOPNOTSUPP;
  							goto free;
  						}
  					}



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux