[PATCH 4/4] drm/amdgpu: Implement ignore_bad_page_threshold parameter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



If the ignore_bad_page_threshold kernel parameter is set to true,
continue to post the GPU. Print an warning to dmesg that this action has
been done, and that page retirement will obviously not work for said GPU

Cc: Luben Tuikov <luben.tuikov@xxxxxxx>
Cc: Mukul Joshi <Mukul.Joshi@xxxxxxx>
Signed-off-by: Kent Russell <kent.russell@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 7bb506a0ebd6..63a0548a05bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -1108,11 +1108,16 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 			res = amdgpu_ras_eeprom_correct_header_tag(control,
 								   RAS_TABLE_HDR_VAL);
 		} else {
-			*exceed_err_limit = true;
-			dev_err(adev->dev,
-				"RAS records:%d exceed threshold:%d, "
-				"GPU will not be initialized. Replace this GPU or increase the threshold",
+			dev_err(adev->dev, "RAS records:%d exceed threshold:%d",
 				control->ras_num_recs, ras->bad_page_cnt_threshold);
+			if (amdgpu_ignore_bad_page_threshold) {
+				dev_warn(adev->dev, "GPU will be initialized due to ignore_bad_page_threshold.");
+				dev_warn(adev->dev, "Page retirement will not work for this GPU in this state.");
+				res = 0;
+			} else {
+				*exceed_err_limit = true;
+				dev_err(adev->dev, "GPU will not be initialized. Replace this GPU or increase the threshold.");
+			}
 		}
 	} else {
 		DRM_INFO("Creating a new EEPROM table");
-- 
2.25.1




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux