[Public]
Does the message also need to mention the newly added option to ignore the threshold?
Lijo
From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of Luben Tuikov <luben.tuikov@xxxxxxx>
Sent: Monday, October 25, 2021 9:32:20 PM
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
Cc: Tuikov, Luben <Luben.Tuikov@xxxxxxx>; Russell, Kent <Kent.Russell@xxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>
Subject: [PATCH] drm/amdgpu: Restore information reporting in RAS
A recent patch took away the reporting of the number of RAS records
and the threshold due to the way it was edited/spliced on top of the
code. This patch restores this reporting.
Cc: Kent Russell <kent.russell@xxxxxxx>
Cc: Alex Deucher <Alexander.Deucher@xxxxxxx>
Fixes: 07df2fb092d09e ("drm/amdgpu: Add kernel parameter support for ignoring bad page threshold")
Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index ae64ca02ccc4f8..05117eda105b55 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -1112,7 +1112,10 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
res = 0;
} else {
*exceed_err_limit = true;
- dev_err(adev->dev, "GPU will not be initialized. Replace this GPU or increase the threshold.");
+ dev_err(adev->dev,
+ "RAS records:%d exceed threshold:%d, "
+ "GPU will not be initialized. Replace this GPU or increase the threshold",
+ control->ras_num_recs, ras->bad_page_cnt_threshold);
}
}
} else {
base-commit: b60bccb408c831c685b2a257eff575bcda2cbe9d
--
2.33.1.558.g2bd2f258f4
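
For readers who want to see what the restored message looks like once the two format arguments are filled in, here is a minimal user-space sketch. It is not driver code: format_ras_threshold_msg() is a hypothetical helper, snprintf() stands in for dev_err(), and the plain int parameters are an assumption standing in for control->ras_num_recs and ras->bad_page_cnt_threshold. Only the format string is taken from the patch above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space helper (not part of the amdgpu driver).
 * Formats the message restored by the patch into buf and returns
 * snprintf()'s result. num_recs and threshold are stand-ins for
 * control->ras_num_recs and ras->bad_page_cnt_threshold.
 */
static int format_ras_threshold_msg(char *buf, size_t len,
				    int num_recs, int threshold)
{
	assert(buf != NULL && len > 0);

	/* Adjacent string literals are concatenated by the compiler. */
	return snprintf(buf, len,
			"RAS records:%d exceed threshold:%d, "
			"GPU will not be initialized. Replace this GPU or increase the threshold",
			num_recs, threshold);
}
```

Called with 12 records and a threshold of 10, the helper fills the buffer with "RAS records:12 exceed threshold:10, GPU will not be initialized. Replace this GPU or increase the threshold", which is the shape of the line the patch restores in the kernel log.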