During driver probe, EEPROM I2C init may find the bad GPU tag in the
table header (the tag is set when the number of reported bad pages
reached the bad page threshold during the driver's previous run).
There are several strategies for handling this:

1. When the module parameter amdgpu_bad_page_threshold = 0, the page
retirement feature is disabled, so simply resetting the EEPROM is fine.

2. When amdgpu_bad_page_threshold is not 0 and the user has set a
larger valid value so that the current boot can succeed, reset the
EEPROM data and do not break booting.

3. In all other cases, driver probe fails.

Signed-off-by: Guchun Chen <guchun.chen@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index be895dc2d739..02933050081b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -248,6 +248,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
 	struct amdgpu_device *adev = to_amdgpu_device(control);
 	unsigned char buff[EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE] = { 0 };
 	struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr;
+	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 	struct i2c_msg msg = {
 			.addr	= 0,
 			.flags	= I2C_M_RD,
@@ -287,9 +288,15 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,

 	} else if ((hdr->header == EEPROM_TABLE_HDR_BAD) &&
 			(amdgpu_bad_page_threshold != 0)) {
-		*exceed_err_limit = true;
-		DRM_ERROR("Exceeding the bad_page_threshold parameter, "
+		if (ras->bad_page_cnt_threshold > control->num_recs) {
+			DRM_INFO("One valid bigger bad page threshold is "
+				"used, reset eeprom.\n");
+			ret = amdgpu_ras_eeprom_reset_table(control);
+		} else {
+			*exceed_err_limit = true;
+			DRM_ERROR("Exceeding the bad_page_threshold parameter, "
 				"disabling the GPU.\n");
+		}
 	} else {
 		DRM_INFO("Creating new EEPROM table");

-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
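
For reference, the three cases from the commit message can be sketched as a small standalone decision function. This is only an illustrative model, not driver code: the enum names, the function eeprom_init_action() and its parameters are hypothetical. param_threshold stands in for amdgpu_bad_page_threshold, effective_threshold for ras->bad_page_cnt_threshold, and num_recs for control->num_recs.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the EEPROM header tags (illustrative only). */
enum hdr_tag { HDR_VALID, HDR_BAD, HDR_UNKNOWN };

/* Hypothetical outcomes of the init path. */
enum init_action {
	ACT_USE_TABLE,   /* existing table is kept */
	ACT_RESET_TABLE, /* table is reset, probe continues */
	ACT_FAIL_PROBE,  /* exceed_err_limit would be set, probe is broken */
};

/*
 * Model of the branch structure touched by the patch.
 * All names here are assumptions made for the sketch.
 */
static enum init_action eeprom_init_action(enum hdr_tag tag,
					   int param_threshold,
					   uint32_t effective_threshold,
					   uint32_t num_recs)
{
	if (tag == HDR_VALID)
		return ACT_USE_TABLE;

	if (tag == HDR_BAD && param_threshold != 0) {
		/* Case 2: a larger valid threshold lets this boot succeed. */
		if (effective_threshold > num_recs)
			return ACT_RESET_TABLE;
		/* Case 3: threshold still exceeded, break the probe. */
		return ACT_FAIL_PROBE;
	}

	/* Case 1 (retirement disabled) or no valid table: start fresh. */
	return ACT_RESET_TABLE;
}
```

With a bad tag and 10 stored records, the model returns ACT_RESET_TABLE for a parameter of 0 or an effective threshold of 100, and ACT_FAIL_PROBE for an effective threshold of 10.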