Re: [PATCH 1/2] drm/amdgpu: Warn when bad pages approaches 90% threshold

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 2021-10-21 um 12:35 p.m. schrieb Luben Tuikov:
> On 2021-10-21 11:57, Kent Russell wrote:
>> dmesg doesn't warn when the number of bad pages approaches the
>> threshold for page retirement. WARN when the number of bad pages
>> is at 90% or greater for easier checks and planning, instead of waiting
>> until the GPU is full of bad pages.
>>
>> Cc: Luben Tuikov <luben.tuikov@xxxxxxx>
>> Cc: Mukul Joshi <Mukul.Joshi@xxxxxxx>
>> Signed-off-by: Kent Russell <kent.russell@xxxxxxx>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> index f4c05ff4b26c..ce5089216474 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
>> @@ -1077,6 +1077,12 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control,
>>  		if (res)
>>  			DRM_ERROR("RAS table incorrect checksum or error:%d\n",
>>  				  res);
>> +
>> +		/* Warn if we are at 90% of the threshold or above */
> The kernel uses a couple of styles, this is one of them:
>
> /* Warn ...
>  */

That's a waste of space. That means there can be no single-line comments?

checkpatch.pl complains about multi-line comments that don't have the
final */ on their own line. But a single line comment is OK. Let's stick
with the coding style recommended by checkpatch.pl. If we make up our
own arbitrary rules for different reviewers and different parts of the
code, I think it becomes a mine-field of pointless cosmetic fixes for
anyone trespassing on unfamiliar code.


> if (...)
>
> Please use this style as it is used extensively in the amdgpu_ras_eeprom.c file.
>
>> +		if ((10 * control->ras_num_recs) >= (ras->bad_page_cnt_threshold * 9))
> You don't need the extra parenthesis around multiplication--it has higher precedence than relational operators--drop the extra parenthesis.

I agree. With that fixed, the patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>


>
> Regards,
> Luben
>
>> +			DRM_WARN("RAS records:%u exceeds 90%% of threshold:%d",
>> +					control->ras_num_recs,
>> +					ras->bad_page_cnt_threshold);
>>  	} else if (hdr->header == RAS_TABLE_HDR_BAD &&
>>  		   amdgpu_bad_page_threshold != 0) {
>>  		res = __verify_ras_table_checksum(control);



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux