RE: [PATCH] drm/amdgpu: resolve mGPU RAS query instability

[AMD Official Use Only - Internal Distribution Only]

 

Reviewed-by: Hawking Zhang <Hawking.Zhang@xxxxxxx>

 

Per our discussion, please send a separate patch that replaces every “DRM_INFO” with “dev_info” in the per-IP query_ras_error_count callback functions, so that we have a clear picture of which errors come from which node when all the RAS errors are harvested in one GPU recovery worker.
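
To illustrate why the switch matters on a multi-GPU system, here is a minimal user-space sketch. It is not kernel code: mock_device, mock_drm_info and mock_dev_info are hypothetical stand-ins that only approximate the prefixes the real DRM_INFO and dev_info helpers emit, and the message text and PCI address are made up for the example.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct mock_device {
    const char *name; /* e.g. the PCI address of one GPU node */
};

/* Approximates DRM_INFO(): a global "[drm]" prefix, no device identity. */
static int mock_drm_info(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n = snprintf(buf, len, "[drm] ");

    if (n < 0 || (size_t)n >= len)
        return n;
    va_start(ap, fmt);
    n += vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);
    return n;
}

/* Approximates dev_info(): the prefix names the driver and the device. */
static int mock_dev_info(char *buf, size_t len,
                         const struct mock_device *dev,
                         const char *fmt, ...)
{
    va_list ap;
    int n = snprintf(buf, len, "amdgpu %s: ", dev->name);

    if (n < 0 || (size_t)n >= len)
        return n;
    va_start(ap, fmt);
    n += vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);
    return n;
}
```

With the first helper, the same error line printed for two different nodes is indistinguishable in the log; with the second, each line carries the name of the device it came from. In the driver itself the change would presumably take the form of passing the per-device pointer (adev->dev in amdgpu) as the first argument to dev_info.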

 

Regards,
Hawking

From: Clements, John <John.Clements@xxxxxxx>
Sent: Tuesday, April 7, 2020 11:03
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Chen, Guchun <Guchun.Chen@xxxxxxx>; Li, Dennis <Dennis.Li@xxxxxxx>; Zhou1, Tao <Tao.Zhou1@xxxxxxx>
Subject: [PATCH] drm/amdgpu: resolve mGPU RAS query instability

 


Submitting a patch to resolve an issue where, upon receiving an uncorrectable RAS error, the RAS ISR is triggered on all GPU nodes, creating a race condition between querying the RAS errors and entering the GPU reset sequence.
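
For readers unfamiliar with this kind of race, the following is a generic user-space sketch of one common way to serialize it. It is not taken from the patch: claim_recovery, harvest_all_nodes, node_errors and NUM_NODES are all hypothetical names, and C11 atomics stand in for whatever primitive the driver actually uses. Every node's interrupt handler tries to claim the recovery, exactly one wins, and the winner harvests the error counts of all nodes before the reset would begin.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_NODES 4 /* hypothetical number of GPU nodes */

/* 0 = no recovery in progress, 1 = some node has claimed it. */
static atomic_int recovery_claimed;

/* Hypothetical per-node uncorrectable error counters. */
static int node_errors[NUM_NODES];

/*
 * Called from each node's (mock) interrupt handler.
 * Returns true for exactly one caller until the flag is cleared.
 */
bool claim_recovery(void)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&recovery_claimed, &expected, 1);
}

/* Run only by the winner: collect the counts from every node in one place. */
int harvest_all_nodes(void)
{
    int total = 0;

    for (int i = 0; i < NUM_NODES; i++)
        total += node_errors[i];
    return total;
}

/* Called once the (mock) reset sequence has finished. */
void release_recovery(void)
{
    atomic_store(&recovery_claimed, 0);
}
```

Handlers that lose the claim simply return, so no node queries error state while another is already entering the reset sequence.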

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
