[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Hawking Zhang <Hawking.Zhang@xxxxxxx>

Per discussion, please send a separate patch to replace all the “DRM_INFO” calls with “dev_info” in the per-IP query_ras_error_count callback functions, so that we have a clear picture of which errors come from which nodes when harvesting all the RAS errors in one GPU recovery worker.

Regards,

From: Clements, John <John.Clements@xxxxxxx>

Submitting a patch to resolve an issue where, upon receiving an uncorrectable RAS error, the RAS ISR gets triggered on all GPU nodes, creating a race condition between querying the RAS errors and entering the GPU reset sequence.