Currently, the UMD has little information about what went wrong during a GPU
reset. To help with that, this patch proposes a new IOCTL that can be used to
query information about the resources that caused the hang.

The goal of this RFC is to gather feedback about this interface. The Mesa part
can be found at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22785

The current implementation is racy: if two resets happen (even on different
rings), the app will get the information for the most recent reset rather than
for the one it is looking for. Maybe this can be fixed with a ring_id parameter
to query the information for a specific ring, but that also requires an
interface to tell the UMD which ring caused the reset.

I know that devcoredump is also used for this kind of information, but I
believe that an IOCTL is a better interface between Mesa and Linux than parsing
a file whose contents are subject to change.

André Almeida (1):
  drm/amdgpu: Add interface to dump guilty IB on GPU hang

 drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c  |  7 ++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 29 ++++++++++++++++++++++++
 include/uapi/drm/amdgpu_drm.h            |  7 ++++++
 7 files changed, 52 insertions(+), 1 deletion(-)

--
2.40.1