On Mon, 6 May 2019 at 17:34, Koenig, Christian <Christian.Koenig@xxxxxxx> wrote:
>
> That won't work. The kernel can't wait for spawned processes to finish
> because it is holding locks.
>
> The script could as last operation trigger a manual reset, but that
> would not be the same as a timeout reset because you don't know the
> cause of it and would always need to do a full engine reset.
>
> Sorry, but what you are suggesting here (collect data and then reset) is
> not easily doable.
>

I understand, but I really like how this is implemented in the Intel
driver. For example, after a GPU hang all the debug data is available at
/sys/class/drm/card0/error:

[ 512.296756] i915 0000:00:02.0: GPU HANG: ecode 7:1:0xfffffffe, in gnome-shell [1753], hang on rcs0
[ 512.296761] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 512.296762] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 512.296763] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 512.296764] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 512.296766] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 512.296875] i915 0000:00:02.0: Resetting chip for hang on rcs0
[ 563.280960] i915 0000:00:02.0: Resetting chip for hang on rcs0
[ 571.281666] i915 0000:00:02.0: Resetting chip for hang on rcs0

--
Best Regards,
Mike Gavrilov.
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
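
As a rough illustration of the i915 workflow described in this message, the sketch below copies the crash dump from the sysfs path named in the log to a timestamped file so it can be attached to a bug report. The function name `save_gpu_error_state`, the output file naming, and the choice to treat a missing or empty file as "no dump" are assumptions made for illustration only. Reading the file usually requires root, and when no hang has been recorded the file may contain a short placeholder message rather than being empty, which this sketch does not detect. It says nothing about what amdgpu does or could provide.

```python
import time
from pathlib import Path
from typing import Optional

# Path quoted in the i915 log above; other drivers may not provide it.
I915_ERROR_STATE = Path("/sys/class/drm/card0/error")


def save_gpu_error_state(src: Path = I915_ERROR_STATE,
                         dest_dir: Path = Path(".")) -> Optional[Path]:
    """Copy the crash dump at `src` into `dest_dir` and return the new path.

    Returns None if `src` is missing, unreadable, or contains only whitespace.
    (Hypothetical helper, not part of any driver or tool.)
    """
    try:
        data = src.read_bytes()
    except OSError:
        # Covers a missing file as well as insufficient permissions.
        return None
    if not data.strip():
        return None
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out = dest_dir / f"gpu-error-{stamp}.txt"
    out.write_bytes(data)
    return out


if __name__ == "__main__":
    saved = save_gpu_error_state()
    print(saved if saved else "no crash dump found")
```

The file is read with a plain read rather than a copy helper, because sysfs attributes generally report a size that does not match their contents.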