Dear Dennis,
On 10.07.20 at 10:39, Li, Dennis wrote:
> I used our internal tool to make the GPU hang and to run stress tests.
Interesting. I want to have such a tool. ;-)
So you noticed it during testing with that tool, and not because
somebody experienced this in production?
> In the kernel, when the GPU hangs, the driver has multiple paths into
> amdgpu_device_gpu_recover; the atomic adev->in_gpu_reset is used
> to avoid re-entering GPU recovery. During GPU reset and resume, it
> is unsafe for other threads to access the GPU, which may cause the
> GPU reset to fail. Therefore the new rw_semaphore adev->reset_sem is
> introduced, which protects the GPU from being accessed by external
> threads during recovery.
Thank you for the explanation. It’d be great if you added this to the
commit message.
Kind regards,
Paul
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx