On 2022-03-17 12:04, Christian König wrote:
On 17.03.22 at 16:10, Rob Clark wrote:
[SNIP]
userspace frozen != kthread frozen .. that is what this patch is
trying to address, so we aren't racing between shutting down the hw
and the scheduler shoveling more jobs at us.
Well, exactly that's the problem. The scheduler is supposed to keep
shoveling more jobs at us until it is empty.
Thinking more about it, we will then keep some dma_fence instances
unsignaled, and that is an extremely bad idea since it can lead to
deadlocks during suspend.
So this patch here is an absolutely clear NAK from my side. If amdgpu is
doing something similar, that is a severe bug and needs to be addressed
somehow.
From looking at the latest amd-staging-drm-next, we only use
kthread_park directly in the debugfs IB hooks.
For S3 suspend (amdgpu_pmops_suspend), we only flush all the HW
fences (amdgpu_fence_wait_empty), so we don't freeze the scheduler thread
and don't flush scheduler entities.
Andrey
Regards,
Christian.
BR,
-R