Hi Boris,

On 02.05.23 at 13:19, Boris Brezillon wrote:
Hello Christian, Alex,

As part of our transition to drm_sched for the powervr GPU driver, we realized drm_sched_resubmit_jobs(), which is used by all drivers relying on drm_sched right now except amdgpu, has been deprecated. Unfortunately, commit 5efbe6aa7a0e ("drm/scheduler: deprecate drm_sched_resubmit_jobs") doesn't describe what drivers should do or use as an alternative.

At the very least, for our implementation, we need to restore the drm_sched_job::parent pointers that were set to NULL in drm_sched_stop(), such that jobs submitted before the GPU recovery are considered active when drm_sched_start() is called. That could be done with a custom pending_list iteration restoring drm_sched_job::parent, but it seems odd to let the scheduler backend manipulate this list directly, and I suspect we need to do other checks too, like the karma vs. hang-limit one, so we can flag the entity dirty and cancel all jobs queued there if the entity has caused too many hangs.

Now that drm_sched_resubmit_jobs() has been deprecated, it would be great if you could help us write a piece of documentation describing what should be done between drm_sched_stop() and drm_sched_start(), so new drivers don't come up with their own slightly different/broken version of the same thing.
Yeah, really good point! The solution is to not use drm_sched_stop() and drm_sched_start() either.
The general idea Daniel, the other Intel guys and I seem to have agreed on is to convert the scheduler thread into a work item.
This work item for pushing jobs to the hw can then be queued to the same workqueue we use for the timeout work item.
If your driver configures this workqueue as single-threaded (ordered), you get the guarantee that at most one of the scheduler work item and the timeout work item is running at any given time. That in turn makes stopping/starting the scheduler for a reset completely superfluous.
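Roughly, the idea looks like this; illustration only, not the actual patches, and all the my_* names are made up:

    #include <linux/errno.h>
    #include <linux/workqueue.h>

    /*
     * An ordered workqueue executes at most one of its work items at a
     * time, so the run-job work and the timeout work are mutually
     * exclusive without ever stopping/starting the scheduler.
     */
    struct my_gpu {
    	struct workqueue_struct *sched_wq;
    	struct work_struct run_job_work;
    	struct delayed_work timeout_work;
    };

    static void my_run_job_work(struct work_struct *work)
    {
    	/* Push ready jobs to the hw ring here. */
    }

    static void my_timeout_work(struct work_struct *work)
    {
    	/*
    	 * Handle the timeout / reset the hw here. The run-job work
    	 * cannot run concurrently because the workqueue is ordered.
    	 */
    }

    static int my_gpu_sched_init(struct my_gpu *gpu)
    {
    	gpu->sched_wq = alloc_ordered_workqueue("my-gpu-sched", 0);
    	if (!gpu->sched_wq)
    		return -ENOMEM;

    	INIT_WORK(&gpu->run_job_work, my_run_job_work);
    	INIT_DELAYED_WORK(&gpu->timeout_work, my_timeout_work);

    	return 0;
    }

drm_sched_init() already lets you hand it a workqueue for the timeout work (the timeout_wq argument); turning the job submission into a work item queued on that same workqueue is the part the pending patches add on top.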
Patches for this have already been floating around on the mailing list, but haven't been committed yet, since this is all still WIP.
In general it's not really a good idea to change the scheduler and hw fences during GPU reset/recovery. The dma_fence implementation has pretty strict state-transition rules which clearly say that a dma_fence must never go back from signaled to unsignaled, and when you start messing with the fences during recovery that is exactly what can happen.
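To make the one-way rule concrete, a small driver-independent sketch (the demo_* names are made up): once a fence is signaled it stays signaled, and there is no API to take that back.

    #include <linux/dma-fence.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    /* Minimal fence ops, just enough for the example. */
    static const char *demo_get_driver_name(struct dma_fence *fence)
    {
    	return "demo";
    }

    static const char *demo_get_timeline_name(struct dma_fence *fence)
    {
    	return "demo-timeline";
    }

    static const struct dma_fence_ops demo_fence_ops = {
    	.get_driver_name = demo_get_driver_name,
    	.get_timeline_name = demo_get_timeline_name,
    };

    static DEFINE_SPINLOCK(demo_fence_lock);

    static void demo_fence_is_one_way(void)
    {
    	struct dma_fence *fence = kzalloc(sizeof(*fence), GFP_KERNEL);

    	if (!fence)
    		return;

    	dma_fence_init(fence, &demo_fence_ops, &demo_fence_lock,
    		       dma_fence_context_alloc(1), 1);
    	dma_fence_signal(fence);

    	/*
    	 * From here on dma_fence_is_signaled() stays true forever.
    	 * Re-using or re-arming hw fences across a reset fights
    	 * against exactly this rule.
    	 */
    	WARN_ON(!dma_fence_is_signaled(fence));

    	dma_fence_put(fence);
    }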
What you can do instead is save your hw state and restart from the same location after handling the timeout.
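In a drm_sched timedout_job callback that could look roughly like the sketch below; the my_hw_* helpers and to_my_gpu() are placeholders for whatever your hardware actually needs (re-using the made-up struct my_gpu from above), the exact flow obviously depends on the hw:

    #include <drm/gpu_scheduler.h>

    /* Placeholders for driver/hw specific code, not real functions. */
    int my_hw_save_context(struct my_gpu *gpu);
    int my_hw_reset(struct my_gpu *gpu);
    int my_hw_resume_from_saved_context(struct my_gpu *gpu);
    struct my_gpu *to_my_gpu(struct drm_gpu_scheduler *sched);

    static enum drm_gpu_sched_stat
    my_timedout_job(struct drm_sched_job *sched_job)
    {
    	struct my_gpu *gpu = to_my_gpu(sched_job->sched);

    	/*
    	 * Instead of resubmitting jobs and touching hw fences that may
    	 * already be signaled, save the hw state, reset the hw and
    	 * continue from the saved location. The original hw fences
    	 * then signal normally once the hw catches up again.
    	 */
    	my_hw_save_context(gpu);
    	my_hw_reset(gpu);
    	my_hw_resume_from_saved_context(gpu);

    	return DRM_GPU_SCHED_STAT_NOMINAL;
    }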
Regards, Christian.
Thanks in advance for your help. Regards, Boris