Yeah, and that's exactly the problem :) You need a global lock covering all schedulers. Otherwise you end up in hell's kitchen again, taking all those locks in the right order.

Christian.

On 30.10.19 at 15:56, Grodzovsky, Andrey wrote:
> Can you elaborate on what the tricky part with the lock is? I assumed
> we would just use a per-scheduler lock.
>
> Andrey
>
> On 10/30/19 10:50 AM, Christian König wrote:
>> A lock inside the scheduler is rather tricky to implement.
>>
>> What you need to do is get rid of the park()/unpark() hack in
>> drm_sched_entity_fini().
>>
>> We could do this with a struct completion, or by converting the
>> scheduler from a thread to a work item.
>>
>> Regards,
>> Christian.
>>
>> On 30.10.19 at 15:44, Grodzovsky, Andrey wrote:
>>> That's good as proof of the RCA, but I still think we should grab a
>>> dedicated lock inside the scheduler: the race is internal to scheduler
>>> code, so it is better to handle it inside the scheduler code so the
>>> fix applies to all drivers using it.
>>>
>>> Andrey
>>>
>>> On 10/30/19 4:44 AM, S, Shirish wrote:
>>>>>>> We still have it, and doesn't doing kthread_park()/unpark() from
>>>>>>> drm_sched_entity_fini() while a GPU reset is in progress defeat
>>>>>>> the whole purpose of drm_sched_stop->kthread_park? If
>>>>>>> drm_sched_entity_fini->kthread_unpark happens AFTER
>>>>>>> drm_sched_stop->kthread_park, nothing prevents another (third)
>>>>>>> thread from continuing to submit jobs, which the unparked
>>>>>>> scheduler thread will pick up and try to submit to HW, only to
>>>>>>> fail because the HW ring is deactivated.
>>>>>>>
>>>>>>> If so, maybe we should serialize calls to
>>>>>>> kthread_park/unpark(sched->thread)?
>>>>>>>
>>>>>> Yeah, that was my thinking as well. Probably best to just grab the
>>>>>> reset lock before calling drm_sched_entity_fini().
>>>>> Shirish - please try locking &adev->lock_reset around calls to
>>>>> drm_sched_entity_fini() as Christian suggests and see if this
>>>>> actually helps the issue.
>>>>>
>>>> Yes, that also works.
>>>>
>>>> Regards,
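
For reference, a minimal sketch of the workaround confirmed above, assuming the driver's reset lock is adev->lock_reset as named in this thread; the helper name and call site are hypothetical, and field names may differ between kernel versions:

	/*
	 * Sketch only, not a tested patch: take the device-wide reset lock
	 * around entity teardown so the kthread_park()/unpark() done inside
	 * drm_sched_entity_fini() cannot race with drm_sched_stop()/
	 * drm_sched_start() during a GPU reset. 'adev' and the helper are
	 * assumptions for illustration.
	 */
	#include <drm/gpu_scheduler.h>
	#include "amdgpu.h"

	static void amdgpu_entity_fini_locked(struct amdgpu_device *adev,
					      struct drm_sched_entity *entity)
	{
		mutex_lock(&adev->lock_reset);	/* exclude concurrent GPU reset */
		drm_sched_entity_fini(entity);	/* parks/unparks sched->thread internally */
		mutex_unlock(&adev->lock_reset);
	}

This only serializes the teardown path against reset for one driver; the global-lock concern Christian raises at the top of the thread is about doing the same generically for all schedulers inside the scheduler code itself.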