On 4/12/22 22:40, Andrey Grodzovsky wrote:
>
> On 2022-04-12 14:20, Dmitry Osipenko wrote:
>> On 4/12/22 19:51, Andrey Grodzovsky wrote:
>>> On 2022-04-11 18:15, Dmitry Osipenko wrote:
>>>> Interrupt context can't sleep. Drivers like Panfrost and MSM take a
>>>> mutex when a job is released, and thus, that code can sleep. This
>>>> results in "BUG: scheduling while atomic" if the locks are contended
>>>> while a job is freed. There is no good reason for releasing the
>>>> scheduler's jobs in IRQ context, hence use normal context to fix the
>>>> trouble.
>>>
>>> I am not sure it is the best idea to leave the job's sw fence
>>> signalling to be executed in system_wq context, which is prone to
>>> delays from the various work items executing from around the system.
>>> It seems better to me to leave the fence signalling within the IRQ
>>> context and offload only the job freeing, or maybe handle the
>>> rescheduling to thread context within the drivers' implementation of
>>> the .free_job callback. Not really sure which is better.
>>
>> We're talking here about killing jobs when the driver destroys a
>> context, which doesn't feel like it needs to be a fast path. I could
>> move the signalling into drm_sched_entity_kill_jobs_cb() and use an
>> unbound wq, but do we really need this for a slow path?
>
> You can't move the signalling back to drm_sched_entity_kill_jobs_cb()
> since this will bring back the lockdep splat that 'drm/sched: Avoid
> lockdep spalt on killing a processes' was fixing.

Indeed.

> I see your point and I guess we can go this way too. Another way would
> be to add a work_item to the panfrost and msm jobs and reschedule to
> thread context from within their .free_job callbacks, but that's
> probably too cumbersome to be justified here.

Yes, there is no clear justification for doing that.

> Andrey
>
> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>

Thank you!
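For reference, the per-driver alternative mentioned above (embedding a work_item in the job and rescheduling to thread context from the .free_job callback) could be sketched roughly as below. This is only an illustration of the pattern, not actual Panfrost/MSM code: `my_job`, `my_job_cleanup()` and the surrounding names are hypothetical; only the kernel workqueue API (`INIT_WORK`/`schedule_work`) and `container_of` are real.

```c
/* Hypothetical sketch: deferring job destruction from the scheduler's
 * .free_job callback (which may be invoked from atomic context) to
 * process context via a per-job work item. Everything except the
 * workqueue API is made up for illustration.
 */
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct my_job {
	struct drm_sched_job base;
	struct work_struct free_work;	/* defers freeing to a worker */
};

static void my_job_free_work(struct work_struct *work)
{
	struct my_job *job = container_of(work, struct my_job, free_work);

	/* Process context: taking mutexes / sleeping is safe here. */
	my_job_cleanup(job);	/* hypothetical helper that may sleep */
	kfree(job);
}

static void my_sched_free_job(struct drm_sched_job *sched_job)
{
	struct my_job *job = container_of(sched_job, struct my_job, base);

	/* Possibly atomic context: just punt the actual free to a wq. */
	INIT_WORK(&job->free_work, my_job_free_work);
	schedule_work(&job->free_work);
}
```

As the thread concludes, duplicating this boilerplate in every affected driver is harder to justify than doing the deferral once in the scheduler core.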