Re: [PATCH] drm/amdgpu: guard ib scheduling while in reset

A lock inside the scheduler is rather tricky to implement.

What you need to do is to get rid of the park()/unpark() hack in drm_sched_entity_fini().

We could do this with a struct completion or convert the scheduler from a thread to a work item.
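
A rough userspace sketch of the completion idea, in case it helps the
discussion. It assumes the park()/unpark() pair is only there to make
sure the scheduler is not processing the entity during teardown. The
"entity_idle" field and every model_* name are hypothetical and only
loosely mirror the kernel's struct completion helpers; this is not
the actual scheduler code.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; "entity_idle" and all model_* names are hypothetical. */
struct model_completion {
    bool done;
};

struct model_entity {
    struct model_completion entity_idle; /* set while nobody processes the entity */
    int queued_jobs;
};

static void model_complete_all(struct model_completion *c)
{
    c->done = true;
}

static void model_reinit_completion(struct model_completion *c)
{
    c->done = false;
}

static bool model_completion_done(const struct model_completion *c)
{
    return c->done;
}

static void model_entity_init(struct model_entity *e, int jobs)
{
    assert(jobs >= 0);
    e->queued_jobs = jobs;
    model_complete_all(&e->entity_idle); /* idle until the scheduler picks it */
}

/* Scheduler side: the entity was selected and is now being processed. */
static void model_sched_select(struct model_entity *e)
{
    model_reinit_completion(&e->entity_idle);
}

/* Scheduler side: one job was handled and the entity is idle again. */
static void model_sched_release(struct model_entity *e)
{
    if (e->queued_jobs > 0)
        e->queued_jobs--;
    model_complete_all(&e->entity_idle);
}

/*
 * Teardown side: proceed only once the entity is idle. Kernel code would
 * sleep in wait_for_completion() here instead of returning false, and it
 * would never touch the scheduler thread's parked state.
 */
static bool model_entity_fini(struct model_entity *e)
{
    if (!model_completion_done(&e->entity_idle))
        return false;
    e->queued_jobs = 0; /* models killing the remaining jobs */
    return true;
}
```

The point of the sketch is that teardown waits on per-entity state, so
it cannot undo a kthread_park() issued by the reset path.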

Regards,
Christian.

On 30.10.19 at 15:44, Grodzovsky, Andrey wrote:
That's good as proof of the RCA, but I still think we should grab a
dedicated lock inside the scheduler. The race is internal to the
scheduler code, so it is better to handle it there and have the fix
apply to all drivers using it.

Andrey

On 10/30/19 4:44 AM, S, Shirish wrote:
We still have it, and doesn't doing kthread_park()/unpark() from
drm_sched_entity_fini() while a GPU reset is in progress defeat the
whole purpose of drm_sched_stop->kthread_park? If
drm_sched_entity_fini->kthread_unpark happens AFTER
drm_sched_stop->kthread_park, nothing prevents another (third)
thread from continuing to submit jobs to the HW. Those jobs will be
picked up by the unparked scheduler thread, which will try to submit
them to the HW but fail because the HW ring is deactivated.

If so, maybe we should serialize calls to
kthread_park/unpark(sched->thread)?

Yeah, that was my thinking as well. Probably best to just grab the
reset lock before calling drm_sched_entity_fini().

Shirish - please try locking &adev->lock_reset around calls to
drm_sched_entity_fini as Christian suggests and see if this actually
helps the issue.
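
For illustration, a minimal userspace model of the race and of the
suggested locking. It assumes the reset path holds the reset lock for
the whole time the scheduler is parked. The atomic_flag stands in for
adev->lock_reset, the bool for the parked state of sched->thread, and
all model_* names are hypothetical; this is not the amdgpu code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only: none of these are the real kernel types. */
struct model_dev {
    atomic_flag lock_reset; /* stands in for adev->lock_reset */
    bool sched_parked;      /* stands in for the parked state of sched->thread */
};

static void model_init(struct model_dev *d)
{
    assert(d);
    atomic_flag_clear(&d->lock_reset);
    d->sched_parked = false;
}

/* Non-blocking stand-in for taking the lock: true if we got it. */
static bool model_trylock(struct model_dev *d)
{
    return !atomic_flag_test_and_set(&d->lock_reset);
}

static void model_unlock(struct model_dev *d)
{
    atomic_flag_clear(&d->lock_reset);
}

/* Reset path: take the reset lock, then park the scheduler (drm_sched_stop). */
static bool model_reset_begin(struct model_dev *d)
{
    if (!model_trylock(d))
        return false;
    d->sched_parked = true;
    return true;
}

/* Reset path: unpark the scheduler again, then drop the reset lock. */
static void model_reset_end(struct model_dev *d)
{
    d->sched_parked = false;
    model_unlock(d);
}

/* The park()/unpark() pair from drm_sched_entity_fini(), with no locking. */
static void model_entity_fini_unlocked(struct model_dev *d)
{
    d->sched_parked = true;
    d->sched_parked = false; /* unparks even if a reset is in progress */
}

/*
 * Same pair, but only while holding the reset lock. Returns false when a
 * reset holds the lock; real code would sleep in mutex_lock() instead.
 */
static bool model_entity_fini_locked(struct model_dev *d)
{
    if (!model_trylock(d))
        return false;
    model_entity_fini_unlocked(d);
    model_unlock(d);
    return true;
}
```

Calling model_entity_fini_unlocked() between model_reset_begin() and
model_reset_end() leaves the scheduler unparked in the middle of the
reset, which is the situation described above. The locked variant
cannot run until the reset has dropped the lock.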

Yes that also works.

Regards,

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
