On 04/24/2018 12:14 PM, Eric W. Biederman wrote:
> Andrey Grodzovsky <andrey.grodzovsky at amd.com> writes:
>
>> If the ring is hanging for some reason allow to recover the waiting
>> by sending fatal signal.
>>
>> Originally-by: David Panariti <David.Panariti at amd.com>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 14 ++++++++++----
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index eb80edf..37a36af 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -421,10 +421,16 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx, unsigned ring_id)
>>
>>  	if (other) {
>>  		signed long r;
>> -		r = dma_fence_wait_timeout(other, false, MAX_SCHEDULE_TIMEOUT);
>> -		if (r < 0) {
>> -			DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> -			return r;
>> +
>> +		while (true) {
>> +			if ((r = dma_fence_wait_timeout(other, true,
>> +					MAX_SCHEDULE_TIMEOUT)) >= 0)
>> +				return 0;
>> +
>> +			if (fatal_signal_pending(current)) {
>> +				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
>> +				return r;
>> +			}
> It looks like if you make this code say:
> 			if (fatal_signal_pending(current) ||
> 			    (current->flags & PF_EXITING)) {
> 				DRM_ERROR("Error (%ld) waiting for fence!\n", r);
> 				return r;
>>  		}
>>  	}
> Then you would not need the horrible hack want_signal to deliver signals
> to processes who have passed exit_signal() and don't expect to need
> their signal handling mechanisms anymore.

Let me clarify: the change in want_signal wasn't addressing this code
but the hang in drm_sched_entity_do_release->wait_event_killable, which
happens when you try to terminate gracefully by waiting for all job
completions on the GPU pipe your process is using.

Andrey

>
> Eric
>
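
For reference, the control flow under discussion (retry an interruptible
wait, and give up only when a fatal signal is pending or the task is
already exiting) can be sketched in plain userspace C. This is only a
mock-up under stated assumptions: mock_task, mock_fence, mock_fence_wait
and wait_prev_fence are hypothetical names, not kernel APIs, and -EINTR
stands in for whatever negative value the real interruptible wait
returns.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
struct mock_task {
	bool fatal_signal_pending;	/* models fatal_signal_pending(current) */
	bool exiting;			/* models current->flags & PF_EXITING */
};

struct mock_fence {
	int interruptions_left;		/* waits that get interrupted before the fence signals */
};

/*
 * Models an interruptible fence wait: returns a negative errno while the
 * wait keeps getting interrupted, and a positive value once the fence
 * has signalled.
 */
static long mock_fence_wait(struct mock_fence *f)
{
	if (f->interruptions_left > 0) {
		f->interruptions_left--;
		return -EINTR;
	}
	return 1;
}

/*
 * Retry the wait until the fence signals. Bail out with the error only
 * if the task has a fatal signal pending or is already exiting, which
 * mirrors the condition suggested in the review above.
 */
static long wait_prev_fence(struct mock_fence *f, const struct mock_task *t)
{
	long r;

	for (;;) {
		r = mock_fence_wait(f);
		if (r >= 0)
			return 0;

		if (t->fatal_signal_pending || t->exiting)
			return r;
	}
}
```

In this mock-up a task with neither flag set keeps retrying until the
fence signals, while a task with either flag set returns the error from
the first interrupted wait.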