On 2/13/19 2:21 PM, Grodzovsky, Andrey wrote: > > On 2/13/19 2:16 PM, Kazlauskas, Nicholas wrote: >> On 2/13/19 2:10 PM, Grodzovsky, Andrey wrote: >>> On 2/13/19 2:00 PM, Kazlauskas, Nicholas wrote: >>>> On 2/13/19 1:58 PM, Andrey Grodzovsky wrote: >>>>> When ring hang happens amdgpu_dm_commit_planes during flip is holding >>>>> the BO reserved and then stack waiting for fences to signal in >>>>> reservation_object_wait_timeout_rcu (which won't signal because there >>>>> was a hnag). Then when we try to shutdown display block during reset >>>>> recovery from drm_atomic_helper_suspend we also try to reserve the BO >>>>> from dm_plane_helper_cleanup_fb ending in deadlock. >>>>> Also remove useless WARN_ON >>>>> >>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> >>>>> --- >>>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 19 +++++++++++++------ >>>>> 1 file changed, 13 insertions(+), 6 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>> index acc4ff8..f8dec36 100644 >>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>> @@ -4802,14 +4802,21 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state, >>>>> */ >>>>> abo = gem_to_amdgpu_bo(fb->obj[0]); >>>>> r = amdgpu_bo_reserve(abo, true); >>>>> - if (unlikely(r != 0)) { >>>>> + if (unlikely(r != 0)) >>>>> DRM_ERROR("failed to reserve buffer before flip\n"); >>>>> - WARN_ON(1); >>>>> - } >>>>> >>>>> - /* Wait for all fences on this FB */ >>>>> - WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false, >>>>> - MAX_SCHEDULE_TIMEOUT) < 0); >>>>> + /* >>>>> + * Wait for all fences on this FB. Do limited wait to avoid >>>>> + * deadlock during GPU reset when this fence will not signal >>>>> + * but we hold reservation lock for the BO. >>>>> + */ >>>>> + r = reservation_object_wait_timeout_rcu(abo->tbo.resv, >>>>> + true, false, >>>>> + msecs_to_jiffies(5000)); >>>>> + if (unlikely(r == 0)) >>>>> + DRM_ERROR("Waiting for fences timed out."); >>>>> + >>>>> + >>>>> >>>>> amdgpu_bo_get_tiling_flags(abo, &tiling_flags); >>>>> >>>>> >>>> Is it safe that we're just continuing like this? It's probably better to >>>> just unreserve the buffer and continue to the next plane, no? >>>> >>>> Nicholas Kazlauskas >>> As far as I see it should be safe as you are simply flipping to a buffer >>> for which rendering hasn't finished (or stack actually in this case) so >>> you might see visual corruption but that the least of your problems if >>> after 5s the BO still not finalized for presentation, the system is >>> already probably in very bad shape. Also, in case we do want to do >>> error handling we should also take care of amdgpu_bo_reserve failure >>> just before that. >>> >>> Andrey >>> >>> >> Yeah, I guess this whole blocks needs to be cleaned up in that case. >> This is a good first step at least. Technically >> reservation_object_wait_timeout_rcu will return < 0 when it's been >> interrupted too as an error code but I guess that will just be silently >> ignored here. >> >> If you want you can change the condition to: >> >> if (unlikely(r >= 0)) >> DRM_ERROR("Waiting for FB fence failed: id=%d res=%d\n", >> plane->id, r); > > > Note that reservation_object_wait_timeout_rcu has a flag 'bool intr: if > true, do interruptible wait', we set it to false since the code in case > of flip runs form the kernel worker thread and not from IOCTL meaning we > are not in user mode context and hence are not going to recieve user > signals (cannot be interrupted). So the only values we can recieve here > are either 0 for time out or val > 0 for wait that completed before time > out value. > > Andrey Oh, right. This patch is good as-is then. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx> > >> >> But with or without that change this patch is: >> >> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx> > _______________________________________________ > amd-gfx mailing list > amd-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx