On 18/11/2016 13:33, Chris Wilson wrote:
On Fri, Nov 18, 2016 at 03:03:21PM +0200, Mika Kuoppala wrote:
Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> writes:
On 18/11/2016 09:37, Chris Wilson wrote:
Since the submit/execute split in commit d55ac5bf97c6 ("drm/i915: Defer
transfer onto execution timeline to actual hw submission") the
global seqno advance was deferred until the submit_request callback.
After wedging the GPU, we were installing a nop_submit_request handler
(to avoid waking up the dead hw) but I had missed converting this over
to the new scheme. Under the new scheme, we have to explicitly call
i915_gem_request_submit() from the submit_request handler to mark the
request as on the hardware. If we don't, the request is always pending,
and any waiter will continue to wait indefinitely and hangcheck will not
be able to resolve the lockup.
References: https://bugs.freedesktop.org/show_bug.cgi?id=98748
Testcase: igt/gem_eio/in-flight
Fixes: d55ac5bf97c6 ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_gem.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7b9f5b99b0f3..7037a8b26903 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2762,6 +2762,8 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
static void nop_submit_request(struct drm_i915_gem_request *request)
{
+ i915_gem_request_submit(request);
+ intel_engine_init_global_seqno(request->engine, request->global_seqno);
Slight deja vu but not sure - we don't have a way of marking these as
failed, so what happens in practice here? This is at the point of no
return, no replay, or allowing the context to recover or something?
This is past the point of no return. We failed to reset the hardware, so
we need to catch all the inflight requests and signal them. Treating
them in flight is nasty as they are a part of a giant web of
dependencies, so I wanted to have them just complete quietly as they
became ready (to be careful that we don't start dependent third party
work before other third party work finishes).
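To illustrate the behaviour described above, here is a minimal, self-contained
sketch in plain C. It is not the i915 code: the toy_* types and functions are
hypothetical stand-ins that only loosely model i915_gem_request_submit() and
intel_engine_init_global_seqno(). The assumption is that a request counts as
completed once it has a global seqno and the engine's completed seqno has
reached it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real i915 structures. */
struct toy_engine {
	uint32_t next_seqno;	/* last seqno handed out on the execution timeline */
	uint32_t hw_seqno;	/* last seqno the "hardware" reports as completed */
};

struct toy_request {
	struct toy_engine *engine;
	uint32_t global_seqno;	/* 0 while the request is still pending */
};

/* Loosely models i915_gem_request_submit(): assign a global seqno. */
void toy_request_submit(struct toy_request *rq)
{
	assert(rq && rq->engine);
	rq->global_seqno = ++rq->engine->next_seqno;
}

/* Loosely models intel_engine_init_global_seqno(): force the completed seqno. */
void toy_engine_init_global_seqno(struct toy_engine *engine, uint32_t seqno)
{
	engine->hw_seqno = seqno;
}

/* A waiter only makes progress once this returns true. */
bool toy_request_completed(const struct toy_request *rq)
{
	return rq->global_seqno != 0 &&
	       (int32_t)(rq->engine->hw_seqno - rq->global_seqno) >= 0;
}

/* Handler before the fix: the request never leaves the pending state. */
void toy_nop_submit_broken(struct toy_request *rq)
{
	(void)rq;
}

/* Handler after the fix: mark the request submitted, then complete it. */
void toy_nop_submit_fixed(struct toy_request *rq)
{
	toy_request_submit(rq);
	toy_engine_init_global_seqno(rq->engine, rq->global_seqno);
}
```

With the broken handler, toy_request_completed() never becomes true, which
corresponds to the waiter hanging indefinitely. With the fixed handler, each
request completes quietly at the moment it is submitted, in submission order.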
Right, I wanted to confirm if I was reading it correctly. In this case:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx