On 30/05/2018 11:55, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2018-05-30 11:43:16)
On 29/05/2018 14:29, Chris Wilson wrote:
Since we use i915_gem_find_active_request() from inside
intel_engine_dump() and may call that at any time, we do not guarantee
that the engine is paused nor that the signal kthreads and irq handler
are suspended, so we cannot assert that the breadcrumb doesn't advance
and that the irq hasn't happened on another CPU signaling the request we
believe to be idle.
Fixes: f636edb214a5 ("drm/i915: Make i915_engine_info pretty printer to standalone")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
---
drivers/gpu/drm/i915/i915_gem.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 05f44ca35a06..530d6d0109b4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2972,23 +2972,22 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
struct i915_request *request, *active = NULL;
unsigned long flags;
- /* We are called by the error capture and reset at a random
- * point in time. In particular, note that neither is crucially
- * ordered with an interrupt. After a hang, the GPU is dead and we
- * assume that no more writes can happen (we waited long enough for
- * all writes that were in transaction to be flushed) - adding an
+ /*
+ * We are called by the error capture, reset and to dump engine
+ * state at random points in time. In particular, note that neither is
+ * crucially ordered with an interrupt. After a hang, the GPU is dead
+ * and we assume that no more writes can happen (we waited long enough
+ * for all writes that were in transaction to be flushed) - adding an
* extra delay for a recent interrupt is pointless. Hence, we do
* not need an engine->irq_seqno_barrier() before the seqno reads.
+ * At all other times, we must assume the GPU is still running, but
+ * we only care about the snapshot of this moment.
*/
spin_lock_irqsave(&engine->timeline.lock, flags);
list_for_each_entry(request, &engine->timeline.requests, link) {
if (__i915_request_completed(request, request->global_seqno))
continue;
- GEM_BUG_ON(request->engine != engine);
Removal of this one belongs to virtual engine perhaps?
Nah, even with virtual engine; the engine timeline list is still true. I
was just thinking it was too random to check here (e.g. in the middle of
error capture) and that our more recent addition of checking during
retirement was a little more rigorous (and timely).
It is a random check indeed. Commit message append: "While at it remove
a random assert on..."
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx