On 26/11/2014 14:42, Daniel Vetter wrote:
On Wed, Nov 26, 2014 at 3:12 PM, John Harrison
<John.C.Harrison@xxxxxxxxx> wrote:
diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
index 8bfdac6..831fae2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3130,4 +3130,11 @@ wait_remaining_ms_from_jiffies(unsigned long
timestamp_jiffies, int to_wait_ms)
}
}
+static inline void i915_trace_irq_get(struct intel_engine_cs *ring,
+ struct drm_i915_gem_request *req)
+{
+ if (ring->trace_irq_req == NULL && ring->irq_get(ring))
+ i915_gem_request_assign(&ring->trace_irq_req, req);
This looks a bit suspiciuos. Thus far ring->trace_irq_req was essentially
ot protected at all by anything. But that was nothing troublesome since we
didn't hang a real resource of it.
But now there's a refcounted request in that pointer, which means if we
race we leak. I'll skip this patch for now.
-Daniel
Race how? The assignment only ever occurs from inside execbuffer submission
at which point the driver mutex lock is held. Therefore it is very
definitely protected. Not doing the reference count means that there is now
the possibility of a dangling pointer and thus the possibility of going bang
with a kernel oops.
Hm, ->trace_irq_seqno is indeed always touched from the a calling
context with dev->struct_mutex held. Somehow I've misrembered that
since the realtime/tracing folks are really freaked out about what
we're doing here. But from that pov your patch doesn't really make
things worse, so I'll pull it in.
Btw I don't see the oops really without this patch. What would blow up?
-Daniel
The sole access (and clear to null) of the trace pointer is done from
retire requests after the requests have been retired. Thus the request
structure could have just been freed immediately before it is used. The
code could be re-ordered to be safer but I'm not entirely sure what the
trace pointer is for or what it might potentially be used for in the
future. With the reference counting, the ordering is irrelevant. If the
pointer exists then it is safe to use.
The point is that anywhere that keeps a copy of a request pointer really
should reference count that copy. Otherwise there is the possibility
that the pointer could become stale. Either now or with future code
changes. If the copy is always done with the request_assign() function
then the pointer is guaranteed safe for all time.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx