On Wed, Nov 26, 2014 at 02:58:49PM +0000, John Harrison wrote: > On 26/11/2014 14:42, Daniel Vetter wrote: > >On Wed, Nov 26, 2014 at 3:12 PM, John Harrison > ><John.C.Harrison@xxxxxxxxx> wrote: > >>>>diff --git a/drivers/gpu/drm/i915/i915_drv.h > >>>>b/drivers/gpu/drm/i915/i915_drv.h > >>>>index 8bfdac6..831fae2 100644 > >>>>--- a/drivers/gpu/drm/i915/i915_drv.h > >>>>+++ b/drivers/gpu/drm/i915/i915_drv.h > >>>>@@ -3130,4 +3130,11 @@ wait_remaining_ms_from_jiffies(unsigned long > >>>>timestamp_jiffies, int to_wait_ms) > >>>> } > >>>> } > >>>> +static inline void i915_trace_irq_get(struct intel_engine_cs *ring, > >>>>+ struct drm_i915_gem_request *req) > >>>>+{ > >>>>+ if (ring->trace_irq_req == NULL && ring->irq_get(ring)) > >>>>+ i915_gem_request_assign(&ring->trace_irq_req, req); > >>>This looks a bit suspiciuos. Thus far ring->trace_irq_req was essentially > >>>ot protected at all by anything. But that was nothing troublesome since we > >>>didn't hang a real resource of it. > >>> > >>>But now there's a refcounted request in that pointer, which means if we > >>>race we leak. I'll skip this patch for now. > >>>-Daniel > >> > >>Race how? The assignment only ever occurs from inside execbuffer submission > >>at which point the driver mutex lock is held. Therefore it is very > >>definitely protected. Not doing the reference count means that there is now > >>the possibility of a dangling pointer and thus the possibility of going bang > >>with a kernel oops. > >Hm, ->trace_irq_seqno is indeed always touched from the a calling > >context with dev->struct_mutex held. Somehow I've misrembered that > >since the realtime/tracing folks are really freaked out about what > >we're doing here. But from that pov your patch doesn't really make > >things worse, so I'll pull it in. > > > >Btw I don't see the oops really without this patch. What would blow up? > >-Daniel > > The sole access (and clear to null) of the trace pointer is done from retire > requests after the requests have been retired. Thus the request structure > could have just been freed immediately before it is used. The code could be > re-ordered to be safer but I'm not entirely sure what the trace pointer is > for or what it might potentially be used for in the future. With the > reference counting, the ordering is irrelevant. If the pointer exists then > it is safe to use. > > The point is that anywhere that keeps a copy of a request pointer really > should reference count that copy. Otherwise there is the possibility that > the pointer could become stale. Either now or with future code changes. If > the copy is always done with the request_assign() function then the pointer > is guaranteed safe for all time. Oh, I guess you've misunderstood what I've done. Ive dropped the entire patch here, not just the refcounting. Dropping the refcounting alone is obviously oops-worthy. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx