Re: [PATCH 46/55] drm/i915: Update intel_ring_begin() to take a request structure

Daniel Vetter <daniel@xxxxxxxx> · Tue, 23 Jun 2015 17:34:56 +0200

On Tue, Jun 23, 2015 at 04:27:45PM +0100, John Harrison wrote:
> On 23/06/2015 14:25, Daniel Vetter wrote:
> >On Tue, Jun 23, 2015 at 11:37:53AM +0100, John Harrison wrote:
> >>On 23/06/2015 11:24, Chris Wilson wrote:
> >>>On Fri, May 29, 2015 at 05:44:07PM +0100, John.C.Harrison@xxxxxxxxx wrote:
> >>>>-int intel_ring_begin(struct intel_engine_cs *ring,
> >>>>+int intel_ring_begin(struct drm_i915_gem_request *req,
> >>>>  		     int num_dwords)
> >>>>  {
> >>>>-	struct drm_i915_gem_request *req;
> >>>>-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> >>>>+	struct intel_engine_cs *ring;
> >>>>+	struct drm_i915_private *dev_priv;
> >>>>  	int ret;
> >>>>+	WARN_ON(req == NULL);
> >>>>+	ring = req->ring;
> >>>What was the point?
> >>>-Chris
> >>>
> >>The point is to remove the OLR. The significant change within
> >>intel_ring_begin is the next few lines:
> >>
> >>-	/* Preallocate the olr before touching the ring */
> >>-	ret = i915_gem_request_alloc(ring, ring->default_context, &req);
> >>
> >>
> >>That is the part that causes problems by randomly creating a brand new
> >>request that no-one knows about and squirreling it away in the OLR to scoop
> >>up random bits of work. This is the whole point of the entire patch series -
> >>to ensure that all ring work is assigned to a known request by whoever
> >>instigated that work.
> >I think the point was that a WARN_ON(pointer) followed by an immediate
> >deref of said pointer doesn't add value. This is the one case where a
> >BUG_ON is the right joice. Or just let the kernel oops on the NULL deref
> >without warning first that it'll happen.
> >-Daniel
> I thought the edict from yourself was that BUG_ONs should never be used, it
> should always be a WARN_ON and if the driver oopses afterwards then so be
> it. The WARN_ON does add value in that it gives you a line number (and file)
> which in turn gives you exactly what variable was null. Whereas the oops
> just gives you an offset into a function. Unless you happen to have the
> exact same binary that generated the bug report, the chances of identifying
> what was null can be slim.

I might have gone over the top with screaming against WARN_ONs, so let me
clarify: By default use WARN_ON since if there's a decent chance that the
driver can limp along that greatly improves the chances that devs or users
can get useful backtraces out of a dying machine.

But if you can prove that the driver will oops anyway you can add a BUG_ON
to improve debug output for these cases.

Personally I'm on the fence whether instead a if (WARN_ON()) return
-EINVAL is better or not, I leave that to patch authors. But WARN_ON +
Oops is a bit too much.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx