Re: [PATCH] drm/i915: Do not add an interrupt for a context switch

Daniel Vetter <daniel@xxxxxxxx> · Thu, 8 Aug 2013 01:18:18 +0200

On Thu, Aug 08, 2013 at 12:10:22AM +0100, Chris Wilson wrote:
> On Wed, Aug 07, 2013 at 11:58:21PM +0200, Daniel Vetter wrote:
> > On Wed, Aug 07, 2013 at 10:29:05PM +0100, Chris Wilson wrote:
> > > We use the request to ensure we hold a reference to the context for the
> > > duration that it remains in use by the ring. Each request only holds a
> > > reference to the current context, hence we emit a request after
> > > switching contexts with the final reference to the old context. However,
> > > the extra interrupt caused by that request is not useful (no timing
> > > critical function will wait for the context object), instead the overhead
> > > of servicing the IRQ shows up in some (lightweight) benchmarks. In order
> > > to keep the useful property of using the request to manage the context
> > > lifetime, we want to add a dummy request that is associated with the
> > > interrupt from the subsequent real request following the batch.
> > > 
> > > The extra interrupt was added as a side-effect of using
> > > i915_add_request() in
> > > 
> > > commit 112522f6789581824903f6f72082b5b841a7f0f9
> > > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > Date:   Thu May 2 16:48:07 2013 +0300
> > > 
> > >     drm/i915: put context upon switching
> > > 
> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > Cc: Ben Widawsky <ben@xxxxxxxxxxxx>
> > > Cc: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h         |  1 +
> > >  drivers/gpu/drm/i915/i915_gem.c         | 32 +++++++++++++++++++++++++++++++-
> > >  drivers/gpu/drm/i915/i915_gem_context.c |  2 +-
> > >  3 files changed, 33 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 9a8fac3..e822390 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1873,6 +1873,7 @@ int __i915_add_request(struct intel_ring_buffer *ring,
> > >  		       struct drm_file *file,
> > >  		       struct drm_i915_gem_object *batch_obj,
> > >  		       u32 *seqno);
> > > +int __i915_add_request_marker(struct intel_ring_buffer *ring);
> > >  #define i915_add_request(ring, seqno) \
> > >  	__i915_add_request(ring, NULL, NULL, seqno)
> > >  int __must_check i915_wait_seqno(struct intel_ring_buffer *ring,
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 0fa45e0..7568bf0 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2110,6 +2110,37 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno)
> > >  	return 0;
> > >  }
> > >  
> > > +int
> > > +__i915_add_request_marker(struct intel_ring_buffer *ring)
> > > +{
> > > +	struct drm_i915_gem_request *request;
> > > +
> > > +	request = kmalloc(sizeof(*request), GFP_KERNEL);
> > > +	if (request == NULL)
> > > +		return -ENOMEM;
> > > +
> > > +	request->ring = ring;
> > > +	request->seqno = intel_ring_get_seqno(ring);
> > > +	request->tail = request->head = intel_ring_get_tail(ring);
> > > +	request->batch_obj = NULL;
> > > +
> > > +	/* Whilst this request exists, batch_obj will be on the
> > > +	 * active_list, and so will hold the active reference. Only when this
> > > +	 * request is retired will the the batch_obj be moved onto the
> > > +	 * inactive_list and lose its active reference. Hence we do not need
> > > +	 * to explicitly hold another reference here.
> > > +	 */
> > > +	request->ctx = ring->last_context;
> > > +	if (request->ctx)
> > > +		i915_gem_context_reference(request->ctx);
> > > +
> > > +	request->emitted_jiffies = jiffies;
> > > +	list_add_tail(&request->list, &ring->request_list);
> > > +	request->file_priv = NULL;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  int __i915_add_request(struct intel_ring_buffer *ring,
> > >  		       struct drm_file *file,
> > >  		       struct drm_i915_gem_object *obj,
> > > @@ -2137,7 +2168,6 @@ int __i915_add_request(struct intel_ring_buffer *ring,
> > >  	if (request == NULL)
> > >  		return -ENOMEM;
> > >  
> > > -
> > >  	/* Record the position of the start of the request so that
> > >  	 * should we detect the updated seqno part-way through the
> > >  	 * GPU processing the request, we never over-estimate the
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > > index db94aca..88a9425 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > > @@ -466,7 +466,7 @@ static int do_switch(struct i915_hw_context *to)
> > >  		from->obj->dirty = 1;
> > >  		BUG_ON(from->obj->ring != ring);
> > >  
> > > -		ret = i915_add_request(ring, NULL);
> > > +		ret = __i915_add_request_marker(ring);
> > 
> > Can't we just ditch the add_request here? The actual backing storage of
> > the old context won't get reaped too early (due to the oustanding lazy
> > request logic) and as long as we don't schedule a batch we don't care one
> > bit that there's no request with a context linked to it. Only once we emit
> > batches we care, since otherwise the hangcheck code can't assign the
> > blame.
> 
> The request holds the last reference to the context->obj, and each
> request only has a single context slot.  The olr and next batch only
> preserves ring->last_context which is now the new context. So we insert
> this dummy context into the request queue for the sole purpose of
> holding the reference until the real request (and new context) adds the
> breadcrumb.

Afaict the request holds a ref on the context, and that a ref on the hw
context bo. But the active list itself (thanks to the move_to_active)
should also hold a ref to the hw context bo, so I don't think we really
need full request here. The old context might disappear, but no one really
cares if it disappears a notch too early since the backing storage will
survive long enough.

And the new context won't go away since the ring itself also holds a ref.
So with just a bit of code-reading I don't see a problem, but I also
havent' tried it.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx