On Wed, Aug 07, 2013 at 10:29:05PM +0100, Chris Wilson wrote: > We use the request to ensure we hold a reference to the context for the > duration that it remains in use by the ring. Each request only holds a > reference to the current context, hence we emit a request after > switching contexts with the final reference to the old context. However, > the extra interrupt caused by that request is not useful (no timing > critical function will wait for the context object), instead the overhead > of servicing the IRQ shows up in some (lightweight) benchmarks. In order > to keep the useful property of using the request to manage the context > lifetime, we want to add a dummy request that is associated with the > interrupt from the subsequent real request following the batch. > > The extra interrupt was added as a side-effect of using > i915_add_request() in > > commit 112522f6789581824903f6f72082b5b841a7f0f9 > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Date: Thu May 2 16:48:07 2013 +0300 > > drm/i915: put context upon switching > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Ben Widawsky <ben@xxxxxxxxxxxx> > Cc: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx> > --- > drivers/gpu/drm/i915/i915_drv.h | 1 + > drivers/gpu/drm/i915/i915_gem.c | 32 +++++++++++++++++++++++++++++++- > drivers/gpu/drm/i915/i915_gem_context.c | 2 +- > 3 files changed, 33 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 9a8fac3..e822390 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1873,6 +1873,7 @@ int __i915_add_request(struct intel_ring_buffer *ring, > struct drm_file *file, > struct drm_i915_gem_object *batch_obj, > u32 *seqno); > +int __i915_add_request_marker(struct intel_ring_buffer *ring); > #define i915_add_request(ring, seqno) \ > __i915_add_request(ring, NULL, NULL, seqno) > int __must_check i915_wait_seqno(struct intel_ring_buffer *ring, > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 0fa45e0..7568bf0 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -2110,6 +2110,37 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno) > return 0; > } > > +int > +__i915_add_request_marker(struct intel_ring_buffer *ring) > +{ > + struct drm_i915_gem_request *request; > + > + request = kmalloc(sizeof(*request), GFP_KERNEL); > + if (request == NULL) > + return -ENOMEM; > + > + request->ring = ring; > + request->seqno = intel_ring_get_seqno(ring); > + request->tail = request->head = intel_ring_get_tail(ring); > + request->batch_obj = NULL; > + > + /* Whilst this request exists, batch_obj will be on the > + * active_list, and so will hold the active reference. Only when this > + * request is retired will the the batch_obj be moved onto the > + * inactive_list and lose its active reference. Hence we do not need > + * to explicitly hold another reference here. > + */ > + request->ctx = ring->last_context; > + if (request->ctx) > + i915_gem_context_reference(request->ctx); > + > + request->emitted_jiffies = jiffies; > + list_add_tail(&request->list, &ring->request_list); > + request->file_priv = NULL; > + > + return 0; > +} > + > int __i915_add_request(struct intel_ring_buffer *ring, > struct drm_file *file, > struct drm_i915_gem_object *obj, > @@ -2137,7 +2168,6 @@ int __i915_add_request(struct intel_ring_buffer *ring, > if (request == NULL) > return -ENOMEM; > > - > /* Record the position of the start of the request so that > * should we detect the updated seqno part-way through the > * GPU processing the request, we never over-estimate the > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > index db94aca..88a9425 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -466,7 +466,7 @@ static int do_switch(struct i915_hw_context *to) > from->obj->dirty = 1; > BUG_ON(from->obj->ring != ring); > > - ret = i915_add_request(ring, NULL); > + ret = __i915_add_request_marker(ring); Can't we just ditch the add_request here? The actual backing storage of the old context won't get reaped too early (due to the oustanding lazy request logic) and as long as we don't schedule a batch we don't care one bit that there's no request with a context linked to it. Only once we emit batches we care, since otherwise the hangcheck code can't assign the blame. Ripping out the add_request would also allow us to ditch the scary WARN + mi_set_context stuff below. Or do I massively miss something here? -Daniel > if (ret) { > /* Too late, we've already scheduled a context switch. > * Try to undo the change so that the hw state is > -- > 1.8.4.rc1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx