On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@xxxxxxxxx wrote:
> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>
> It is a bad idea for i915_add_request() to fail. The work will already
> have been sent to the ring and will be processed, but there will not be
> any tracking or management of that work.
>
> The only way the add request call can fail is if it can't write its
> epilogue commands to the ring (cache flushing, seqno updates, interrupt
> signalling). The reasons for that are mostly down to running out of ring
> buffer space and the problems associated with trying to get some more.
> This patch prevents that situation from happening in the first place.
>
> When a request is created, it marks sufficient space as reserved for the
> epilogue commands, thus guaranteeing that by the time the epilogue is
> written there will be plenty of space for it. Note that a ring_begin()
> call is required to actually reserve the space (and do any potential
> waiting). However, that is not currently done at request creation time.
> This is because the ring_begin() code can allocate a request. Hence
> calling begin() from the request allocation code would lead to infinite
> recursion! Later patches in this series remove the need for begin() to
> do the allocation. At that point, it becomes safe for the allocation to
> call begin() and really reserve the space.
>
> Until then, there is a potential for insufficient space to be available
> at the point of calling i915_add_request(). However, that would only be
> in the case where the request was created and immediately submitted
> without ever calling ring_begin() and adding any work to that request,
> which should never happen. And even if it does, and the request happens
> to hit the tiny window where it can fail for being out of ring space,
> it hardly matters: the request wasn't doing anything in the first place.
>
> v2: Updated the 'reserved space too small' warning to include the
> offending sizes. Added a 'cancel' operation to clean up when a request
> is abandoned. Added re-initialisation of tracking state after a buffer
> wrap to keep the sanity checks accurate.
>
> v3: Incremented the reserved size to accommodate Ironlake (after
> finally managing to run on an ILK system). Also fixed missing wrap code
> in LRC mode.
>
> v4: Added extra comment and removed duplicate WARN (feedback from
> Tomas).
>
> v5: Re-write of wrap handling to prevent unnecessary early wraps
> (feedback from Daniel Vetter).

This didn't actually implement what I suggested (wrapping is the worst
case, hence skipping the check for that breaks the sanity check) and so
changed the patch from "correct, but a bit fragile" to broken. I've
merged the previous version instead.
-Daniel
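
For anyone trying to follow the mechanism from the diff alone, the
intended lifecycle of the reservation is roughly the following. This is
an illustrative sketch only, not kernel code: error handling, locking
and the OLR plumbing are omitted, and only the helper names are the
ones the patch adds.

        /* 1. Request creation notes how much epilogue space is needed: */
        intel_ring_reserved_space_reserve(ringbuf, MIN_SPACE_FOR_ADD_REQUEST);

        /* 2. Any later ring_begin()/prepare call waits for the caller's
         *    bytes _plus_ reserved_size, so the epilogue space is
         *    guaranteed alongside the new commands: */
        ret = intel_ring_begin(ring, num_dwords);

        /* ... caller emits its commands ... */

        /* 3. __i915_add_request() then consumes the reservation; from
         *    here on, waiting for space is forbidden: */
        intel_ring_reserved_space_use(ringbuf);
        /* ... emit flushes, seqno write, user interrupt ... */
        intel_ring_reserved_space_end(ringbuf);  /* sanity check the size */

        /* Or, if the request is abandoned before submission: */
        i915_gem_request_cancel(req);            /* drops the reservation */
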
> For: VIZ-5115
> CC: Tomas Elf <tomas.elf@xxxxxxxxx>
> CC: Daniel Vetter <daniel@xxxxxxxx>
> Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  1 +
>  drivers/gpu/drm/i915/i915_gem.c         | 37 ++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c        | 35 +++++++++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 98 +++++++++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 25 ++++++++
>  5 files changed, 186 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0347eb9..eba1857 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2187,6 +2187,7 @@ struct drm_i915_gem_request {
>
>  int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  			   struct intel_context *ctx);
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  void i915_gem_request_free(struct kref *req_ref);
>
>  static inline uint32_t
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 81f3512..85fa27b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2485,6 +2485,13 @@ int __i915_add_request(struct intel_engine_cs *ring,
>  	} else
>  		ringbuf = ring->buffer;
>
> +	/*
> +	 * To ensure that this call will not fail, space for its emissions
> +	 * should already have been reserved in the ring buffer. Let the ring
> +	 * know that it is time to use that space up.
> +	 */
> +	intel_ring_reserved_space_use(ringbuf);
> +
>  	request_start = intel_ring_get_tail(ringbuf);
>  	/*
>  	 * Emit any outstanding flushes - execbuf can fail to emit the flush
> @@ -2567,6 +2574,9 @@ int __i915_add_request(struct intel_engine_cs *ring,
>  		   round_jiffies_up_relative(HZ));
>  	intel_mark_busy(dev_priv->dev);
>
> +	/* Sanity check that the reserved size was large enough. */
> +	intel_ring_reserved_space_end(ringbuf);
> +
>  	return 0;
>  }
>
> @@ -2666,6 +2676,26 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  	if (ret)
>  		goto err;
>
> +	/*
> +	 * Reserve space in the ring buffer for all the commands required to
> +	 * eventually emit this request. This is to guarantee that the
> +	 * i915_add_request() call can't fail. Note that the reserve may need
> +	 * to be redone if the request is not actually submitted straight
> +	 * away, e.g. because a GPU scheduler has deferred it.
> +	 *
> +	 * Note further that this call merely notes the reserve request. A
> +	 * subsequent call to *_ring_begin() is required to actually ensure
> +	 * that the reservation is available. Without the begin, if the
> +	 * request creator immediately submitted the request without adding
> +	 * any commands to it then there might not actually be sufficient
> +	 * room for the submission commands. Unfortunately, the current
> +	 * *_ring_begin() implementations potentially call back here to
> +	 * i915_gem_request_alloc(). Thus calling _begin() here would lead to
> +	 * infinite recursion! Until that back call path is removed, it is
> +	 * necessary to do a manual _begin() outside.
> +	 */
> +	intel_ring_reserved_space_reserve(req->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> +
>  	ring->outstanding_lazy_request = req;
>  	return 0;
>
> @@ -2674,6 +2704,13 @@ err:
>  	return ret;
>  }
>
> +void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> +{
> +	intel_ring_reserved_space_cancel(req->ringbuf);
> +
> +	i915_gem_request_unreference(req);
> +}
> +
>  struct drm_i915_gem_request *
>  i915_gem_find_active_request(struct intel_engine_cs *ring)
>  {
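
A usage note on the new cancel call (hypothetical caller, not from this
patch): any path that allocates a request but bails out before reaching
__i915_add_request() is now expected to cancel it, so the reservation
does not leak into the next request on that ring. Something like:

        /* Hypothetical error path, for illustration only;
         * emit_some_commands() is a stand-in for real work. */
        ret = i915_gem_request_alloc(ring, ctx); /* reserves epilogue space */
        if (ret)
                return ret;

        ret = emit_some_commands(ring);
        if (ret) {
                /* Abandon the request: drop the reservation and the
                 * reference instead of submitting it. */
                i915_gem_request_cancel(ring->outstanding_lazy_request);
                return ret;
        }
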
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6a5ed07..bd62bd6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -690,6 +690,9 @@ static int logical_ring_wait_for_space(struct intel_ringbuffer *ringbuf,
>  	if (intel_ring_space(ringbuf) >= bytes)
>  		return 0;
>
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	list_for_each_entry(request, &ring->request_list, list) {
>  		/*
>  		 * The request queue is per-engine, so can contain requests
> @@ -748,8 +751,12 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>  	int rem = ringbuf->size - ringbuf->tail;
>
>  	if (ringbuf->space < rem) {
> -		int ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
> +		int ret;
> +
> +		/* Can't wait if space has already been reserved! */
> +		WARN_ON(ringbuf->reserved_in_use);
>
> +		ret = logical_ring_wait_for_space(ringbuf, ctx, rem);
>  		if (ret)
>  			return ret;
>  	}
> @@ -768,7 +775,7 @@ static int logical_ring_wrap_buffer(struct intel_ringbuffer *ringbuf,
>  static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>  				struct intel_context *ctx, int bytes)
>  {
> -	int ret;
> +	int ret, max_bytes;
>
>  	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>  		ret = logical_ring_wrap_buffer(ringbuf, ctx);
> @@ -776,8 +783,28 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf,
>  		return ret;
>  	}
>
> -	if (unlikely(ringbuf->space < bytes)) {
> -		ret = logical_ring_wait_for_space(ringbuf, ctx, bytes);
> +	/*
> +	 * Add on the reserved size to the request to make sure that after
> +	 * the intended commands have been emitted, there is guaranteed to
> +	 * still be enough free space to send them to the hardware.
> +	 */
> +	max_bytes = bytes + ringbuf->reserved_size;
> +
> +	if (unlikely(ringbuf->space < max_bytes)) {
> +		/*
> +		 * Bytes is guaranteed to fit within the tail of the buffer,
> +		 * but the reserved space may push it off the end. If so then
> +		 * need to wait for the whole of the tail plus the reserved
> +		 * size. That should guarantee that the actual request
> +		 * (bytes) will fit between here and the end and the reserved
> +		 * usage will fit either in the same or at the start. Either
> +		 * way, if a wrap occurs it will not involve a wait and thus
> +		 * cannot fail.
> +		 */
> +		if (unlikely(ringbuf->tail + max_bytes + I915_RING_FREE_SPACE > ringbuf->effective_size))
> +			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
> +
> +		ret = logical_ring_wait_for_space(ringbuf, ctx, max_bytes);
>  		if (unlikely(ret))
>  			return ret;
>  	}
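
To make the wrap check concrete, a worked example (numbers invented for
illustration; I915_RING_FREE_SPACE being the slack the driver always
keeps free between tail and head):

        /* size = effective_size = 4096, I915_RING_FREE_SPACE = 64,
         * tail = 3800, bytes = 200, reserved_size = 160.
         *
         * bytes alone fits below the end:    3800 + 200      = 4000 <= 4096
         * with the reservation it may not:   3800 + 360 + 64 = 4224 >  4096
         *
         * so wait instead for the whole tail-to-end span plus the
         * reservation:
         *     max_bytes = 160 + 64 + (4096 - 3800) = 520
         *
         * The 200 real bytes then land in [3800, 4000); the epilogue wraps,
         * noop-fills [4000, 4096) and lands in [0, 160) - all of which was
         * already free, so the wrap itself can never wait and thus cannot
         * fail.
         */
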
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d934f85..1c125e9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2106,6 +2106,9 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>  	if (intel_ring_space(ringbuf) >= n)
>  		return 0;
>
> +	/* The whole point of reserving space is to not wait! */
> +	WARN_ON(ringbuf->reserved_in_use);
> +
>  	list_for_each_entry(request, &ring->request_list, list) {
>  		space = __intel_ring_space(request->postfix, ringbuf->tail,
>  					   ringbuf->size);
> @@ -2131,7 +2134,12 @@ static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>  	int rem = ringbuf->size - ringbuf->tail;
>
>  	if (ringbuf->space < rem) {
> -		int ret = ring_wait_for_space(ring, rem);
> +		int ret;
> +
> +		/* Can't wait if space has already been reserved! */
> +		WARN_ON(ringbuf->reserved_in_use);
> +
> +		ret = ring_wait_for_space(ring, rem);
>  		if (ret)
>  			return ret;
>  	}
> @@ -2180,11 +2188,69 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
>  	return 0;
>  }
>
> -static int __intel_ring_prepare(struct intel_engine_cs *ring,
> -				int bytes)
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> +{
> +	/* NB: Until request management is fully tidied up and the OLR is
> +	 * removed, there are too many ways to get false hits on this
> +	 * anti-recursion check! */
> +	/*WARN_ON(ringbuf->reserved_size);*/
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size = size;
> +
> +	/*
> +	 * Really need to call _begin() here but that currently leads to
> +	 * recursion problems! This will be fixed later but for now just
> +	 * return and hope for the best. Note that there is only a real
> +	 * problem if the creator of the request never actually calls
> +	 * _begin() but if they are not submitting any work then why did
> +	 * they create the request in the first place?
> +	 */
> +}
> +
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_size = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(ringbuf->reserved_in_use);
> +
> +	ringbuf->reserved_in_use = true;
> +	ringbuf->reserved_tail = ringbuf->tail;
> +}
> +
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
> +{
> +	WARN_ON(!ringbuf->reserved_in_use);
> +	if (ringbuf->tail > ringbuf->reserved_tail) {
> +		WARN(ringbuf->tail > ringbuf->reserved_tail + ringbuf->reserved_size,
> +		     "request reserved size too small: %d vs %d!\n",
> +		     ringbuf->tail - ringbuf->reserved_tail, ringbuf->reserved_size);
> +	} else {
> +		/*
> +		 * The ring was wrapped while the reserved space was in use.
> +		 * That means that some unknown amount of the ring tail was
> +		 * no-op filled and skipped. Thus simply adding the ring size
> +		 * to the tail and doing the above space check will not work.
> +		 * Rather than attempt to track how much tail was skipped,
> +		 * it is much simpler to say that also skipping the sanity
> +		 * check every once in a while is not a big issue.
> +		 */
> +	}
> +
> +	ringbuf->reserved_size = 0;
> +	ringbuf->reserved_in_use = false;
> +}
> +
> +static int __intel_ring_prepare(struct intel_engine_cs *ring, int bytes)
>  {
>  	struct intel_ringbuffer *ringbuf = ring->buffer;
> -	int ret;
> +	int ret, max_bytes;
>
>  	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>  		ret = intel_wrap_ring_buffer(ring);
> @@ -2192,8 +2258,28 @@ static int __intel_ring_prepare(struct intel_engine_cs *ring,
>  		return ret;
>  	}
>
> -	if (unlikely(ringbuf->space < bytes)) {
> -		ret = ring_wait_for_space(ring, bytes);
> +	/*
> +	 * Add on the reserved size to the request to make sure that after
> +	 * the intended commands have been emitted, there is guaranteed to
> +	 * still be enough free space to send them to the hardware.
> +	 */
> +	max_bytes = bytes + ringbuf->reserved_size;
> +
> +	if (unlikely(ringbuf->space < max_bytes)) {
> +		/*
> +		 * Bytes is guaranteed to fit within the tail of the buffer,
> +		 * but the reserved space may push it off the end. If so then
> +		 * need to wait for the whole of the tail plus the reserved
> +		 * size. That should guarantee that the actual request
> +		 * (bytes) will fit between here and the end and the reserved
> +		 * usage will fit either in the same or at the start. Either
> +		 * way, if a wrap occurs it will not involve a wait and thus
> +		 * cannot fail.
> +		 */
> +		if (unlikely(ringbuf->tail + max_bytes > ringbuf->effective_size))
> +			max_bytes = ringbuf->reserved_size + I915_RING_FREE_SPACE + ringbuf->size - ringbuf->tail;
> +
> +		ret = ring_wait_for_space(ring, max_bytes);
>  		if (unlikely(ret))
>  			return ret;
>  	}
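
To see why the wrapped case in intel_ring_reserved_space_end() cannot
easily be checked, consider an invented example:

        /* size = 4096, reserved_tail = 4000, reserved_size = 160.
         *
         * If the epilogue wraps, some unknown stretch of the old tail is
         * noop-filled first, say [4000, 4096), and the real commands land
         * at the start, leaving tail = 160. Now tail < reserved_tail, and
         * the total consumed is (4096 - 4000) + 160 = 256 bytes, of which
         * only 160 were actual epilogue commands - the 96 noop bytes depend
         * on where the wrap happened, which is not tracked. A naive
         * "tail + size - reserved_tail > reserved_size" check would fire
         * spuriously here, hence the check is simply skipped whenever
         * tail <= reserved_tail.
         */
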
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 39f6dfc..bf2ac28 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -105,6 +105,9 @@ struct intel_ringbuffer {
>  	int space;
>  	int size;
>  	int effective_size;
> +	int reserved_size;
> +	int reserved_tail;
> +	bool reserved_in_use;
>
>  	/** We track the position of the requests in the ring buffer, and
>  	 * when each is retired we increment last_retired_head as the GPU
> @@ -450,4 +453,26 @@ intel_ring_get_request(struct intel_engine_cs *ring)
>  	return ring->outstanding_lazy_request;
>  }
>
> +/*
> + * Arbitrary size for largest possible 'add request' sequence. The code paths
> + * are complex and variable. Empirical measurement shows that the worst case
> + * is ILK at 136 words. Reserving too much is better than reserving too little
> + * as that allows for corner cases that might have been missed. So the figure
> + * has been rounded up to 160 words.
> + */
> +#define MIN_SPACE_FOR_ADD_REQUEST	160
> +
> +/*
> + * Reserve space in the ring to guarantee that the i915_add_request() call
> + * will always have sufficient room to do its stuff. The request creation
> + * code calls this automatically.
> + */
> +void intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size);
> +/* Cancel the reservation, e.g. because the request is being discarded. */
> +void intel_ring_reserved_space_cancel(struct intel_ringbuffer *ringbuf);
> +/* Use the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_use(struct intel_ringbuffer *ringbuf);
> +/* Finish with the reserved space - for use by i915_add_request() only. */
> +void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
> +
>  #endif /* _INTEL_RINGBUFFER_H_ */
> -- 
> 1.7.9.5

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx