On 20-04-16 at 19:09, John.C.Harrison@xxxxxxxxx wrote:
> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>
> The intended usage model for struct fence is that the signalled status
> should be set on demand rather than polled. That is, there should not
> be a need for a 'signaled' function to be called every time the status
> is queried. Instead, 'something' should be done to enable a signal
> callback from the hardware which will update the state directly. In
> the case of requests, this is the seqno update interrupt. The idea is
> that this callback will only be enabled on demand when something
> actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback
> scheme. Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke
> me' list when a new seqno pops out and signals any matching
> fence/request. The fence is then removed from the list so the entire
> request stack does not need to be scanned every time. Note that the
> fence is added to the list before the commands to generate the seqno
> interrupt are added to the ring. Thus the sequence is guaranteed to be
> race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when
> __wait_request() is called). Thus there is still a potential race when
> enabling the interrupt as the request may already have completed.
> However, this is simply solved by calling the interrupt processing
> code immediately after enabling the interrupt and thereby checking for
> already completed requests.
>
> Lastly, the ring clean up code has the possibility to cancel
> outstanding requests (e.g. because TDR has reset the ring). These
> requests will never get signalled and so must be removed from the
> signal list manually. This is done by setting a 'cancelled' flag and
> then calling the regular notify/retire code path rather than
> attempting to duplicate the list manipulation and clean up code in
> multiple places. This also avoids any race condition where the
> cancellation request might occur after/during the completion interrupt
> actually arriving.
>
> v2: Updated to take advantage of the request unreference no longer
> requiring the mutex lock.
>
> v3: Moved the signal list processing around to prevent unsubmitted
> requests from being added to the list. This was occurring on Android
> because the native sync implementation calls the
> fence->enable_signalling API immediately on fence creation.
>
> Updated after review comments by Tvrtko Ursulin. Renamed list nodes to
> 'link' instead of 'list'. Added support for returning an error code on
> a cancelled fence. Updated list processing to be more efficient/safer
> with respect to spinlocks.
>
> v5: Made i915_gem_request_submit static as it is only ever called
> from one place.
>
> Fixed up the low latency wait optimisation. The time delay between the
> seqno value being written to memory and the driver's ISR running can
> be significant, at least for the wait request micro-benchmark. This
> can be greatly improved by explicitly checking for seqno updates in
> the pre-wait busy poll loop. Also added some documentation comments to
> the busy poll code.
>
> Fixed up support for the faking of lost interrupts
> (test_irq_rings/missed_irq_rings). That is, there is an IGT test that
> tells the driver to lose interrupts deliberately and then checks that
> everything still works as expected (albeit much slower).
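As a reader aid: the enable-on-demand plus immediate-check scheme described
in the paragraphs above could be sketched as below. This is only an
illustration written against the i915_gem_request_enable_interrupt() /
i915_gem_request_notify() prototypes added in the header diff further down,
not the patch body itself:

static void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req,
					      bool fence_locked)
{
	struct intel_engine_cs *engine = req->engine;

	if (req->irq_enabled)
		return;

	/* Ask the hardware for a seqno-update (user) interrupt. */
	WARN_ON(!engine->irq_get(engine));
	req->irq_enabled = true;

	/*
	 * The request may already have completed before the interrupt
	 * was enabled, in which case no further interrupt will arrive
	 * for it. Scanning the signal list immediately closes that race.
	 */
	i915_gem_request_notify(engine, fence_locked);
}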
> Updates from review comments: use non-IRQ-save spinlocking, early exit
> on WARN and improved comments (Tvrtko Ursulin).
>
> v6: Updated to newer nightly and resolved conflicts around the
> wait_request busy spin optimisation. Also fixed a race condition
> between this early exit path and the regular completion path.
>
> v7: Updated to newer nightly - lots of ring -> engine renaming plus an
> interface change on get_seqno(). Also added a list_empty() check
> before acquiring spinlocks and doing list processing.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |   8 +
>  drivers/gpu/drm/i915/i915_gem.c         | 249 +++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_irq.c         |   2 +
>  drivers/gpu/drm/i915/intel_lrc.c        |   2 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   2 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   2 +
>  6 files changed, 241 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 81324ba..9519b11 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2255,7 +2255,12 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
>  struct drm_i915_gem_request {
>  	/** Underlying object for implementing the signal/wait stuff. */
>  	struct fence fence;
> +	struct list_head signal_link;
> +	struct list_head unsignal_link;
>  	struct list_head delayed_free_link;
> +	bool cancelled;
> +	bool irq_enabled;
> +	bool signal_requested;
>  
>  	/** On Which ring this request was generated */
>  	struct drm_i915_private *i915;
> @@ -2341,6 +2346,9 @@ struct drm_i915_gem_request * __must_check
>  i915_gem_request_alloc(struct intel_engine_cs *engine,
>  		       struct intel_context *ctx);
>  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req,
> +				       bool fence_locked);
> +void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked);
>  
>  int i915_create_fence_timeline(struct drm_device *dev,
>  			       struct intel_context *ctx,
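The body of i915_gem_request_notify() declared above is not in the quoted
hunks; based purely on the commit message (the 'poke me' list scan, the
cancelled flag, and the v7 list_empty() early-out) I would expect it to look
roughly like the sketch below. The fence_signal_list name and the exact
locking are my assumptions, not the patch:

void i915_gem_request_notify(struct intel_engine_cs *engine, bool fence_locked)
{
	struct drm_i915_gem_request *req, *req_next;
	uint32_t seqno;

	/*
	 * Lockless early-out (v7): if nothing is waiting for a signal,
	 * don't take the spinlock on every interrupt. A request added
	 * concurrently is picked up by the next seqno interrupt.
	 */
	if (list_empty(&engine->fence_signal_list))
		return;

	/*
	 * Non-IRQ-save locking as per the v5 review note; callers in
	 * process context disable interrupts around this (see the
	 * spin_lock_irq() in __i915_wait_request below).
	 */
	if (!fence_locked)
		spin_lock(&engine->fence_lock);

	/* Barrier + read, same pattern as the wait loop below. */
	if (engine->irq_seqno_barrier)
		engine->irq_seqno_barrier(engine);
	seqno = engine->get_seqno(engine);

	list_for_each_entry_safe(req, req_next, &engine->fence_signal_list,
				 signal_link) {
		/*
		 * Cancelled requests (e.g. after a TDR ring reset) never
		 * get a completion interrupt; they are signalled here with
		 * fence.status already set to -EIO by the retire path.
		 */
		if (!req->cancelled && !i915_seqno_passed(seqno, req->seqno))
			continue;

		list_del_init(&req->signal_link);
		fence_signal_locked(&req->fence);

		/* Drop the interrupt reference taken when waiting began. */
		if (req->irq_enabled) {
			engine->irq_put(engine);
			req->irq_enabled = false;
		}
	}

	if (!fence_locked)
		spin_unlock(&engine->fence_lock);
}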
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9071058..9b1de5f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -40,6 +40,8 @@
>  
>  #define RQ_BUG_ON(expr)
>  
> +static void i915_gem_request_submit(struct drm_i915_gem_request *req);
> +
>  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
>  static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
>  static void
> @@ -1253,9 +1255,8 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	struct intel_engine_cs *engine = i915_gem_request_get_engine(req);
>  	struct drm_device *dev = engine->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const bool irq_test_in_progress =
> -		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_engine_flag(engine);
>  	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
> +	uint32_t seqno;
>  	DEFINE_WAIT(wait);
>  	unsigned long timeout_expire;
>  	s64 before = 0; /* Only to silence a compiler warning. */
> @@ -1263,9 +1264,6 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  
>  	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
>  
> -	if (list_empty(&req->list))
> -		return 0;
> -
>  	if (i915_gem_request_completed(req))
>  		return 0;
>  
> @@ -1291,15 +1289,17 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	trace_i915_gem_request_wait_begin(req);
>  
>  	/* Optimistic spin for the next jiffie before touching IRQs */
> -	ret = __i915_spin_request(req, state);
> -	if (ret == 0)
> -		goto out;
> -
> -	if (!irq_test_in_progress && WARN_ON(!engine->irq_get(engine))) {
> -		ret = -ENODEV;
> -		goto out;
> +	if (req->seqno) {
> +		ret = __i915_spin_request(req, state);
> +		if (ret == 0)
> +			goto out;
>  	}
>  
> +	/*
> +	 * Enable interrupt completion of the request.
> +	 */
> +	fence_enable_sw_signaling(&req->fence);
> +
>  	for (;;) {
>  		struct timer_list timer;
>  
> @@ -1321,6 +1321,21 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			break;
>  		}
>  
> +		if (req->seqno) {
> +			/*
> +			 * There is quite a lot of latency in the user interrupt
> +			 * path. So do an explicit seqno check and potentially
> +			 * remove all that delay.
> +			 */
> +			if (req->engine->irq_seqno_barrier)
> +				req->engine->irq_seqno_barrier(req->engine);
> +			seqno = engine->get_seqno(engine);

You're using this barrier + get_seqno pattern multiple times; maybe write a
helper function for this?

> +			if (i915_seqno_passed(seqno, req->seqno)) {
> +				ret = 0;
> +				break;
> +			}
> +		}
> +
>  		if (signal_pending_state(state, current)) {
>  			ret = -ERESTARTSYS;
>  			break;
> @@ -1347,14 +1362,32 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  			destroy_timer_on_stack(&timer);
>  		}
>  	}
> -	if (!irq_test_in_progress)
> -		engine->irq_put(engine);
>  
>  	finish_wait(&engine->irq_queue, &wait);
>  
>  out:
>  	trace_i915_gem_request_wait_end(req);
>  
> +	if ((ret == 0) && (req->seqno)) {
> +		if (req->engine->irq_seqno_barrier)
> +			req->engine->irq_seqno_barrier(req->engine);
> +		seqno = engine->get_seqno(engine);
> +		if (i915_seqno_passed(seqno, req->seqno) &&
> +		    !i915_gem_request_completed(req)) {
> +			/*
> +			 * Make sure the request is marked as completed before
> +			 * returning. NB: Need to acquire the spinlock around
> +			 * the whole call to avoid a race condition where the
> +			 * interrupt handler runs concurrently and could cause
> +			 * this invocation to exit early even though the
> +			 * request has not actually been fully processed yet.
> +			 */
> +			spin_lock_irq(&req->engine->fence_lock);
> +			i915_gem_request_notify(req->engine, true);
> +			spin_unlock_irq(&req->engine->fence_lock);
> +		}
> +	}
> +
>  	if (timeout) {
>  		s64 tres = *timeout - (ktime_get_raw_ns() - before);
>  
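On the barrier + get_seqno comment above: a small helper (the name is just a
placeholder) would cover both call sites in this function plus the notify
path, e.g.:

static uint32_t i915_engine_read_seqno(struct intel_engine_cs *engine)
{
	/*
	 * Some engines need a barrier to flush the coherent seqno
	 * write before it can be read back reliably.
	 */
	if (engine->irq_seqno_barrier)
		engine->irq_seqno_barrier(engine);

	return engine->get_seqno(engine);
}

so each call site reduces to 'seqno = i915_engine_read_seqno(req->engine);'.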
> @@ -1432,6 +1465,22 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>  	list_del_init(&request->list);
>  	i915_gem_request_remove_from_client(request);
>  
> +	/*
> +	 * In case the request is still in the signal pending list,
> +	 * e.g. due to being cancelled by TDR, preemption, etc.
> +	 */
> +	if (!list_empty(&request->signal_link)) {
> +		/*
> +		 * The request must be marked as cancelled and the underlying
> +		 * fence as failed. NB: There is no explicit fence fail API,
> +		 * there is only a manual poke and signal.
> +		 */
> +		request->cancelled = true;
> +		/* How to propagate to any associated sync_fence??? */
> +		request->fence.status = -EIO;
> +		fence_signal_locked(&request->fence);
> +	}
> +

Can this still happen with

commit aa9b78104fe3210758fa9e6c644e9a108d371e8b
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Wed Apr 13 17:35:15 2016 +0100

    drm/i915: Late request cancellations are harmful

?
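If it can no longer happen after that commit, this block could arguably
shrink to a loud sanity check rather than a silent recovery path; a
hypothetical simplification:

	/*
	 * Late cancellation should be impossible now; keep the recovery
	 * but make it loud, so any remaining path here shows up in CI.
	 */
	if (WARN_ON(!list_empty(&request->signal_link))) {
		request->cancelled = true;
		request->fence.status = -EIO;
		fence_signal_locked(&request->fence);
	}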