On Tue, Dec 08, 2015 at 02:03:39PM +0000, Tvrtko Ursulin wrote:
> On 08/12/15 10:44, Chris Wilson wrote:
> >On Mon, Dec 07, 2015 at 03:08:28PM +0000, Tvrtko Ursulin wrote:
> >>Equally, why wouldn't we wake up all waiters for which the requests
> >>have been completed?
> >
> >Because we no longer track the requests to be completed, having moved to
> >a chain of waiting processes instead of a chain of requests. I could
> >insert a waitqueue into the intel_breadcrumb and that would indeed
> >necessitate locking in the irq handler (and irq locks everywhere :(.
>
> You have a tree of seqnos each with a wait->task and current seqno
> on the engine can be queried. So I don't see where is the problem?

The "problem" is that every process will do its own post-irq seqno check
after being woken up, then grab the common spinlock to remove itself from
the tree. We could avoid that with RB_EMPTY_NODE short-circuiting, I
suppose.

> >>Would be a cheap check here and it would save a cascading growing
> >>latency as one task wakes up the next one.
> >
> >Well, it can't be here since we may remove_waiter after a signal
> >(incomplete wait). So this part has to walk the chain of processes. Ugh,
> >and have to move the waitqueue from one waiter to the next...
>
> Ok on interrupted waiters it makes no sense, but on normal waiter
> removal it would just mean comparing engine->get_seqno() vs the
> first waiter seqno and waking up all until the uncompleted one is
> found?

It's a tradeoff whether it can be written more neatly than:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index be76086..474636f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1279,6 +1279,9 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
+		if (RB_EMPTY_NODE(&wait.node))
+			break;
+
 		set_task_state(wait.task, state);
 
 		/* Before we do the heavier coherent read of the seqno,
@@ -1312,7 +1315,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 		break;
 	}
 out:
-	intel_engine_remove_breadcrumb(req->ring, &wait);
+	intel_engine_remove_breadcrumb(req->ring, &wait, ret);
 	__set_task_state(wait.task, TASK_RUNNING);
 	trace_i915_gem_request_wait_end(req);
 
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index ae3ee3c..421c214 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -169,10 +169,14 @@ void intel_engine_enable_fake_irq(struct intel_engine_cs *engine)
 }
 
 void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
-				    struct intel_breadcrumb *wait)
+				    struct intel_breadcrumb *wait,
+				    int ret)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
 
+	if (RB_EMPTY_NODE(&wait->node))
+		return;
+
 	spin_lock(&b->lock);
 
 	if (b->first_waiter == wait->task) {
@@ -187,6 +191,18 @@ void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
 		 * completion check.
 		 */
 		next = rb_next(&wait->node);
+		if (ret == 0) {
+			u32 seqno = intel_ring_get_seqno(engine);
+			while (next &&
+			       i915_seqno_passed(seqno,
+						 to_crumb(next)->seqno)) {
+				struct rb_node *n = rb_next(next);
+				wake_up_process(to_crumb(next)->task);
+				rb_erase(next, &b->requests);
+				RB_CLEAR_NODE(next);
+				next = n;
+			}
+		}
 
 		task = next ? to_crumb(next)->task : NULL;
 		b->first_waiter = task;

My biggest complaint is that we are now mixing the request-complete check
with direct evaluation of i915_seqno_passed. (I expect that we can tidy
the code up somewhat.)
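To make the intended semantics concrete, here is a minimal userspace
sketch of that wake-up walk. A sorted singly-linked list stands in for
the engine's rbtree, wake_up() just prints, and seqno_passed() mirrors
the wrap-safe comparison i915_seqno_passed() performs; all names here
are illustrative rather than the driver's API:

#include <stdio.h>
#include <stdint.h>

/* Wrap-safe seqno comparison, modelled on i915_seqno_passed(). */
static int seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Stand-in for an intel_breadcrumb; a sorted list replaces the rbtree. */
struct crumb {
	uint32_t seqno;
	const char *task;	/* stands in for wait->task */
	struct crumb *next;	/* stands in for rb_next() */
};

static void wake_up(const struct crumb *c)
{
	printf("waking %s (seqno %u)\n", c->task, c->seqno);
}

/*
 * On a normal (ret == 0) removal, walk forward from the departing
 * waiter's successor, waking every waiter whose seqno the hardware has
 * already passed; the first incomplete waiter becomes the new
 * bottom-half and is returned.
 */
static struct crumb *wake_completed(struct crumb *next, uint32_t hw_seqno)
{
	while (next && seqno_passed(hw_seqno, next->seqno)) {
		struct crumb *n = next->next;
		wake_up(next);	/* wake_up_process() in the kernel */
		next = n;
	}
	return next;
}

int main(void)
{
	struct crumb c = { 12, "task-C", NULL };
	struct crumb b = { 11, "task-B", &c };
	struct crumb a = { 10, "task-A", &b };

	/* task-A completes (hw seqno 11 >= 10) and removes itself; the
	 * walk then wakes task-B (11 >= 11) but leaves task-C waiting. */
	struct crumb *first = wake_completed(a.next, 11);
	printf("new first waiter: %s\n", first ? first->task : "(none)");
	return 0;
}

The attraction is that only the one departing waiter takes the spinlock
per batch of wake-ups; the woken processes find their node already
erased and can short-circuit via the RB_EMPTY_NODE check above.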
-Chris

--
Chris Wilson, Intel Open Source Technology Centre