Quoting Tvrtko Ursulin (2021-01-26 16:42:37)
>
> On 26/01/2021 16:26, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-26 16:22:58)
> >>
> >>
> >> On 25/01/2021 14:01, Chris Wilson wrote:
> >>> The core of the scheduling algorithm is that we compute the topological
> >>> order of the fence DAG. Knowing that we have a DAG, we should be able to
> >>> use a DFS to compute the topological sort in linear time. However,
> >>> during the conversion of the recursive algorithm into an iterative one,
> >>> the memoization of how far we had progressed down a branch was
> >>> forgotten. The result was that instead of running in linear time, it was
> >>> running in geometric time and could easily run for a few hundred
> >>> milliseconds given a wide enough graph, not the microseconds as required.
> >>>
> >>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >>> ---
> >>>  drivers/gpu/drm/i915/i915_scheduler.c | 58 ++++++++++++++++-----------
> >>>  1 file changed, 34 insertions(+), 24 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
> >>> index 4802c9b1081d..9139a91f0aa3 100644
> >>> --- a/drivers/gpu/drm/i915/i915_scheduler.c
> >>> +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> >>> @@ -234,6 +234,26 @@ void __i915_priolist_free(struct i915_priolist *p)
> >>>  	kmem_cache_free(global.slab_priorities, p);
> >>>  }
> >>>
> >>> +static struct i915_request *
> >>> +stack_push(struct i915_request *rq,
> >>> +	   struct i915_request *stack,
> >>> +	   struct list_head *pos)
> >>> +{
> >>> +	stack->sched.dfs.prev = pos;
> >>> +	rq->sched.dfs.next = (struct list_head *)stack;
> >>> +	return rq;
> >>> +}
> >>> +
> >>> +static struct i915_request *
> >>> +stack_pop(struct i915_request *rq,
> >>> +	  struct list_head **pos)
> >>> +{
> >>> +	rq = (struct i915_request *)rq->sched.dfs.next;
> >>> +	if (rq)
> >>> +		*pos = rq->sched.dfs.prev;
> >>> +	return rq;
> >>> +}
> >>> +
> >>>  static inline bool need_preempt(int prio, int active)
> >>>  {
> >>>  	/*
> >>> @@ -298,11 +318,10 @@ static void ipi_priority(struct i915_request *rq, int prio)
> >>>  static void __i915_request_set_priority(struct i915_request *rq, int prio)
> >>>  {
> >>>  	struct intel_engine_cs *engine = rq->engine;
> >>> -	struct i915_request *rn;
> >>> +	struct list_head *pos = &rq->sched.signalers_list;
> >>>  	struct list_head *plist;
> >>> -	LIST_HEAD(dfs);
> >>>
> >>> -	list_add(&rq->sched.dfs, &dfs);
> >>> +	plist = i915_sched_lookup_priolist(engine, prio);
> >>>
> >>>  	/*
> >>>  	 * Recursively bump all dependent priorities to match the new request.
> >>> @@ -322,40 +341,31 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
> >>>  	 * end result is a topological list of requests in reverse order, the
> >>>  	 * last element in the list is the request we must execute first.
> >>>  	 */
> >>> -	list_for_each_entry(rq, &dfs, sched.dfs) {
> >>> -		struct i915_dependency *p;
> >>> -
> >>> -		/* Also release any children on this engine that are ready */
> >>> -		GEM_BUG_ON(rq->engine != engine);
> >>> -
> >>> -		for_each_signaler(p, rq) {
> >>> +	rq->sched.dfs.next = NULL;
> >>> +	do {
> >>> +		list_for_each_continue(pos, &rq->sched.signalers_list) {
> >>> +			struct i915_dependency *p =
> >>> +				list_entry(pos, typeof(*p), signal_link);
> >>>  			struct i915_request *s =
> >>>  				container_of(p->signaler, typeof(*s), sched);
> >>>
> >>> -			GEM_BUG_ON(s == rq);
> >>> -
> >>>  			if (rq_prio(s) >= prio)
> >>>  				continue;
> >>>
> >>>  			if (__i915_request_is_complete(s))
> >>>  				continue;
> >>>
> >>> -			if (s->engine != rq->engine) {
> >>> +			if (s->engine != engine) {
> >>>  				ipi_priority(s, prio);
> >>>  				continue;
> >>>  			}
> >>>
> >>> -			list_move_tail(&s->sched.dfs, &dfs);
> >>> +			/* Remember our position along this branch */
> >>> +			rq = stack_push(s, rq, pos);
> >>> +			pos = &rq->sched.signalers_list;
> >>>  		}
> >>> -	}
> >>>
> >>> -	plist = i915_sched_lookup_priolist(engine, prio);
> >>> -
> >>> -	/* Fifo and depth-first replacement ensure our deps execute first */
> >>> -	list_for_each_entry_safe_reverse(rq, rn, &dfs, sched.dfs) {
> >>> -		GEM_BUG_ON(rq->engine != engine);
> >>> -
> >>> -		INIT_LIST_HEAD(&rq->sched.dfs);
> >>> +		RQ_TRACE(rq, "set-priority:%d\n", prio);
> >>>  		WRITE_ONCE(rq->sched.attr.priority, prio);
> >>>
> >>>  		/*
> >>> @@ -369,12 +379,13 @@ static void __i915_request_set_priority(struct i915_request *rq, int prio)
> >>>  		if (!i915_request_is_ready(rq))
> >>>  			continue;
> >>>
> >>> +		GEM_BUG_ON(rq->engine != engine);
> >>>  		if (i915_request_in_priority_queue(rq))
> >>>  			list_move_tail(&rq->sched.link, plist);
> >>>
> >>>  		/* Defer (tasklet) submission until after all updates. */
> >>>  		kick_submission(engine, rq, prio);
> >>> -	}
> >>> +	} while ((rq = stack_pop(rq, &pos)));
> >>>  }
> >>>
> >>>  void i915_request_set_priority(struct i915_request *rq, int prio)
> >>> @@ -444,7 +455,6 @@ void i915_sched_node_init(struct i915_sched_node *node)
> >>>  	INIT_LIST_HEAD(&node->signalers_list);
> >>>  	INIT_LIST_HEAD(&node->waiters_list);
> >>>  	INIT_LIST_HEAD(&node->link);
> >>> -	INIT_LIST_HEAD(&node->dfs);
> >>>
> >>>  	node->ipi_link = NULL;
> >>>
> >>>
> >>
> >> Pen and paper was needed here but it looks good.
> >
> > If you highlight the areas that need more commentary, I guess
> > a theory-of-operation for stack_push/stack_pop?
>
> At some point I wanted to suggest you change dfs.list_head abuse to
> explicit rq and list head pointer to better represent how there are two
> pieces of information tracked in there.

Ok. While writing it I thought some places continued to use it as a
struct list_head, but it appears that this is the only user. I'll give
it a whirl.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
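For readers trying to picture the stack_push()/stack_pop() theory of operation
discussed above: the DFS stack is threaded through the requests themselves,
with sched.dfs.next reused as the link back to the node we descended from and
sched.dfs.prev as the memoized position at which to resume that node's
signalers_list, which is what keeps the walk linear. Below is a minimal,
self-contained sketch of the same idea on a generic DAG (plain C; struct node,
dfs() and visit() are illustrative names rather than anything in the driver),
emitting nodes in reverse topological order.

/*
 * Minimal sketch of the iterative DFS, on a generic DAG. The "stack" is
 * threaded through the nodes themselves: each pushed node records the
 * node it descended from (stack_next) and where to resume scanning that
 * node's child list (stack_pos), so no branch is ever rescanned.
 * Illustrative names only, not the i915 structures.
 */
#include <stdbool.h>
#include <stdio.h>

struct node {
	const char *name;
	struct node **children;		/* NULL-terminated edge list */
	bool visited;

	/* DFS bookkeeping, cf. rq->sched.dfs in the patch */
	struct node *stack_next;	/* node to pop back to */
	struct node **stack_pos;	/* resume point in that node's edges */
};

static void visit(struct node *n)
{
	printf("%s\n", n->name);	/* emitted in reverse topological order */
}

static void dfs(struct node *root)
{
	struct node *n = root;
	struct node **pos = root->children;

	root->stack_next = NULL;
	do {
		/* Continue from the memoized position in n's edge list */
		while (*pos) {
			struct node *child = *pos++;

			if (child->visited)
				continue;
			child->visited = true;

			/* Push: the saved position already skips this edge */
			child->stack_next = n;
			n->stack_pos = pos;
			n = child;
			pos = child->children;
		}

		visit(n);	/* all children handled, emit n */

		/* Pop: return to the parent at its saved position */
		n = n->stack_next;
		if (n)
			pos = n->stack_pos;
	} while (n);
}

int main(void)
{
	struct node c = { "C", (struct node *[]){ NULL } };
	struct node b = { "B", (struct node *[]){ &c, NULL } };
	struct node a = { "A", (struct node *[]){ &b, &c, NULL } };

	dfs(&a);	/* prints C, B, A: dependencies before dependents */
	return 0;
}

The two pieces of per-frame state here (the parent pointer and the resume
position) are what the patch packs into the two halves of sched.dfs, which is
the list_head reuse Tvrtko suggests spelling out as an explicit request
pointer plus list position.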