On Mon, Apr 24, 2017 at 12:07:47PM +0100, Chris Wilson wrote:
> On Mon, Apr 24, 2017 at 11:28:32AM +0100, Tvrtko Ursulin wrote:
> >
> > On 19/04/2017 10:41, Chris Wilson wrote:
> > Sounds attractive! What workloads show the benefit and how much?
>
> The default will show the best, since everything is priority 0 more or
> less and so we reduce the rbtree search to a single lookup and list_add.
> It's hard to measure the impact of the rbtree though. On the dequeue
> side, the mmio access dominates. On the schedule side, if we have lots
> of requests, the dfs dominates.
>
> I have an idea on how we might stress the rbtree in submit_request - but
> still it requires long queues untypical of most workloads. Still tbd.

I have something that does show a difference in that path (which is
potentially in hardirq). Overall time is completely dominated by the
reservation_object (ofc, we'll get back around to its scalability
patches at some point).

For a few thousand prio=0 requests in flight, the difference in
execlists_submit_request() is about 6x, and in intel_lrc_irq_handler()
about 2x (largely a factor of my sending a lot of coalesceable
requests, so rb_next reduces to list_next). This is completely
synthetic testing; I would be worried if the rbtree were ever that
tall in practice (it would imply request generation >> execution).

The neat part of the split, I think, is that it makes resubmission of a
gazumped request easier - instead of writing a parallel rbtree sort, we
just put the old request back at the head of its plist.
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
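
[Editor's note: for illustration, below is a minimal sketch of the
rbtree-of-plists shape described above. It is not the actual i915
code: the priolist/request types and the lookup_priolist(),
submit_request(), resubmit_request() and dequeue_request() names are
invented for the example, and GFP_ATOMIC merely stands in for the
potentially-hardirq submission context mentioned in the mail.]

/*
 * Hypothetical sketch of an rbtree of plists: one rbtree node per
 * priority level, each carrying a plain FIFO list of the requests at
 * that priority.  With everything at priority 0 the tree has a single
 * node, so submission is one lookup plus a list_add_tail(), and
 * back-to-back dequeues at the same priority follow list_next rather
 * than rb_next.
 */
#include <linux/rbtree.h>
#include <linux/list.h>
#include <linux/slab.h>

struct priolist {
	struct rb_node node;
	struct list_head requests;	/* FIFO of requests at this prio */
	int priority;
};

struct request {	/* stand-in for the driver's request struct */
	struct list_head link;
	int priority;
};

/* O(log #levels) rather than O(log #requests); a single lookup when
 * everything shares one priority. */
static struct priolist *lookup_priolist(struct rb_root *root, int prio)
{
	struct rb_node **p = &root->rb_node, *parent = NULL;
	struct priolist *pl;

	while (*p) {
		parent = *p;
		pl = rb_entry(parent, struct priolist, node);
		if (prio == pl->priority)
			return pl;
		if (prio > pl->priority)	/* highest priority leftmost */
			p = &parent->rb_left;
		else
			p = &parent->rb_right;
	}

	/* Submission may run in (hard)irq context, hence GFP_ATOMIC. */
	pl = kmalloc(sizeof(*pl), GFP_ATOMIC);
	if (!pl)
		return NULL;

	pl->priority = prio;
	INIT_LIST_HEAD(&pl->requests);
	rb_link_node(&pl->node, parent, p);
	rb_insert_color(&pl->node, root);
	return pl;
}

static void submit_request(struct rb_root *root, struct request *rq)
{
	struct priolist *pl = lookup_priolist(root, rq->priority);

	if (pl)
		list_add_tail(&rq->link, &pl->requests);
}

/*
 * Resubmitting a gazumped request needs no parallel rbtree sort: just
 * put it back at the head of its plist so it runs first at that
 * priority.
 */
static void resubmit_request(struct rb_root *root, struct request *rq)
{
	struct priolist *pl = lookup_priolist(root, rq->priority);

	if (pl)
		list_add(&rq->link, &pl->requests);
}

/* Dequeue the highest-priority request; rb_first() only moves on when
 * a whole priority level drains. */
static struct request *dequeue_request(struct rb_root *root)
{
	struct rb_node *rb = rb_first(root);
	struct priolist *pl;
	struct request *rq;

	if (!rb)
		return NULL;

	pl = rb_entry(rb, struct priolist, node);
	rq = list_first_entry(&pl->requests, struct request, link);
	list_del(&rq->link);

	if (list_empty(&pl->requests)) {
		rb_erase(&pl->node, root);
		kfree(pl);
	}
	return rq;
}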