Hi Chris,

Are you still tuning the patch, or is it ready to go? For us this is critical: it is one of the most important patches of the past few years, benefiting the whole media stack in the server segment. We would definitely like to see it upstreamed ASAP. Is there anything outstanding you would like us to experiment with?

Dmitry.

-----Original Message-----
From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx]
Sent: Wednesday, November 4, 2015 5:48 PM
To: Gong, Zhipeng
Cc: intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Rogozhkin, Dmitry V
Subject: Re: [PATCH] RFC drm/i915: Slaughter the thundering i915_wait_request herd

On Wed, Nov 04, 2015 at 01:20:33PM +0000, Gong, Zhipeng wrote:
> > -----Original Message-----
> > From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx]
> > Sent: Wednesday, November 04, 2015 5:54 PM
> > On Wed, Nov 04, 2015 at 06:19:33AM +0000, Gong, Zhipeng wrote:
> > > > From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx]
> > > > On Tue, Nov 03, 2015 at 01:31:22PM +0000, Gong, Zhipeng wrote:
> > > > > > From: Chris Wilson [mailto:chris@xxxxxxxxxxxxxxxxxx]
> > > > > >
> > > > > > Do you also have relative perf statistics, like op/s, we can
> > > > > > compare to make sure we aren't just stalling the whole system?
> > > > >
> > > > > Could you please provide the commands for how to check it?
> > > >
> > > > I was presuming your workload has some measure of efficiency/throughput?
> > > > It is one thing to say we are using 10% less CPU (per second),
> > > > but the task is running 2x as long!
> > >
> > > We use execution time as a measurement; the patch affects the
> > > execution time for our cases only slightly.
> > >
> > > Exec time(s)  | w/o patch | w/ patch
> > > --------------+-----------+---------
> > > BDW async 1   | 65.00     | 65.25
> > > BDW async 5   | 68.50     | 66.42
> >
> > That's reassuring.
> >
> > > > > > How much CPU time is left in the i915_wait_request branch? i.e.
> > > > > > how close to the limit are we with chasing this path?
> > > > >
> > > > > Could you please provide the commands here as well? :)
> > > >
> > > > Check the perf callgraph.
> > >
> > > Now most of the time is in io_schedule_timeout:
> > >
> > > __i915_wait_request
> > > |--64.04%-- io_schedule_timeout
> > > |--22.04%-- intel_engine_add_wakeup
> > > |--3.13%-- prepare_to_wait
> > > |--2.99%-- gen6_rps_boost
> > > |-...
> >
> > No more busywaits, and most of the time is spent kicking the next
> > process or doing the insertion sort into the waiting rbtree.
> >
> > What's the ratio now of __i915_wait_request to the next hot function?
> > And who are the chief callers of __i915_wait_request?
> > -Chris
>
> Please check the attachments for the details; I post a piece of it here:
>
> |--17.89%-- i915_gem_object_sync
> |--73.19%-- __i915_wait_request
> |--12.60%-- i915_gem_object_retire_request

Interesting. Most of the time is spent shuffling requests around in the execbuffer rather than doing useful work. I've been working on moving that work around, but even then we are likely to be spending our time instantiating all those new objects.

As far as trimming the CPU time from __i915_wait_request() goes, that looks about as far as we can go. If you have some free cycles on those machines, I would very much appreciate seeing the same callgraphs from a
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=nightly&id=134211e33719ef698f9bd51b72ad2fc434cb51f9
kernel.

Thanks,
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

--------------------------------------------------------------------
Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park,
17 Krylatskaya Str., Bldg 4, Moscow 121614, Russian Federation

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited.
If you are not the intended recipient, please contact the sender and delete all copies.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx