On Thu, 2018-08-02 at 15:54 +0100, Chris Wilson wrote: > Quoting Tvrtko Ursulin (2018-08-02 15:47:37) > > > > On 28/07/2018 17:46, Chris Wilson wrote: > > > Here we bump the busyspin for the current request to avoid > > > sleeping and > > > the cost of both idling and downclocking the CPU. > > > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > Cc: Sagar Kamble <sagar.a.kamble@xxxxxxxxx> > > > Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx> > > > Cc: Francisco Jerez <currojerez@xxxxxxxxxx> > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > > Cc: Ben Widawsky <ben@xxxxxxxxxxxx> > > > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > > > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx> > > > --- > > > drivers/gpu/drm/i915/Kconfig.profile | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/i915/Kconfig.profile > > > b/drivers/gpu/drm/i915/Kconfig.profile > > > index de394dea4a14..54e0ba4051e7 100644 > > > --- a/drivers/gpu/drm/i915/Kconfig.profile > > > +++ b/drivers/gpu/drm/i915/Kconfig.profile > > > @@ -1,6 +1,6 @@ > > > config DRM_I915_SPIN_REQUEST_IRQ > > > int > > > - default 5 # microseconds > > > + default 250 # microseconds > > > help > > > Before sleeping waiting for a request (GPU operation) to > > > complete, > > > we may spend some time polling for its completion. As the > > > IRQ may > > > > > > > But the histograms from previous patch do not suggest 250us busy > > spin > > gets us into interesting territory. And we have to know the > > distribution > > between IRQ and CS spins. > > This is for a different problem than presented in those histograms. > This > is to hide the reclocking that occurred if we sleep. > > Previous justification for the initial spin was to try and hide the > irq > setup costs, this is to try and hide the sleep costs. > > The purpose of this patch was just to dip the toe into the water of > what > is possible and solicit feedback since we are still waiting for the > panacea patches to improve cpufreq. (Why it had such a lackluster > justification.) This patch somewhat resembles with the issue we just met in userspace for the opencl and media. Long story short there was CPU% regression between 2 opencl releases root caused to the following. OpenCL rewrote the code how they wait for the task completion. Essentially they have an optimistic spin implemented on the userspace side and logic their is quite complex: they commit to different duration of the spin depending on the scenario. And they had some pretty long spin, like 10ms. This is a killer thing if opencl is coupled with media workloads, especially when media is dominant. That's because usual execution time of media batches is pretty long, few or dozen milliseconds is quite typical. As a result opencl workload is really being submitted and executed with significant delay, thus optimistic spin gives up and it comes with the CPU% cost. This story was discussed here: https://github.com/intel/comp ute-runtime/commit/d53e1c3979f6d822f55645bfcd753be1658f53ca#comments. I was actually surprised why they don't rely on i915 wait function. They responded in was: "I would gladly always use bo_wait ioctl, but it has big overhead for short workloads." Thus the question: do we have an opportunity to improve? Can we have some dynamic way to configure spinning for the particular requests/contexts? /if I understand correctly current patch it introduces kernel config-level setting rather than some dynamic setting/ > -Chris > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx