Quoting Tvrtko Ursulin (2018-08-02 15:15:35)
>
> On 28/07/2018 17:46, Chris Wilson wrote:
> > An interesting discussion regarding "hybrid interrupt polling" for NVMe
> > came to the conclusion that the ideal busyspin before sleeping was half
> > of the expected request latency (and better if it was already halfway
> > through that request). This suggested that we too should look again at
> > our tradeoff between spinning and waiting. Currently, our spin simply
> > tries to hide the cost of enabling the interrupt, which is good to avoid
> > penalising nop requests (i.e. test throughput) and not much else.
> > Studying real-world workloads suggests that a spin of up to 500us can
> > dramatically boost performance, but the suggestion is that this is not
> > from avoiding interrupt latency per se, but from secondary effects of
> > sleeping such as allowing the CPU to reduce cstate and context switch
> > away.
> >
> > In a truly hybrid interrupt polling scheme, we would aim to sleep until
> > just before the request completed and then wake up in advance of the
> > interrupt and do a quick poll to handle completion. This is tricky for
> > ourselves at the moment as we are not recording request times, and since
> > we allow preemption, our requests are not on as nicely ordered a
> > timeline as IO. However, the idea is interesting, for it will certainly
> > help us decide when busyspinning is worthwhile.
> >
> > v2: Expose the spin setting via Kconfig options for easier adjustment
> >     and testing.
> > v3: Don't get caught sneaking in a change to the busyspin parameters.
> > v4: Explain more about the "hybrid interrupt polling" scheme that we
> >     want to migrate towards.
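To make the "spin for half the expected latency, then sleep" idea above concrete, here is a minimal userspace sketch. It is not the code from the patch: the names (fake_request, spin_budget_ns, hybrid_wait), the simulated clock and the cap parameter are all assumptions made for illustration, and the "sleep" is only reported, not performed.

```c
#include <stdint.h>

/* Hypothetical outcome of a wait; illustration only, not an i915 type. */
enum wait_result { WAIT_SPIN_HIT, WAIT_SLEPT };

/* Hypothetical request model: it is complete once the clock reaches done_ns. */
struct fake_request {
	uint64_t done_ns;
};

/* Spin for half of the expected latency, but never longer than cap_ns. */
static uint64_t spin_budget_ns(uint64_t expected_ns, uint64_t cap_ns)
{
	uint64_t half = expected_ns / 2;

	return half < cap_ns ? half : cap_ns;
}

/*
 * Poll the request on a simulated clock that advances by step_ns per poll.
 * Returns WAIT_SPIN_HIT if the request completed within the spin budget,
 * WAIT_SLEPT if the budget ran out and the caller would now enable the
 * interrupt and sleep. *polls reports how many polls were made.
 */
static enum wait_result hybrid_wait(const struct fake_request *rq,
				    uint64_t now_ns, uint64_t expected_ns,
				    uint64_t cap_ns, uint64_t step_ns,
				    uint64_t *polls)
{
	uint64_t deadline_ns = now_ns + spin_budget_ns(expected_ns, cap_ns);

	if (!step_ns)
		step_ns = 1; /* guarantee forward progress of the fake clock */

	*polls = 0;
	do {
		(*polls)++;
		if (now_ns >= rq->done_ns)
			return WAIT_SPIN_HIT;
		now_ns += step_ns; /* stands in for a relax plus elapsed time */
	} while (now_ns < deadline_ns);

	return WAIT_SLEPT;
}
```

With an expected latency of 400ns the budget is 200ns, so a request finishing at 100ns is caught by the spin, while one finishing at 1000ns falls through to the sleep path. The truly hybrid variant (sleep first, poll just before completion) would additionally need per-request timing, which as noted above we do not record yet.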
> >
> > Suggested-by: Sagar Kamble <sagar.a.kamble@xxxxxxxxx>
> > References: http://events.linuxfoundation.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Sagar Kamble <sagar.a.kamble@xxxxxxxxx>
> > Cc: Eero Tamminen <eero.t.tamminen@xxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > Cc: Ben Widawsky <ben@xxxxxxxxxxxx>
> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
> > Reviewed-by: Sagar Kamble <sagar.a.kamble@xxxxxxxxx>
> > ---
> >   drivers/gpu/drm/i915/Kconfig         |  6 +++++
> >   drivers/gpu/drm/i915/Kconfig.profile | 26 +++++++++++++++++++
> >   drivers/gpu/drm/i915/i915_request.c  | 39 +++++++++++++++++++++++++---
> >   3 files changed, 67 insertions(+), 4 deletions(-)
> >   create mode 100644 drivers/gpu/drm/i915/Kconfig.profile
> >
> > diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> > index 5c607f2c707b..387613f29cb0 100644
> > --- a/drivers/gpu/drm/i915/Kconfig
> > +++ b/drivers/gpu/drm/i915/Kconfig
> > @@ -132,3 +132,9 @@ depends on DRM_I915
> >   	depends on EXPERT
> >   	source drivers/gpu/drm/i915/Kconfig.debug
> >   endmenu
> > +
> > +menu "drm/i915 Profile Guided Optimisation"
>
> Or something like a more generic drm/i915 Manual Tuning Options so it
> sounds less like an automated thing where you can feed the results of
> something straight in?

Hah, good call. PGO was taken right out of the gcc manual, so yeah, it's
a bit misleading.

> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>
> We could add PMU metrics to count hit/missed busy spins so the profile
> guided bit becomes more direct?

Sensible. I wonder though how much we can do already with kprobes?

Another one I think I'd like here is the DMA_LATENCY qos parameter. If
you haven't seen https://patchwork.freedesktop.org/patch/241571/ that's
worth throwing against media-bench.
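
On the hit/missed busy-spin counting suggested above, the bookkeeping itself could be as small as the sketch below. This is purely illustrative: spin_stats and its helpers are hypothetical names, nothing here is wired into the i915 PMU, and real counters would presumably need to be per-engine and safe against concurrent waiters.

```c
#include <stdint.h>

/* Hypothetical counters; not part of the patch or of the i915 PMU. */
struct spin_stats {
	uint64_t hits;   /* request completed while we were still spinning */
	uint64_t misses; /* spin expired and the waiter had to sleep */
};

/* Record the outcome of one wait. */
static void spin_stats_record(struct spin_stats *s, int completed_in_spin)
{
	if (completed_in_spin)
		s->hits++;
	else
		s->misses++;
}

/* Hit rate in percent; 0 when nothing has been recorded yet. */
static unsigned int spin_stats_hit_pct(const struct spin_stats *s)
{
	uint64_t total = s->hits + s->misses;

	return total ? (unsigned int)(s->hits * 100 / total) : 0;
}
```

A persistently low hit rate for a workload would hint that the spin is mostly wasted CPU time, whereas a high one would suggest the current setting is paying for itself.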
-Chris