Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2017-11-30 at 13:19 +0200, Imre Deak wrote:
> On Thu, Nov 30, 2017 at 09:45:25AM +0000, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
> > > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > > 
> > > It seems that the DMC likes to transition between the DC states a lot when
> > > there are no connected displays (no active power domains) during command
> > > submission.
> > > 
> > > This activity on DC states has a negative impact on the performance of the
> > > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > > eight.
> > > 
> > > Work around it by introducing a new power domain named,
> > > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > > held for the duration of command submission activity.
> > > 
> > > v2:
> > >  * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > >  * Protect macro body with braces. (Jani Nikula)
> > > 
> > > v3:
> > >  * Add dedicated power domain for clarity. (Chris, Imre)
> > >  * Commit message and comment text updates.
> > >  * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > >    firmware release.
> > > 
> > > v4:
> > >  * Power domain should be inner to device runtime pm. (Chris)
> > >  * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> > >  * Handle async DMC loading by moving the GT_IRQ power domain logic into
> > >    intel_runtime_pm. (Daniel, Chris)
> > >  * Include small core GEN9 as well. (Imre)
> > > 
> > > v5
> > >  * Special handling for async DMC load is not needed since on failure the
> > >    power domain reference is kept permanently taken. (Imre)
> > > 
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > > Testcase: igt/gem_exec_nop/headless
> > > Cc: Imre Deak <imre.deak@xxxxxxxxx>
> > > Acked-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> (v2)
> > > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@xxxxxxxxx>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h         |  4 ++++
> > >  drivers/gpu/drm/i915/i915_gem.c         |  3 +++
> > >  drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> > >  drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
> > >  4 files changed, 35 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index bddd65839f60..59cf11dd5d3b 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > >         POWER_DOMAIN_AUX_D,
> > >         POWER_DOMAIN_GMBUS,
> > >         POWER_DOMAIN_MODESET,
> > > +       POWER_DOMAIN_GT_IRQ,
> > >         POWER_DOMAIN_INIT,
> > >  
> > >         POWER_DOMAIN_NUM,
> > > @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
> > >  #define GT_FREQUENCY_MULTIPLIER 50
> > >  #define GEN9_FREQ_SCALER 3
> > >  
> > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > +       (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
> 
> Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
> For all other platforms the GT_IRQ domain won't be mapped making
> display_power_get/put on those just a domain ref inc/dec, otherwise a
> nop.

Folks, is my understanding correct that once SKL DMC will be merged we
just remove !IS_SKYLAKE from the above check to get the performance fix
for SKL and nothing more changes in the patch? I am asking because there
is a need to have SKL patch on the plate already to verify the fix
sooner rather than later. From this perspective, can we have one more
patch in the series dedicated to fix SKL right now? After all, SKL DMC
is available in drm-firmware already, those who need can try it.

> 
> > > +
> > >  #include "i915_trace.h"
> > >  
> > >  static inline bool intel_vtd_active(void)
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 354b0546a191..c6067cba1dca 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
> > >  
> > >         if (INTEL_GEN(dev_priv) >= 6)
> > >                 gen6_rps_idle(dev_priv);
> > > +
> > > +       intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> > > +
> > >         intel_runtime_pm_put(dev_priv);
> > >  out_unlock:
> > >         mutex_unlock(&dev_priv->drm.struct_mutex);
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > > index a90bdd26571f..c28a4ceb016d 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > > @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> > >         GEM_BUG_ON(!i915->gt.active_requests);
> > >  
> > >         intel_runtime_pm_get_noresume(i915);
> > > +
> > > +       /*
> > > +        * It seems that the DMC likes to transition between the DC states a lot
> > > +        * when there are no connected displays (no active power domains) during
> > > +        * command submission.
> > > +        *
> > > +        * This activity has negative impact on the performance of the chip with
> > > +        * huge latencies observed in the interrupt handler and elsewhere.
> > > +        *
> > > +        * Work around it by grabbing a GT IRQ power domain whilst there is any
> > > +        * GT activity, preventing any DC state transitions.
> > > +        */
> > > +       intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> > > +
> > >         i915->gt.awake = true;
> > >  
> > >         intel_enable_gt_powersave(i915);
> > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > index 8315499452dc..48ad0828ace7 100644
> > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> > >                 return "INIT";
> > >         case POWER_DOMAIN_MODESET:
> > >                 return "MODESET";
> > > +       case POWER_DOMAIN_GT_IRQ:
> > > +               return "GT_IRQ";
> > >         default:
> > >                 MISSING_CASE(domain);
> > >                 return "?";
> > > @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
> > >  {
> > >         struct i915_power_domains *power_domains = &dev_priv->power_domains;
> > >  
> > > +       if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> > > +               return;
> > 
> > I was hoping that in the magic of the powerdomains, this would just
> > evaporate (or at least find its own place within the powerwell enabling).
> 
> Yep, that would be "normally" the case. But since we have only a single
> set of domains->powerwells mapping for all GEN9_BC we can't isolate SKL
> that way w/o duplicating the mapping for SKL. Although all GEN9_BC are
> the same from power well handling POV for some reason there are still
> separate DMC firmwares for KBL/CFL and SKL. The corruption problem is
> only fixed in the KBL/CFL firmware the SKL one still to be added to
> linux-firmware.git .. At that point we can also remove these checks.
> 
> > 
> > If Imre is happy with plonking this here,
> > Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > 
> > Or at least good enough for now and can be improved upon?
> 
> Yes, looks ok and can adjust for SKL later.
> 
> > -Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux