Re: [PATCH 03/11] drm/i915: put runtime PM only when we actually release force_wake

Paulo Zanoni <przanoni@xxxxxxxxx> · Fri, 21 Feb 2014 17:08:50 -0300

2014-02-21 14:34 GMT-03:00 Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>:
> On Fri, 21 Feb 2014 13:52:20 -0300
> Paulo Zanoni <przanoni@xxxxxxxxx> wrote:
>
>> From: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx>
>>
>> When we call gen6_gt_force_wake_put we don't actually put force_wake,
>> we just schedule gen6_force_wake_work through mod_delayed_work, and
>> that will eventually release force_wake.
>>
>> The problem is that we call intel_runtime_pm_put directly at
>> gen6_gt_force_wake_put, so most of the times we put our runtime PM
>> reference before the delayed work happens, so we may runtime suspend
>> while force_wake is still supposed to be enabled if the graphics
>> autosuspend_delay_ms is too small.
>>
>> Now the nice thing about the current code is that after it triggers
>> the delayed work function it gets a refcount, and it only triggers the
>> delayed work function if refcount is zero. This guarantees that when
>> we schedule the funciton, it will run before we try to schedule it
>> again, which simplifies the problem and allows for the current
>> solution to work properly (hopefully!).
>>
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx>
>> ---
>>  drivers/gpu/drm/i915/intel_uncore.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index c628414..1f7226f 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -299,6 +299,8 @@ static void gen6_force_wake_work(struct work_struct *work)
>>       if (--dev_priv->uncore.forcewake_count == 0)
>>               dev_priv->uncore.funcs.force_wake_put(dev_priv, FORCEWAKE_ALL);
>>       spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
>> +
>> +     intel_runtime_pm_put(dev_priv);
>>  }
>>
>>  static void intel_uncore_forcewake_reset(struct drm_device *dev)
>> @@ -393,6 +395,7 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv, int fw_engine)
>>  void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine)
>>  {
>>       unsigned long irqflags;
>> +     bool delayed = false;
>>
>>       if (!dev_priv->uncore.funcs.force_wake_put)
>>               return;
>> @@ -405,13 +408,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine)
>>       spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>>       if (--dev_priv->uncore.forcewake_count == 0) {
>>               dev_priv->uncore.forcewake_count++;
>> +             delayed = true;
>>               mod_delayed_work(dev_priv->wq,
>>                                &dev_priv->uncore.force_wake_work,
>>                                1);
>>       }
>>       spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
>>
>> -     intel_runtime_pm_put(dev_priv);
>> +     if (!delayed)
>> +             intel_runtime_pm_put(dev_priv);
>>  }
>>
>>  /* We give fast paths for the really cool registers */
>
> Do we need this for the VLV path too?

Yeah, my patch is wrong for VLV due to that "return". I'll send a new version.

By the way, why doesn't VLV use the delayed work queue? I would assume
the work queue is there to improve performance somehow, so it could be
a good idea to use it... And maybe try to avoid special-casing VLV
would be good too :)

>
> It's a little confusing that we do this delayed thing, incrementing the
> count and then decrementing again in the work queue, but what you have
> looks correct for both cases.
>
> So with the VLV thing addressed:
>
> Reviewed-by: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
>
> --
> Jesse Barnes, Intel Open Source Technology Center

-- 
Paulo Zanoni
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx