Re: [PATCH 1/3] drm/i915: Allow kswapd to pause the device whilst reaping

Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> · Fri, 02 Jun 2017 15:38:34 +0300

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> Quoting Mika Kuoppala (2017-06-02 13:02:57)
>> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>> 
>> > In commit 5763ff04dc4e ("drm/i915: Avoid GPU stalls from kswapd") we
>> > stopped direct reclaim and kswapd from triggering GPU/client stalls
>> > whilst running (by restricting the objects they could reap to be idle).
>> >
>> > However with abusive GPU usage, it becomes quite easy to starve kswapd
>> > of memory and prevent it from making forward progress towards obtaining
>> > enough free memory (thus driving the system closer to swap exhaustion).
>> > Relax the previous restriction to allow kswapd (but not direct reclaim)
>> > to stall the device whilst reaping purgeable pages.
>> >
>> > v2: Also acquire the rpm wakelock to allow kswapd to unbind buffers.
>> >
>> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem_shrinker.c | 9 +++++++++
>> >  1 file changed, 9 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
>> > index 0fd2b58ce475..58f27369183c 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
>> > @@ -332,6 +332,15 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
>> >                                        sc->nr_to_scan - freed,
>> >                                        I915_SHRINK_BOUND |
>> >                                        I915_SHRINK_UNBOUND);
>> > +     if (freed < sc->nr_to_scan && current_is_kswapd()) {
>> > +             intel_runtime_pm_get(dev_priv);
>> 
>> We take an extra ref to force the device awake and thus force bound objects out?
>
> Yes. The shrinker skips the unbind phase if it can't acquire the device
> wakeref, so we ensure we enter the shrinker with it held.
>  
>> > +             freed += i915_gem_shrink(dev_priv,
>> > +                                      sc->nr_to_scan - freed,
>> > +                                      I915_SHRINK_ACTIVE |
>> > +                                      I915_SHRINK_BOUND |
>> > +                                      I915_SHRINK_UNBOUND);
>> 
>> Looking at the shrink code, I am wondering how the stall will happen.
>> 
>> There are other call paths that force the GPU idle before kicking out
>> objects, but on this call path it seems that we kick out
>> objects that might still be in use by the GPU.
>
> By unbinding an active object, we stall.

I see that now.
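
The two-pass behaviour discussed above can be modelled in userspace C.
This is only a rough sketch, not the driver code: toy_obj, toy_shrink,
toy_scan, toy_stalls and the SHRINK_* values are hypothetical stand-ins,
the runtime-pm wakeref is not modelled, and a "stall" is reduced to a
counter that is bumped whenever an active object is unbound.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the I915_SHRINK_* flags in the patch. */
enum {
	SHRINK_UNBOUND = 1 << 0,
	SHRINK_BOUND   = 1 << 1,
	SHRINK_ACTIVE  = 1 << 2,
};

/* Toy object: just enough state to show which pass may reap it. */
struct toy_obj {
	bool bound;		/* has a GPU binding */
	bool active;		/* still in use by the GPU */
	bool freed;		/* pages already released */
	unsigned long pages;
};

/* Counts how often we had to wait for the GPU (assumption: one per object). */
static unsigned long toy_stalls;

/* Reap objects permitted by flags until target pages have been freed. */
static unsigned long toy_shrink(struct toy_obj *objs, size_t count,
				unsigned long target, unsigned int flags)
{
	unsigned long freed = 0;
	size_t i;

	for (i = 0; i < count && freed < target; i++) {
		struct toy_obj *obj = &objs[i];

		if (obj->freed)
			continue;
		if (obj->active && !(flags & SHRINK_ACTIVE))
			continue;	/* idle-only pass skips busy objects */
		if (obj->bound && !(flags & SHRINK_BOUND))
			continue;
		if (!obj->bound && !(flags & SHRINK_UNBOUND))
			continue;

		if (obj->active) {
			/* Unbinding an active object is where the stall occurs. */
			toy_stalls++;
			obj->active = false;
		}
		obj->bound = false;
		obj->freed = true;
		freed += obj->pages;
	}

	return freed;
}

/* Mirrors the shape of the patched scan: idle pass first, then kswapd only. */
static unsigned long toy_scan(struct toy_obj *objs, size_t count,
			      unsigned long nr_to_scan, bool is_kswapd)
{
	unsigned long freed;

	freed = toy_shrink(objs, count, nr_to_scan,
			   SHRINK_BOUND | SHRINK_UNBOUND);
	if (freed < nr_to_scan && is_kswapd)
		freed += toy_shrink(objs, count, nr_to_scan - freed,
				    SHRINK_ACTIVE |
				    SHRINK_BOUND |
				    SHRINK_UNBOUND);

	return freed;
}
```

In this model a direct-reclaim caller (is_kswapd == false) never touches
an active object and so never bumps toy_stalls; only the kswapd path
falls through to the second pass when the first one comes up short.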

Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>

> -Chris