Re: [PATCH 4/7] drm/ttm: move LRU walk defines into new internal header

Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> · Wed, 21 Aug 2024 11:31:53 +0200

On Wed, 2024-08-21 at 10:57 +0200, Thomas Hellström wrote:
> On Wed, 2024-08-21 at 10:14 +0200, Christian König wrote:
> > Am 20.08.24 um 18:00 schrieb Thomas Hellström:
> > > > Or why exactly should shrinking fail?
> > > A common example would be not having runtime pm and the
> > > particular
> > > bo
> > > needs it to unbind, we want to try the next bo. Example: i915
> > > GGTT
> > > bound bos and Lunar Lake PL_TT bos.
> > 
> > WHAT? So you basically block shrinking BOs because you can't unbind
> > them 
> > because the device is powered down?
> > 
> > I would say that this is a serious NO-GO. It basically means that 
> > powered down devices can lock down system memory for undefined
> > amount
> > of 
> > time.
> > 
> > In other words an application can allocate memory, map it into GGTT
> > and 
> > then suspend or even get killed and we are not able to recover the 
> > memory because there is no activity on the GPU any more?
> > 
> > That really sounds like a bug in the driver design to me.
> 
> It's bad but it's not as bad as it sounds.
> 
> Problem is we can't wake up during direct reclaim IIRC due to runtime
> pm lockdep violations, but we can and do fire up a thread to wake the
> device and after the wakeup delay have subsequent shrink calls
> succeed,
> or punt to kswapd or the oom handler.
> I think that's an orthogonal discussion, though. There are other
> reasons shrinking might fail, like the bo being busy in direct
> reclaim
> (shouldn't wait for idle there but ok in kswapd), Other points of
> failure is ofc shmem radix tree allocations (not seen one yet,
> though)
> which might succeed with a smaller bo.
> (Not saying, though, that there isn't more to be done with the xe
> runtime pm implementation).
> 
> > 
> > > And again, all other drm bo shrinkers do this. We just want to do
> > > the
> > > same.
> > 
> > Do you have pointers?
> 
> As Sima said, this is complicated but not beyond comprehension: i915
> https://elixir.bootlin.com/linux/v6.11-rc4/source/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c#L317
> 
> msm:
> https://elixir.bootlin.com/linux/v6.11-rc4/source/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c#L317
> which uses
> https://elixir.bootlin.com/linux/v6.11-rc4/source/drivers/gpu/drm/drm_gem.c#L1426
> that is very similar in structure to what I implemented for TTM.
> 
> Panfrost: (although only purgeable objects AFAICT).
> https://elixir.bootlin.com/linux/v6.11-rc4/source/drivers/gpu/drm/drm_gem.c#L1426
> 
> 
> > 
> > > > > If we bump LRU we could end up with infinite loops.
> > > > > So IMO we need to be able to loop. I don't really care wether
> > > > > we do
> > > > > this as an explicit loop or whether we use the LRU walker,
> > > > > but
> > > > > I
> > > > > think
> > > > > from a maintainability point-of-view it is better to keep LRU
> > > > > walking
> > > > > in a single place.
> > > > > 
> > > > > If we return an unlocked object, we'd need to refcount and
> > > > > drop
> > > > > the
> > > > > lru
> > > > > lock, but maybe that's not a bad thing.
> > > > > 
> > > > > But what's the main drawback of exporting the existing
> > > > > helper.
> > > > Well that we re-creates exactly the mid-layer mess I worked so
> > > > hard
> > > > to
> > > > remove from TTM.
> > > It doesn't IMO. I agree the first attempt did. This affects only
> > > the
> > > LRU iteration itself and I'm even fine to get rid of the callback
> > > using
> > > a for_ macro.
> > 
> > Well, I mean using a for_each approach is objectively better than
> > having 
> > a callback and a state bag.
> > 
> > But the fundamental question is if drivers are allowed to reject 
> > shrinking. And I think the answer is no, they need to be designed
> > in
> > a 
> > way where shrinking is always possible.
> 
> Rejects can be out of our control, due to anticipated deadlocks, oom
> and deferring to kswapd.
> 
> > 
> > What can be that we can't get the necessary locks to evict and
> > object
> > (because it's about to be used etc...), but that are the per-
> > requisites 
> > TTM should be checking.
> > 
> > > > > In any case, I don't think TTM should enforce a different way
> > > > > of
> > > > > shrinking by the means of a severely restricted helper?
> > > > Well, as far as I can see that is exactly what TTM should do.
> > > > 
> > > > I mean the main advantage to make a common component is to
> > > > enforce
> > > > correct behavior.
> > > But if all other drivers don't agree this as correct behavior and
> > > instead want to keep behavior that is proven to work, that's a
> > > dead
> > > end.
> > 
> > Well no, even if all drivers agree to (for example) drop security 
> > precautions it's still not something acceptable.
> > 
> > And same thing here, if we block shrinking because drivers think
> > they
> > want their runtime PM implemented in a certain way then upstream
> > needs 
> > to block this and push back.
> > 
> > As far as I can see it's mandatory to have shrinkers not depend on 
> > runtime PM, cause otherwise you run into resources handling which 
> > depends on the well behavior of userspace and that in turn in
> > something 
> > we can't allow.
> 
> Please see the above explanation for runtime pm, and for the record I
> agree that enforcing disallowed or security violations is a
> completely
> valid thing.

Meant to say enforcing disallowing security violations..

Another thing that came to my mind is ofc swap_space. Depending on how
backup is done if we're out of swap space we can only shrink purgeable
WONTNEED objects (user-space pools typically), and TTM has no knowledge
of those (currently at least).

/Thomas

> 
> /Thomas
> 
> > 
> > Regards,
> > Christian.
> > 
> > > 
> > > /Thomas
> > > 
> > > 
> > > > Regards,
> > > > Christian.
> > > > 
> > > > > /Thomas
> > > > > 
>