On Tue, Aug 27, 2024 at 06:52:13PM +0200, Daniel Vetter wrote:
> On Thu, Aug 22, 2024 at 03:19:29PM +0200, Christian König wrote:
> > Completely agree that this is complicated, but I still don't see the need
> > for it.
> >
> > Drivers just need to use pm_runtime_get_if_in_use() inside the shrinker and
> > postpone all hw activity until resume.
>
> Not good enough, at least long term I think. Also postponing hw activity
> to resume doesn't solve the deadlock issue, if you still need to grab ttm
> locks on resume.

Pondered this specific aspect some more, and I think you still have a race
here (even if you avoid the deadlock): If the conditional rpm_get call
fails, there's no guarantee that the device will suspend/resume and clean
up the GART mapping. The race gets a bit smaller if you use
pm_runtime_get_if_active(), but even then you might catch it right when
resume has almost finished.

That means we'll have ttm bos hanging around with GART allocations/mappings
which aren't actually valid anymore (since they might escape the cleanup
upon resume due to the race). That doesn't feel like a solid design
either.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
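
To make the two race windows described above concrete, here is a minimal userspace model in C. It is only a sketch under stated assumptions, not kernel code: every model_* name is hypothetical, the two conditional getters merely imitate the documented behaviour of pm_runtime_get_if_in_use() (device active and usage count nonzero) and pm_runtime_get_if_active() (device active, count ignored), and the interleavings are hand-scripted rather than produced by real concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of the model_* names are kernel APIs. */
enum model_rpm_status { MODEL_SUSPENDED, MODEL_RESUMING, MODEL_ACTIVE };

struct model_dev {
	enum model_rpm_status status;
	int usage_count;
	bool bo_gart_mapped;   /* bo bookkeeping still holds a GART mapping */
	bool cleanup_deferred; /* shrinker left the unmap for resume to do */
};

/* Imitates pm_runtime_get_if_in_use(): needs ACTIVE and a nonzero count. */
static bool model_get_if_in_use(struct model_dev *d)
{
	if (d->status != MODEL_ACTIVE || d->usage_count == 0)
		return false;
	d->usage_count++;
	return true;
}

/* Imitates pm_runtime_get_if_active(): needs ACTIVE, ignores the count. */
static bool model_get_if_active(struct model_dev *d)
{
	if (d->status != MODEL_ACTIVE)
		return false;
	d->usage_count++;
	return true;
}

static void model_put(struct model_dev *d)
{
	assert(d->usage_count > 0);
	d->usage_count--;
}

/* Shrinker path: unmap now if a reference was obtained, otherwise defer. */
static void model_shrink(struct model_dev *d,
			 bool (*cond_get)(struct model_dev *))
{
	if (cond_get(d)) {
		d->bo_gart_mapped = false;
		model_put(d);
	} else {
		d->cleanup_deferred = true;
	}
}

/* The step of resume that processes cleanups deferred so far. */
static void model_resume_cleanup(struct model_dev *d)
{
	if (d->cleanup_deferred) {
		d->bo_gart_mapped = false;
		d->cleanup_deferred = false;
	}
}

/* Window 1: device is active but idle (count 0). It is then used again and
 * never suspends, so no resume ever runs the deferred cleanup. */
bool model_race_in_use(void)
{
	struct model_dev d = { .status = MODEL_ACTIVE, .usage_count = 0,
			       .bo_gart_mapped = true };

	model_shrink(&d, model_get_if_in_use); /* fails: count is 0 */
	d.usage_count++;                       /* another user keeps it awake */
	return d.bo_gart_mapped;               /* true: mapping left behind */
}

/* Window 2: the shrinker runs after resume's cleanup step but before the
 * status flips to ACTIVE, so the deferral is never picked up. */
bool model_race_if_active(void)
{
	struct model_dev d = { .status = MODEL_SUSPENDED, .usage_count = 0,
			       .bo_gart_mapped = true };

	d.status = MODEL_RESUMING;
	model_resume_cleanup(&d);              /* nothing deferred yet */
	model_shrink(&d, model_get_if_active); /* fails: still RESUMING */
	d.status = MODEL_ACTIVE;               /* resume completes */
	return d.bo_gart_mapped;               /* true: mapping left behind */
}

/* Control: device is in use, so the shrinker unmaps immediately. */
bool model_no_race(void)
{
	struct model_dev d = { .status = MODEL_ACTIVE, .usage_count = 1,
			       .bo_gart_mapped = true };

	model_shrink(&d, model_get_if_in_use);
	return d.bo_gart_mapped;               /* false: unmapped right away */
}
```

Each function returns whether the bo still carries its GART mapping at the end of the scripted sequence. In both race cases the deferral flag is set but nothing is guaranteed to act on it, which is the "no guarantee that the device will suspend/resume" point; the control case shows the path that works when the conditional get succeeds.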