On Wed, 2024-08-28 at 17:25 +0200, Christian König wrote: > Am 28.08.24 um 16:05 schrieb Thomas Hellström: > > On Wed, 2024-08-28 at 14:20 +0200, Christian König wrote: > > > Am 27.08.24 um 19:53 schrieb Daniel Vetter: > > > > On Tue, Aug 27, 2024 at 06:52:13PM +0200, Daniel Vetter wrote: > > > > > On Thu, Aug 22, 2024 at 03:19:29PM +0200, Christian König > > > > > wrote: > > > > > > Completely agree that this is complicated, but I still > > > > > > don't > > > > > > see the need > > > > > > for it. > > > > > > > > > > > > Drivers just need to use pm_runtime_get_if_in_use() inside > > > > > > the > > > > > > shrinker and > > > > > > postpone all hw activity until resume. > > > > > Not good enough, at least long term I think. Also postponing > > > > > hw > > > > > activity > > > > > to resume doesn't solve the deadlock issue, if you still need > > > > > to > > > > > grab ttm > > > > > locks on resume. > > > > Pondered this specific aspect some more, and I think you still > > > > have > > > > a race > > > > here (even if you avoid the deadlock): If the condiditional > > > > rpm_get > > > > call > > > > fails there's no guarantee that the device will suspend/resume > > > > and > > > > clean > > > > up the GART mapping. > > > Well I think we have a major disconnect here. When the device is > > > powered > > > down there is no GART mapping to clean up any more. > > > > > > In other words GART is a table in local memory (VRAM) when the > > > device > > > is > > > powered down this table is completely destroyed. Any BO which was > > > mapped > > > inside this table is now not mapped any more. > > > > > > So when the shrinker wants to evict a BO which is marked as > > > mapped to > > > GART and the device is powered down we just skip the GART > > > unmapping > > > part > > > because that has already implicitly happened during power down. > > > > > > Before mapping any BO into the GART again we power the GPU up > > > through > > > the runtime PM calls. And while powering it up again the GART is > > > restored. > > I think you're forgetting the main Xe use-case of Lunar-lake > > compression metadata. I'ts retained by the device during D3hot, but > > cannot, at that time, be accessed for shrinking. > > Yeah, that is really something we don't have an equivalent for on AMD > GPUs. > > When the ASIC is powered down VRAM is basically dead as well because > it > won't get refreshed any more. > > > And copying it all out "Just in case" when transitioning to D3hot > > just > > isn't a viable solution. > > I would say that this is solvable with a hierarchy of power > management > functionality. > > E.g. the runtime PM interface works the same for you as it does for > amdgpu with evicting TTM BOs etc.... > > Then separate from runtime PM you have a reference count for the > accessibility of compressed metadata. And while shrinking you only > resume this specific part. Well this looks like exactly what have with current hardware, or rather *current* hardware that needs to take the bo lock during runtime pm doesn't need this type of metadata access, which is currently Lunar- lake only. So from a runtime pm deadlock point-of-view we're safe. But then we need to blit out the metadata and we get a fence to wait for, which we've been told to avoid in direct reclaim but rather is preferred from kswapd. And I must admit I still don't get what the problem is with allowing a driver LRU loop? The fact that some AMD eviction_valuable() implementations made bad decisions by far doesn't weigh up for the loss of flexibility IMO. /Thomas > > Christian. > > > > > /Thomas > > >