Hi, Christian,

Ping? Can I get an ack to proceed with this?

Thanks,
Thomas

On Wed, 2024-09-18 at 14:57 +0200, Thomas Hellström wrote:
> Sima, Christian,
> 
> I've updated the shrinker series now with a guarded for_each macro
> instead:
> 
> https://patchwork.freedesktop.org/patch/614514/?series=131815&rev=9
> 
> (Note I forgot to remove the export of the previous LRU walker).
> 
> So the midlayer argument is no longer an issue. The cleanup.h guard
> provides some additional protection against drivers exiting the LRU
> loop early.
> 
> So what remains is the question of whether the driver is allowed to
> discard a bo that TTM suggests for shrinking.
> 
> Arguments for:
> 
> 1) Not allowing that would require teaching TTM about purgeable
> objects.
> 2) Devices that need the blitter during shrinking would want to punt
> runtime_pm_get() to kswapd to avoid sleeping in direct reclaim.
> 3) If those devices end up blitting (LNL) to be able to shrink, they
> would want to punt waiting for the fence to signal to kswapd, to
> avoid waiting in direct reclaim.
> 4) It looks like we need to resort to folio_trylock in the shmem
> backup backend when shrinking is called with gfp_t = GFP_NOFS. A
> failing trylock will require a new bo.
> 
> Arguments against:
> None really. I thought the idea of demidlayering was to allow the
> driver more freedom.
> 
> So any feedback is appreciated. If that is found acceptable we can
> proceed with reviewing this patch and also with the shrinker series.
> 
> Thanks,
> Thomas
> 
> 
> On Mon, 2024-09-02 at 13:07 +0200, Daniel Vetter wrote:
> > On Wed, Aug 28, 2024 at 02:20:34PM +0200, Christian König wrote:
> > > On 27.08.24 at 19:53, Daniel Vetter wrote:
> > > > On Tue, Aug 27, 2024 at 06:52:13PM +0200, Daniel Vetter wrote:
> > > > > On Thu, Aug 22, 2024 at 03:19:29PM +0200, Christian König wrote:
> > > > > > Completely agree that this is complicated, but I still don't
> > > > > > see the need for it.
> > > > > > 
> > > > > > Drivers just need to use pm_runtime_get_if_in_use() inside
> > > > > > the shrinker and postpone all hw activity until resume.
> > > > > 
> > > > > Not good enough, at least long term I think. Also, postponing
> > > > > hw activity to resume doesn't solve the deadlock issue if you
> > > > > still need to grab ttm locks on resume.
> > > > 
> > > > Pondered this specific aspect some more, and I think you still
> > > > have a race here (even if you avoid the deadlock): if the
> > > > conditional rpm_get call fails, there's no guarantee that the
> > > > device will suspend/resume and clean up the GART mapping.
> > > 
> > > Well, I think we have a major disconnect here. When the device is
> > > powered down there is no GART mapping to clean up any more.
> > > 
> > > In other words, the GART is a table in local memory (VRAM); when
> > > the device is powered down this table is completely destroyed. Any
> > > BO which was mapped inside this table is now not mapped any more.
> > > 
> > > So when the shrinker wants to evict a BO which is marked as mapped
> > > to GART and the device is powered down, we just skip the GART
> > > unmapping part because that has already implicitly happened during
> > > power down.
> > > 
> > > Before mapping any BO into the GART again we power the GPU up
> > > through the runtime PM calls. And while powering it up again the
> > > GART is restored.
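
For illustration, here is a minimal sketch of the conditional runtime-PM
pattern described above. Only pm_runtime_get_if_in_use() and
pm_runtime_put_autosuspend() are real kernel helpers; struct my_dev,
struct my_bo, my_bo_unbind_gart() and my_bo_release_pages() are
hypothetical stand-ins, not code from the series or from any driver:

#include <linux/pm_runtime.h>

struct my_dev {
        struct device *dev;     /* device whose runtime-PM state we piggyback on */
};

struct my_bo {
        struct my_dev *mdev;
        bool gart_bound;        /* sw bookkeeping: bo currently has a GART mapping */
};

/* Hypothetical driver helpers, assumed to exist elsewhere. */
void my_bo_unbind_gart(struct my_bo *bo);
long my_bo_release_pages(struct my_bo *bo);

static long my_shrink_one_bo(struct my_bo *bo)
{
        struct my_dev *mdev = bo->mdev;
        /*
         * Never wake the device from reclaim; only take the rpm
         * reference if the device is already powered and in use.
         */
        int active = pm_runtime_get_if_in_use(mdev->dev);

        if (active > 0) {
                /*
                 * Device is powered, so the GART entry is live and has
                 * to be torn down before the pages can go away.
                 */
                if (bo->gart_bound)
                        my_bo_unbind_gart(bo);
                pm_runtime_put_autosuspend(mdev->dev);
        }
        /*
         * If active <= 0 the device is suspended or about to suspend;
         * per the argument above, the GART table in VRAM is destroyed
         * on power down, so the unmap is skipped and the bo is only
         * marked unbound in software.
         */
        bo->gart_bound = false;

        return my_bo_release_pages(bo);
}

The unmap that gets skipped when the conditional get fails is exactly
the step whose correctness is debated below.
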
> > 
> > My point is that you can't tell whether the device will power down
> > or not, you can only tell whether there's a chance it might be
> > powering down, and so you can't get at the rpm reference without
> > deadlock issues.
> > 
> > > > The race gets a bit smaller if you use
> > > > pm_runtime_get_if_active(), but even then you might catch it
> > > > right when resume has almost finished.
> > > 
> > > What race are you talking about?
> > > 
> > > The worst thing which could happen is that we restore a GART entry
> > > which isn't needed any more, but that is pretty much irrelevant
> > > since we only clear them to avoid some hw bugs.
> > 
> > The race I'm seeing is where you thought the GART entry is not an
> > issue and tossed an object, but the device didn't suspend, so it
> > might still use it.
> > 
> > I guess if we're clearly separating the sw allocation of the TTM_TT
> > from the physical entries in the GART, that should all work, but it
> > feels a bit tricky. The race I've seen is essentially these two
> > getting out of sync.
> > 
> > So maybe it was me who's stuck.
> > 
> > What I wonder is whether it works in practice, since on the restore
> > side you need to take some locks to figure out which gart mappings
> > exist and need restoring. And those are the same locks the shrinker
> > needs to figure out whether it might need to reap a gart mapping.
> > 
> > Or do you just copy the gart entries over and restore them exactly
> > as-is, so that there are no shared locks?
> > 
> > > > That means we'll have ttm bos hanging around with GART
> > > > allocations/mappings which aren't actually valid anymore (since
> > > > they might escape the cleanup upon resume due to the race). That
> > > > doesn't feel like a solid design either.
> > > 
> > > I'm most likely missing something, but I'm really scratching my
> > > head over where you see a problem here.
> > 
> > I guess one issue is that, at least traditionally, igfx drivers have
> > nested runtime pm within the dma_resv lock, and dgpu drivers the
> > other way round. Which is a bit awkward if you're trying for common
> > code.
> > 
> > Cheers, Sima
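
As a footnote to that last point, a rough sketch of the two lock-nesting
orders mentioned: only dma_resv_lock()/dma_resv_unlock() and the
pm_runtime_* calls are real kernel API; the function names and the
placeholder hardware access are hypothetical:

#include <linux/dma-resv.h>
#include <linux/pm_runtime.h>

/*
 * igfx-style ordering: the runtime-PM reference is taken while the
 * reservation lock is already held.
 */
static int bind_igfx_style(struct device *dev, struct dma_resv *resv)
{
        int ret = dma_resv_lock(resv, NULL);

        if (ret)
                return ret;
        pm_runtime_get_sync(dev);
        /* ... touch the hardware, e.g. write GART entries ... */
        pm_runtime_put(dev);
        dma_resv_unlock(resv);
        return 0;
}

/*
 * dgpu-style ordering: the runtime-PM reference is taken first and the
 * reservation lock only afterwards.
 */
static int bind_dgpu_style(struct device *dev, struct dma_resv *resv)
{
        int ret;

        pm_runtime_get_sync(dev);
        ret = dma_resv_lock(resv, NULL);
        if (!ret) {
                /* ... touch the hardware ... */
                dma_resv_unlock(resv);
        }
        pm_runtime_put(dev);
        return ret;
}

Common code has to pick one of the two orders, which inverts the
established ordering of the other driver family; that is the
awkwardness referred to above.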