On Thu, Aug 29, 2024 at 10:12:53PM +0000, Matthew Brost wrote: > On Thu, Aug 29, 2024 at 01:02:54PM +0200, Daniel Vetter wrote: > > On Thu, Aug 29, 2024 at 11:53:58AM +0200, Thomas Hellström wrote: > > > But as Sima pointed out in private communication, exhaustive eviction > > > is not really needed for faulting to make (crawling) progress. > > > Watermarks and VRAM trylock shrinking should suffice, since we're > > > strictly only required to service a single gpu page granule at a time. > > > > > > However, ordinary bo-based jobs would still like to be able to > > > completely evict SVM vram. Whether that is important enough to strive > > > for is ofc up for discussion. > > > > My take is that you don't win anything for exhaustive eviction by having > > the dma_resv somewhere in there for svm allocations. Roughly for split lru > > world, where svm ignores bo/dma_resv: > > > > When evicting vram from the ttm side we'll fairly switch between selecting > > bo and throwing out svm pages. With drm_exec/ww_acquire_ctx selecting bo > > will eventually succeed in vacuuming up everything (with a few retries > > perhaps, if we're not yet at the head of the ww ticket queue). > > > > svm pages we need to try to evict anyway - there's no guarantee, becaue > > the core mm might be holding temporary page references (which block > > Yea, but think you can could kill the app then - not suggesting we > should but could. To me this is akin to a CPU fault and not being able > to migrate the device pages - the migration layer doc says when this > happens kick this to user space and segfault the app. > > My last patch in the series adds some asserts to see if this ever > happens, thus far never. If it occurs we could gracefully handle it by > aborting the migration I guess... I think the user really needs to > something a bit crazy to trigger this condition - I don't think the core > randomly grabs refs to device pages but could be wrong. > > > migration) or have the page locked (which also block the migration). But > > as long as those two steps succeed, we'll win and get the pages. There > > might be some thrashing against concurrent svm faults stealing them again, > > but they have a disadvantage since they can't steal dma_resv_locked bo. > > And if it's still too much we can stall them in the page allocator. > > > > So it's not entirely reliable, but should be close enough. > > > > Now for bo based svm the picture isn't any different, because holding > > dma_resv is not actually enough to migrate svm mappings. We still need to > > hope there's no temporary page references around, and we still need to > > succeed at locking the page. And the migration code only does trylocks, > > because that's it's deadlock prevent algorithm if different migrations > > needing the same set of pages, but acquiring them in a different order. So > > we win nothing. > > Ok, maybe my statement above is false... > > Wouldn't be the only time this falls is if another migration is in > flight (e.g. CPU fault) and they race? Then the eviction will naturally > happen via refcount being dropped from the other migration. I guess I > likely need to update my eviction code to not free the TTM resource if > all pages are not migrated. > Also if we add something like AMD's range->migration_mutex pretty sure this race goes away and we left with 'the user has done something not smart, and could convinciblely kill the app'. Matt > > > > Worse, if dma_resv does actually hold up svm migration and reclaim, then > > we potentially deadlock because that lock is for a bigger range than > > individual pages (or folios). And the core mm assumes that it can get out > > of a deadlock bind by (at least stochastically) eventually succeeding in > > acquiring/locking down a single page. > > > > This means we cannot use dma_resv tricks to give the ttm world an > > advantage in exhaustive eviction against concurrent svm faults. Or at > > least not more than we can do without by just stalling svm faults that > > need to allocate gpu memory (but that must happen without holding locks or > > we're busted). > > > > I'm a little lost here on the deadlock case. Do you mean when we try to > evict SVM BO we trigger reclaim by allocating system pages and can > deadlock? Doesn't TTM already have this dependency when evicting non-SVM > BOs? > > > So the only benefit I'm seeing is the unified lru, which I'm not sure is > > worth it. There's also a bit a lru design tension here, because for the bo > > Well also not rewriting the world... > > Matt > > > world we want objects that are locked to stay on the lru, so that the > > competing processes can figure out who has the winning ww ticket. The core > > mm design otoh does isolate pages and remove them from the lru when > > they're acquired, so that they don't gunk up other processes from trying > > to make forward progress and are better hidden. Which reduces temporary > > page references (from lru walk) preventing migration and stuff like that. > > > > Cheers, Sima > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch