On Thu, Aug 29, 2024 at 04:49:15PM +0000, Matthew Brost wrote:
> On Wed, Aug 28, 2024 at 08:50:02PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 27, 2024 at 07:48:38PM -0700, Matthew Brost wrote:
> > > +int drm_gpusvm_migrate_to_sram(struct drm_gpusvm *gpusvm,
> > > +			       struct drm_gpusvm_range *range,
> > > +			       const struct drm_gpusvm_ctx *ctx)
> > > +{
> > > +	u64 start = range->va.start, end = range->va.end;
> > > +	struct mm_struct *mm = gpusvm->mm;
> > > +	struct vm_area_struct *vas;
> > > +	int err;
> > > +	bool retry = false;
> > > +
> > > +	if (!ctx->mmap_locked) {
> > > +		if (!mmget_not_zero(mm)) {
> > > +			err = -EFAULT;
> > > +			goto err_out;
> > > +		}
> > > +		if (ctx->trylock_mmap) {
> > > +			if (!mmap_read_trylock(mm)) {
> > > +				err = drm_gpusvm_evict_to_sram(gpusvm, range);
> > > +				goto err_mmput;
> > > +			}
> > > +		} else {
> > > +			mmap_read_lock(mm);
> > > +		}
> > > +	}
> > > +
> > > +	mmap_assert_locked(mm);
> > > +
> > > +	/*
> > > +	 * Loop required to find all VMA area structs for the corner case when
> > > +	 * VRAM backing has been partially unmapped from MM's address space.
> > > +	 */
> > > +again:
> > > +	vas = find_vma(mm, start);
> >
> > So a hilarious case that amdkfd gets a bit better, but still not
> > entirely, is that the original vma might be entirely gone, even when
> > you can still get at the mm of that process. This happens with cow (or
> > shared too I think) mappings in forked child processes, or also if you
> > play fun mremap games.
> >
> > I think that outside of the ->migrate_to_ram callback,
> > migration/eviction to sram cannot assume there's any reasonable vma
> > around and has to unconditionally go with the drm_gpusvm_evict_to_sram
> > path.
>
> See my response here [1]. Let me drop the whole trylock thing and
> convert to an 'evict' flag which calls drm_gpusvm_evict_to_sram in
> places where Xe needs to evict VRAM. Or maybe just export that function
> and call it directly.
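To make that concrete, something like the below is how I'd read that proposal (untested sketch only: the `evict` flag in drm_gpusvm_ctx is made up here for illustration, drm_gpusvm_evict_to_sram() and the locking calls are from the quoted patch; the vma walk is elided):

```c
int drm_gpusvm_migrate_to_sram(struct drm_gpusvm *gpusvm,
			       struct drm_gpusvm_range *range,
			       const struct drm_gpusvm_ctx *ctx)
{
	struct mm_struct *mm = gpusvm->mm;
	int err;

	/*
	 * Eviction must never depend on vmas still existing in the mm,
	 * so the caller states its intent explicitly instead of the
	 * code guessing it from whether mmap_read_trylock() succeeds.
	 * 'evict' is a hypothetical flag, not in the quoted patch.
	 */
	if (ctx->evict)
		return drm_gpusvm_evict_to_sram(gpusvm, range);

	if (!ctx->mmap_locked) {
		if (!mmget_not_zero(mm))
			return -EFAULT;
		mmap_read_lock(mm);
	}

	mmap_assert_locked(mm);

	err = 0;	/* find_vma() loop and migrate_vma_setup() as in the patch */

	if (!ctx->mmap_locked) {
		mmap_read_unlock(mm);
		mmput(mm);
	}
	return err;
}
```

That keeps the mmap_lock-taking path exclusively for the CPU-fault-driven SRAM -> VRAM direction, where a vma is guaranteed to exist.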
> That way the only place the VMA is looked up for SRAM -> VRAM is upon
> CPU page fault.

Yeah, I think a dedicated path for the migrate_to_ram hook that goes
directly into your evict_to_sram path is the design-clean approach here
imo.

> [1] https://patchwork.freedesktop.org/patch/610955/?series=137870&rev=1#comment_1111164
>
> > Also in the migrate_to_ram case the vma is essentially nothing more
> > than informational about which ranges we might need if we prefault a
> > bit (in case the child changed the vma compared to the original one).
> > So it's good as a parameter for migrate_vma_setup, but absolutely
> > nothing else.
> >
> > amdkfd almost gets this right by being entirely based on their
> > svm_range structures, except they still have the lingering check that
> > the original mm is still alive. Of course you cannot ever use that
> > memory on the gpu anymore, but the child process could get very
> > pissed if their memory is suddenly gone. Also the eviction code has
> > the same issue as yours and limits itself to vmas that still exist in
> > the original mm, leaving anything that's orphaned in children or
> > remaps stuck in vram. At least that's my understanding, I might very
> > well be wrong.
> >
> > So we probably want a bunch of these testcases too, to make sure that
> > all works and we're not stuck with memory allocations in vram that we
> > can't move out.
>
> When writing some additional test cases, let me add hooks in my IGTs to
> be able to verify we are not orphaning VRAM too.

So maybe apply caution, I'm honestly not sure whether core mm makes any
guarantees about not orphaning stuff, at least for a little bit. Over
the w/e my brain tossed me the "so are we sure we can tear down our
zone_device data, the page array specifically" brain teaser. And I think
the answer is that we have to wait until all page references disappear,
which might take a long time.
Core mm makes no guarantee about elevated page references disappearing
in a timely manner, at least as far as I know. Which is also why
migration is a best effort thing only.

Cheers, Sima
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch