On Thu, Aug 29, 2024 at 04:49:15PM +0000, Matthew Brost wrote:
> On Wed, Aug 28, 2024 at 08:50:02PM +0200, Daniel Vetter wrote:
> > On Tue, Aug 27, 2024 at 07:48:38PM -0700, Matthew Brost wrote:
> > > +int drm_gpusvm_migrate_to_sram(struct drm_gpusvm *gpusvm,
> > > +			       struct drm_gpusvm_range *range,
> > > +			       const struct drm_gpusvm_ctx *ctx)
> > > +{
> > > +	u64 start = range->va.start, end = range->va.end;
> > > +	struct mm_struct *mm = gpusvm->mm;
> > > +	struct vm_area_struct *vas;
> > > +	int err;
> > > +	bool retry = false;
> > > +
> > > +	if (!ctx->mmap_locked) {
> > > +		if (!mmget_not_zero(mm)) {
> > > +			err = -EFAULT;
> > > +			goto err_out;
> > > +		}
> > > +		if (ctx->trylock_mmap) {
> > > +			if (!mmap_read_trylock(mm)) {
> > > +				err = drm_gpusvm_evict_to_sram(gpusvm, range);
> > > +				goto err_mmput;
> > > +			}
> > > +		} else {
> > > +			mmap_read_lock(mm);
> > > +		}
> > > +	}
> > > +
> > > +	mmap_assert_locked(mm);
> > > +
> > > +	/*
> > > +	 * Loop required to find all VMA area structs for the corner case when
> > > +	 * VRAM backing has been partially unmapped from MM's address space.
> > > +	 */
> > > +again:
> > > +	vas = find_vma(mm, start);
> >
> > So a hilarious case that amdkfd gets a bit better, but still not
> > entirely, is that the original vma might be entirely gone, even when
> > you can still get at the mm of that process. This happens with cow (or
> > shared too I think) mappings in forked child processes, or also if you
> > play fun mremap games.
> >
> > I think that outside of the ->migrate_to_ram callback,
> > migration/eviction to sram cannot assume there's any reasonable vma
> > around and has to unconditionally go with the drm_gpusvm_evict_to_sram
> > path.
>
> See my response here [1]. Let me drop the whole trylock thing and
> convert to an 'evict' flag which calls drm_gpusvm_evict_to_sram in
> places where Xe needs to evict VRAM. Or maybe just export that function
> and call it directly.
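To make that concrete, something like the below is how I'd read that proposal (untested sketch only: the `evict` flag in drm_gpusvm_ctx is made up here for illustration, drm_gpusvm_evict_to_sram() and the locking calls are from the quoted patch; the vma walk is elided):

```c
int drm_gpusvm_migrate_to_sram(struct drm_gpusvm *gpusvm,
			       struct drm_gpusvm_range *range,
			       const struct drm_gpusvm_ctx *ctx)
{
	struct mm_struct *mm = gpusvm->mm;
	int err;

	/*
	 * Eviction must never depend on vmas still existing in the mm,
	 * so the caller states its intent explicitly instead of the
	 * code guessing it from whether mmap_read_trylock() succeeds.
	 * 'evict' is a hypothetical flag, not in the quoted patch.
	 */
	if (ctx->evict)
		return drm_gpusvm_evict_to_sram(gpusvm, range);

	if (!ctx->mmap_locked) {
		if (!mmget_not_zero(mm))
			return -EFAULT;
		mmap_read_lock(mm);
	}

	mmap_assert_locked(mm);

	err = 0;	/* find_vma() loop and migrate_vma_setup() as in the patch */

	if (!ctx->mmap_locked) {
		mmap_read_unlock(mm);
		mmput(mm);
	}
	return err;
}
```

That keeps the mmap_lock-taking path exclusively for the CPU-fault-driven SRAM -> VRAM direction, where a vma is guaranteed to exist.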
> That way the only place the VMA is looked up for SRAM -> VRAM is upon
> CPU page fault.

Yeah, I think a dedicated path for the migrate_to_ram hook that goes
directly into your evict_to_sram path is the design-clean approach here
imo.

> [1] https://patchwork.freedesktop.org/patch/610955/?series=137870&rev=1#comment_1111164
>
> > Also in the migrate_to_ram case the vma is essentially nothing more
> > than informational about which ranges we might need if we prefault a
> > bit (in case the child changed the vma compared to the original one).
> > So it's good as a parameter for migrate_vma_setup, but absolutely
> > nothing else.
> >
> > amdkfd almost gets this right by being entirely based on their
> > svm_range structures, except they still have the lingering check that
> > the original mm is still alive. Of course you cannot ever use that
> > memory on the gpu anymore, but the child process could get very
> > pissed if their memory is suddenly gone. Also the eviction code has
> > the same issue as yours and limits itself to vmas that still exist in
> > the original mm, leaving anything that's orphaned in children or
> > remaps stuck in vram. At least that's my understanding, I might very
> > well be wrong.
> >
> > So we probably want a bunch of these testcases too, to make sure that
> > all works and we're not stuck with memory allocations in vram that we
> > can't move out.
>
> When writing some additional test cases, let me add hooks in my IGTs to
> be able to verify we are not orphaning VRAM too.

So maybe apply caution, I'm honestly not sure whether core mm makes any
guarantees about not orphaning stuff, at least for a little bit. Over
the w/e my brain tossed me the "so are we sure we can tear down our
zone_device data, the page array specifically" brain teaser. And I think
the answer is that we have to wait until all page references disappear,
which might take a long time.
Core mm makes no guarantee about elevated page references disappearing
in a timely manner, at least as far as I know. Which is also why
migration is a best effort thing only.

Cheers, Sima
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch