Re: [PATCH v2 03/37] drm/i915/region: support basic eviction

"Tang, CQ" <cq.tang@xxxxxxxxx> · Thu, 15 Aug 2019 16:35:38 +0000

> -----Original Message-----
> From: Daniel Vetter [mailto:daniel@xxxxxxxx]
> Sent: Thursday, August 15, 2019 9:21 AM
> To: Tang, CQ <cq.tang@xxxxxxxxx>
> Cc: Matthew Auld <matthew.william.auld@xxxxxxxxx>; Intel Graphics
> Development <intel-gfx@xxxxxxxxxxxxxxxxxxxxx>; Auld, Matthew
> <matthew.auld@xxxxxxxxx>
> Subject: Re:  [PATCH v2 03/37] drm/i915/region: support basic
> eviction
> 
> On Thu, Aug 15, 2019 at 4:58 PM Tang, CQ <cq.tang@xxxxxxxxx> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Intel-gfx [mailto:intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx] On
> > > Behalf Of Daniel Vetter
> > > Sent: Thursday, August 15, 2019 7:27 AM
> > > To: Matthew Auld <matthew.william.auld@xxxxxxxxx>
> > > Cc: Intel Graphics Development <intel-gfx@xxxxxxxxxxxxxxxxxxxxx>;
> > > Auld, Matthew <matthew.auld@xxxxxxxxx>
> > > Subject: Re:  [PATCH v2 03/37] drm/i915/region: support
> > > basic eviction
> > >
> > > On Thu, Aug 15, 2019 at 12:48 PM Matthew Auld
> > > <matthew.william.auld@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@xxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > > > > Support basic eviction for regions.
> > > > > >
> > > > > > Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> > > > > > Cc: Abdiel Janulgue <abdiel.janulgue@xxxxxxxxxxxxxxx>
> > > > >
> > > > > So from a very high level this looks like it was largely
> > > > > modelled after i915_gem_shrink.c and not i915_gem_evict.c (our
> > > > > other "make room, we're running out of stuff" code). Any specific
> reasons?
> > > >
> > > > IIRC I think it was originally based on the patches that exposed
> > > > stolen-memory to userspace from a few years ago.
> > > >
> > > > >
> > > > > I think i915_gem_evict is a lot closer match for what we want
> > > > > for vram (it started out to manage severely limitted GTT on
> > > > > gen2/3/4) after all. With the complication that we'll have to
> > > > > manage physical memory with multiple virtual mappings of it on
> > > > > top, so unfortunately we can't just reuse the locking patter
> > > > > Chris has come up with in his
> > > struct_mutex-removal branch.
> > > > > But at least conceptually it should be a lot closer.
> > > >
> > > > When you say make it more like i915_gem_evict, what does that mean?
> > > > Are you talking about the eviction roster stuff, or the
> > > > placement/locking of the eviction logic, with it being deep down
> > > > in get_pages?
> > >
> > > So there's kinda two aspects here that I meant.
> > >
> > > First is the high-level approach of the shrinker, which is a direct
> > > reflection of core mm low memory handling principles: Core mm just
> > > tries to equally shrink everyone when there's low memory, which is
> > > managed by watermarks, and a few other tricks. This is all only
> > > best-effort, and if multiple threads want a lot of memory at the
> > > same time then it's all going to fail with ENOMEM.
> > >
> > > On gpus otoh, and what we do in i915_gem_eviction.c for gtt (and
> > > very much needed with the tiny gtt for everything in gen2/3/4/5) is
> > > that when we run out of space, we stall, throw out everyone else,
> > > and have exclusive access to the entire gpu space. Then the next
> > > batchbuffer goes through the same dance. With this you guarantee
> > > that if you have a series of batchbuffers which all need e.g. 60% of
> > > lmem, they will all be able to execute. With the shrinker-style of
> > > low-memory handling eventually you're unlucky, both threads will only
> get up to 50%, fail with ENOSPC, and userspace crashes.
> > > Which is not good.
> > >
> > > The other bit is locking. Since we need to free pages from the
> > > shrinker there's tricky locking rules involved. Worse, we cannot
> > > back off from the shrinker down to e.g. the kmalloc or alloc_pages
> > > called that put us into reclaim. Which means the usual deadlock
> > > avoidance trick of having a slowpath where you first drop all the
> > > locks, then reacquire them in the right order doesn't work - in some
> > > cases the caller of kmalloc or alloc_pages could be holding a lock
> > > that we'd need to unlock first. Hence why the shrinker uses the best-
> effort-might-fail solution of trylocks, encoded in shrinker_lock.
> > >
> > > But for lmem we don't have such an excuse, because it's all our own code.
> > > The locking design can (and should!) assume that it can get out of
> > > any deadlock and always acquire all the locks it needs. Without that
> > > you can't achive the first part about guaranteeing execution of
> > > batches which collectively need more than 100% of lmem, but
> > > individually all fit. As an example if you look at the amdgpu
> > > command submission ioctl, that passes around ttm_operation_ctx which
> > > tracks a few things about locks and other bits, and if they hit a
> > > possible deadlock situation they can unwind the entire CS and restart by
> taking the locks in the right order.
> >
> > Thank you for the explanation.
> >
> > What does our 'struct_mutex' protect for exactly?  As example, I see when
> blitter engine functions are called, we hold 'struct_mutex" first.
> >
> > Can we replace 'struct_mutex' with some fine-grain locks so that we can
> lock obj->mm.lock first, and then lock these fine-grain locks?
> 
> Sure. With lots of efforts.
> 
> > I need some background info about 'struct_mutex' design.
> 
> There's not really a design behind it, it's just 10+ years of evolution.

Yes, in old days, a big coarse-grain lock was OK. Now with so many engines in new hardware for new computation workloads, we might need to do fine-grain locks

--CQ

> -Daniel
> 
> > --CQ
> >
> > >
> > > I thought I typed that up somewhere, but I guess it got lost ...
> > >
> > > Cheers, Daniel
> > >
> > > >
> > > > >
> > > > > But I might be entirely off the track with reconstructing how
> > > > > this code came to be, so please elaborate a bit.
> > > > >
> > > > > Thanks, Daniel
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx