Re: [PATCH v2 03/37] drm/i915/region: support basic eviction

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Thu, 15 Aug 2019 16:26:07 +0100

Quoting Matthew Auld (2019-08-15 11:48:04)
> On Tue, 30 Jul 2019 at 17:26, Daniel Vetter <daniel@xxxxxxxx> wrote:
> >
> > On Thu, Jun 27, 2019 at 09:55:59PM +0100, Matthew Auld wrote:
> > > Support basic eviction for regions.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> > > Cc: Abdiel Janulgue <abdiel.janulgue@xxxxxxxxxxxxxxx>
> >
> > So from a very high level this looks like it was largely modelled after
> > i915_gem_shrink.c and not i915_gem_evict.c (our other "make room, we're
> > running out of stuff" code). Any specific reasons?
> 
> IIRC I think it was originally based on the patches that exposed
> stolen-memory to userspace from a few years ago.
> 
> >
> > I think i915_gem_evict is a lot closer match for what we want for vram (it
> > started out to manage severely limitted GTT on gen2/3/4) after all. With
> > the complication that we'll have to manage physical memory with multiple
> > virtual mappings of it on top, so unfortunately we can't just reuse the
> > locking patter Chris has come up with in his struct_mutex-removal branch.
> > But at least conceptually it should be a lot closer.
> 
> When you say make it more like i915_gem_evict, what does that mean?
> Are you talking about the eviction roster stuff, or the
> placement/locking of the eviction logic, with it being deep down in
> get_pages?

The biggest difference would be the lack of region coalescing; the
eviction code only tries to free what would result in a successful
allocation. With the order being put into the scanner somewhat relevant,
in practice, fragmentation effects cause the range search to be somewhat
slow and we much prefer the random replacement -- while harmful, it is
not biased as to who it harms, and so is consistent overhead. However,
since you don't need to find a slot inside a small range within a few
million objects, I would expect LRU or even MRU (recently used objects
in games tend to be more ephemeral and so made good eviction targets, at
least according to John Carmack back in the day) to require fewer major
faults.
https://github.com/ESWAT/john-carmack-plan-archive/blob/master/by_day/johnc_plan_20000307.txt

You would need a very similar scanner to keep a journal of the potential
frees from which to track the coalescing (slightly more complicated due
to the disjoint nature of the buddy merges). One suspects that adding
the scanner would shape the buddy_nodes more towards drm_mm_nodes.

This is also a case where real world testing of a thrashing load beats
simulation.  So just make sure the eviction doesn't stall the entire GPU
and submission pipeline and you will be forgiven most transgressions.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx