Quoting Tvrtko Ursulin (2018-07-10 13:19:51)
> 
> On 09/07/2018 14:02, Chris Wilson wrote:
> > Our goal is to remove struct_mutex and replace it with fine grained
> > locking. One of the thorny issues is our eviction logic for reclaiming
> > space for an execbuffer (or GTT mmapping, among a few other examples).
> > While eviction itself is easy to move under a per-VM mutex, performing
> > the activity tracking is less agreeable. One solution is not to do any
> > MRU tracking and do a simple coarse evaluation during eviction of
> > active/inactive, with a loose temporal ordering of last
> > insertion/evaluation. That keeps all the locking constrained to when we
> > are manipulating the VM itself, neatly avoiding the tricky handling of
> > possible recursive locking during execbuf and elsewhere.
> > 
> > Note that discarding the MRU is unlikely to impact upon our efficiency
> > to reclaim VM space (where we think a LRU model is best) as our
> > current strategy is to use random idle replacement first before doing
> > a search, and over time the use of softpinned 48b per-ppGTT is growing
> > (thereby eliminating any need to perform any eviction searches, in
> > theory at least).
> 
> It's a big change to eviction logic but also mostly affects GGTT which
> is diminishing in significance. But we still probably need some real
> world mmap_gtt users to benchmark and general 3d pre-ppgtt.

mmap_gtt is not really significant here as we use random replacement
almost exclusively for them. Any workload that relies on thrashing is a
lost cause more or less; we cannot implement an optimal eviction
strategy (no foreknowledge) and even MRU is suggested by some of the
literature as not being a good strategy for graphics (the evidence in
games is that the most recently used surfaces in a frame are typically
use-once, whereas the older surfaces are typically used at the start of
each frame; it is those use-once surfaces that then distort the MRU into
making a bad prediction).

Fwiw, you will never get a demo with a >GTT working set that isn't a
slideshow, simply because we don't have the memory bw :-p (Those
machines have ~6GB/s main memory bw, a few fold higher while in the LLC
cache, but not enough to sustain ~180GB/s for the example, even ignoring
all the other ratelimiting steps.)

> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index 02b83a5ed96c..6a3608398d2a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -127,14 +127,10 @@ i915_gem_evict_something(struct i915_address_space *vm,
> >  	struct drm_i915_private *dev_priv = vm->i915;
> >  	struct drm_mm_scan scan;
> >  	struct list_head eviction_list;
> > -	struct list_head *phases[] = {
> > -		&vm->inactive_list,
> > -		&vm->active_list,
> > -		NULL,
> > -	}, **phase;
> >  	struct i915_vma *vma, *next;
> >  	struct drm_mm_node *node;
> >  	enum drm_mm_insert_mode mode;
> > +	struct i915_vma *active;
> >  	int ret;
> >  
> >  	lockdep_assert_held(&vm->i915->drm.struct_mutex);
> > @@ -170,17 +166,31 @@ i915_gem_evict_something(struct i915_address_space *vm,
> >  	 */
> >  	if (!(flags & PIN_NONBLOCK))
> >  		i915_retire_requests(dev_priv);
> > -	else
> > -		phases[1] = NULL;
> > 
> 
> There are comments around here referring to active/inactive lists which
> will need updating/removing.
> 
> >  search_again:
> > +	active = NULL;
> >  	INIT_LIST_HEAD(&eviction_list);
> > -	phase = phases;
> > -	do {
> > -		list_for_each_entry(vma, *phase, vm_link)
> > -			if (mark_free(&scan, vma, flags, &eviction_list))
> > -				goto found;
> > -	} while (*++phase);
> > +	list_for_each_entry_safe(vma, next, &vm->bound_list, vm_link) {
> > +		if (i915_vma_is_active(vma)) {
> > +			if (vma == active) {
> > +				if (flags & PIN_NONBLOCK)
> > +					break;
> > +
> > +				active = ERR_PTR(-EAGAIN);
> > +			}
> > +
> > +			if (active != ERR_PTR(-EAGAIN)) {
> > +				if (!active)
> > +					active = vma;
> > +
> > +				list_move_tail(&vma->vm_link, &vm->bound_list);
> > +				continue;
> > +			}
> > +		}
> > +
> > +		if (mark_free(&scan, vma, flags, &eviction_list))
> > +			goto found;
> > +	}
> 
> This loop needs a narrative to explain the process.

active/inactive doesn't cover it? The comments still make sense to me
explaining the code flow. There's only the conceptual juggle that the
two lists are combined into one.

> > @@ -996,10 +993,8 @@ int i915_vma_move_to_active(struct i915_vma *vma,
> >  	 * add the active reference first and queue for it to be dropped
> >  	 * *last*.
> >  	 */
> > -	if (!i915_gem_active_isset(active) && !vma->active_count++) {
> > -		list_move_tail(&vma->vm_link, &vma->vm->active_list);
> 
> Is it not workable to bump it to the head of the bound list here?

See the opening statement of intent: killing the list_move here is
something that appears on the profiles for everyone but is very rarely
used (and never for the softpinned cases).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
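
For anyone trying to follow the new loop, here is a rough userspace
sketch of its control flow. It only models the hunk quoted above and is
not the patch itself: struct vma, struct vm, pop_front()/push_back() and
the main() harness are invented for illustration, with first_active,
evict_active and the printf standing in for the patch's 'active' cursor,
its ERR_PTR(-EAGAIN) state and mark_free().

/*
 * Toy userspace model of the single-pass scan quoted above.  Nothing here
 * is i915 code: struct vma, struct vm and the list helpers are stand-ins
 * for illustration, and "evicting" just prints and frees a node.  The point
 * is the control flow: inactive vmas are reclaimed immediately, active vmas
 * are rotated to the tail until a full rotation proves only actives remain.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct vma {
	int id;
	bool active;		/* still in use by the GPU */
	struct vma *next;
};

struct vm {
	struct vma *head;	/* front of the bound list, scanned first */
	struct vma *tail;
};

static struct vma *pop_front(struct vm *vm)
{
	struct vma *vma = vm->head;
	if (vma) {
		vm->head = vma->next;
		if (!vm->head)
			vm->tail = NULL;
		vma->next = NULL;
	}
	return vma;
}

static void push_back(struct vm *vm, struct vma *vma)
{
	vma->next = NULL;
	if (vm->tail)
		vm->tail->next = vma;
	else
		vm->head = vma;
	vm->tail = vma;
}

/* Reclaim 'need' vmas, or as many as the nonblock rule allows. */
static void evict_something(struct vm *vm, int need, bool nonblock)
{
	struct vma *first_active = NULL;  /* the 'active' cursor in the patch */
	bool evict_active = false;        /* the ERR_PTR(-EAGAIN) state */
	struct vma *vma;

	while (need > 0 && (vma = pop_front(vm))) {
		if (vma->active) {
			if (vma == first_active) {
				/* Full rotation: only active vmas are left. */
				if (nonblock) {
					push_back(vm, vma);
					break;	/* caller will not wait */
				}
				evict_active = true;  /* actives are now fair game */
			}

			if (!evict_active) {
				if (!first_active)
					first_active = vma;  /* rotation start */
				push_back(vm, vma);  /* keep it, demote to the tail */
				continue;
			}
		}

		/* Stand-in for mark_free(): claim this vma's space. */
		printf("evict vma %d (%s)\n", vma->id,
		       vma->active ? "active, would wait" : "inactive");
		free(vma);
		need--;
	}
}

int main(void)
{
	struct vm vm = { NULL, NULL };
	bool gpu_busy[] = { true, false, true, true, false, true };
	struct vma *vma;

	for (int i = 0; i < 6; i++) {
		vma = calloc(1, sizeof(*vma));
		vma->id = i;
		vma->active = gpu_busy[i];
		push_back(&vm, vma);
	}

	/* Asks for 4: the two inactive vmas go first, then actives. */
	evict_something(&vm, 4, false);

	while ((vma = pop_front(&vm)))
		free(vma);
	return 0;
}

The effect is the loose temporal ordering described in the cover letter:
inactive vmas are reclaimed roughly in insertion order, active vmas are
demoted to the tail of the single bound list, and only once a full
rotation shows nothing inactive remains does the scan, if it is allowed
to block, start picking active vmas as well.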