To no surprise (since we've flip-flopped over the use of PIN_HIGH a few times), doing a search by address over a pathologically fragmented address space is exceeding slow. To protect ourselves from nearly unbounded latency (think searching a million holes while under struct_mutex), limit the search for the highest available hole and fallback to best-fit if it fails. In the pathologically fragmented case, such as igt/gem_ctx_thrash, the effect is dramatic, bringing the runtime down from hours to seconds (depending on how many other slow searches you hit, e.g. alloc_iova() and alloc_vmap_area() both degrade to a slow rbtree walk after their small cache is exhausted). For the real world, the number of search steps is unlikely to be significant as we should only need to search once per new context. Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> --- drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 996ab2ad6c45..fcf3a973925b 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -3988,7 +3988,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, mode = DRM_MM_INSERT_BEST; if (flags & PIN_HIGH) - mode = DRM_MM_INSERT_HIGH; + mode = DRM_MM_INSERT_HIGHEST; if (flags & PIN_MAPPABLE) mode = DRM_MM_INSERT_LOW; @@ -4008,6 +4008,15 @@ int i915_gem_gtt_insert(struct i915_address_space *vm, if (err != -ENOSPC) return err; + if (mode & DRM_MM_INSERT_ONCE) { + err = drm_mm_insert_node_in_range(&vm->mm, node, + size, alignment, color, + start, end, + DRM_MM_INSERT_BEST); + if (err != -ENOSPC) + return err; + } + if (flags & PIN_NOEVICT) return -ENOSPC; -- 2.17.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx