Re: [PATCH] drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)

Daniel Vetter <daniel@xxxxxxxx> · Wed, 16 Nov 2016 15:41:53 +0100

On Wed, Nov 16, 2016 at 08:58:30AM +0000, Chris Wilson wrote:
> Soft-pinning depends upon being able to check for availabilty of an
> interval and evict overlapping object from a drm_mm range manager very
> quickly. Currently it uses a linear list, and so performance is dire and
> not suitable as a general replacement. Worse, the current code will oops
> if it tries to evict an active buffer.
> 
> It also helps if the routine reports the correct error codes as expected
> by its callers and emits a tracepoint upon use.
> 
> For posterity since the wrong patch was pushed (i.e. that missed these
> key points and had known bugs), this is the changelog that should have
> been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for
> execbuffer"):
> 
> Userspace can pass in an offset that it presumes the object is located
> at. The kernel will then do its utmost to fit the object into that
> location. The assumption is that userspace is handling its own object
> locations (for example along with full-ppgtt) and that the kernel will
> rarely have to make space for the user's requests.
> 
> This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
> * if the user supplies a virtual address via the execobject->offset
>   *and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
>   that object is placed at that offset in the address space selected
>   by the context specifier in execbuffer.
> * the location must be aligned to the GTT page size, 4096 bytes
> * as the object is placed exactly as specified, it may be used by this
>   execbuffer call without relocations pointing to it
> 
> It may fail to do so if:
> * EINVAL is returned if the object does not have a 4096 byte aligned
>   address
> * the object conflicts with another pinned object (either pinned by
>   hardware in that address space, e.g. scanouts in the aliasing ppgtt)
>   or within the same batch.
>   EBUSY is returned if the location is pinned by hardware
>   EINVAL is returned if the location is already in use by the batch
> * EINVAL is returned if the object conflicts with its own alignment (as meets
>   the hardware requirements) or if the placement of the object does not fit
>   within the address space
> 
> All other execbuffer errors apply.
> 
> Presence of this execbuf extension may be queried by passing
> I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
> a reported value of 1 (or greater).
> 
> Fixes: 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer")
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/drm_mm.c                   |  9 ++++
>  drivers/gpu/drm/i915/gvt/aperture_gm.c     |  4 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  3 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      | 87 +++++++++++++++++++++---------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  3 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 10 ++--
>  drivers/gpu/drm/i915/i915_trace.h          | 23 ++++++++
>  drivers/gpu/drm/i915/i915_vma.c            |  8 +--
>  8 files changed, 105 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> index 632473beb40c..06f319e54dc1 100644
> --- a/drivers/gpu/drm/drm_mm.c
> +++ b/drivers/gpu/drm/drm_mm.c
> @@ -339,6 +339,15 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node)
>  	if (hole_start > node->start || hole_end < end)
>  		return -ENOSPC;
>  
> +	if (mm->color_adjust) {
> +		u64 adj_start = hole_start, adj_end = hole_end;
> +
> +		mm->color_adjust(hole, node->color, &adj_start, &adj_end);
> +		if (adj_start > node->start ||
> +		    adj_end < node->start + node->size)
> +			return -ENOSPC;
> +	}

Can't we move that up and combine with the earlier check like in other
places, i.e.

    	u64 adj_start = hole_start, adj_end = hole_end;

	if (mm->color_ajust)
		mm->color_adjust(hole, node->color, &adj_start, &adj_end);
    	if (adj_start > node->start ||
    	    adj_end < node->start + node->size)
    		return -ENOSPC;

gcc can then optimize this if it sees fit.

>  	node->mm = mm;
>  	node->allocated = 1;

And we should split out the drm_mm bugfix into a separate patch.

>  
> diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c
> index 0d41ebc4aea6..46e66feaef32 100644
> --- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
> +++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
> @@ -73,12 +73,12 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm)
>  	mutex_lock(&dev_priv->drm.struct_mutex);
>  search_again:
>  	ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm,
> -						  node, size, 4096, 0,
> +						  node, size, 4096, -1,
>  						  start, end, search_flag,
>  						  alloc_flag);
>  	if (ret) {
>  		ret = i915_gem_evict_something(&dev_priv->ggtt.base,
> -					       size, 4096, 0, start, end, 0);
> +					       size, 4096, -1, start, end, 0);

Bugfix for gvt coloring? Should we have a GVT_COLOR? Feels like separate
patch once more.

>  		if (ret == 0 && ++retried < 3)
>  			goto search_again;
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 006914ca36fe..fc3fedb99ef2 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3243,7 +3243,8 @@ int __must_check i915_gem_evict_something(struct i915_address_space *vm,
>  					  unsigned cache_level,
>  					  u64 start, u64 end,
>  					  unsigned flags);
> -int __must_check i915_gem_evict_for_vma(struct i915_vma *target);
> +int __must_check i915_gem_evict_for_vma(struct i915_vma *vma,
> +					unsigned int flags);
>  int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);
>  
>  /* belongs in i915_gem_gtt.h */
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index bd08814b015c..094ca96843a6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -212,45 +212,82 @@ i915_gem_evict_something(struct i915_address_space *vm,
>  	return ret;
>  }
>  
> -int
> -i915_gem_evict_for_vma(struct i915_vma *target)
> +int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)

Kerneldoc for this one here is missing.

>  {
> -	struct drm_mm_node *node, *next;
> +	struct list_head eviction_list;
> +	struct drm_mm_node *node;
> +	u64 start = target->node.start;
> +	u64 end = start + target->node.size - 1;
> +	struct i915_vma *vma, *next;
> +	bool check_snoop;
> +	int ret = 0;
>  
>  	lockdep_assert_held(&target->vm->dev->struct_mutex);
> +	trace_i915_gem_evict_vma(target, flags);
> +
> +	check_snoop = target->vm->mm.color_adjust;
> +	if (check_snoop) {
> +		if (start > target->vm->start)
> +			start -= 4096;
> +		if (end < target->vm->start + target->vm->total)
> +			end += 4096;
> +	}
>  
> -	list_for_each_entry_safe(node, next,
> -			&target->vm->mm.head_node.node_list,
> -			node_list) {
> -		struct i915_vma *vma;
> -		int ret;
> +	node = drm_mm_interval_first(&target->vm->mm, start, end);

drm_mm_interval_first is lacking some pretty kernel-doc, should be fixed!

> +	if (!node)
> +		return 0;
>  
> -		if (node->start + node->size <= target->node.start)
> -			continue;
> -		if (node->start >= target->node.start + target->node.size)
> +	INIT_LIST_HEAD(&eviction_list);
> +	vma = container_of(node, typeof(*vma), node);
> +	list_for_each_entry_from(vma,
> +				 &target->vm->mm.head_node.node_list,
> +				 node.node_list) {
> +		if (vma->node.start > end)
>  			break;

We have this nice drm_mm_interval_next helper, imo should use it here
instead of open-coding.
>  
> -		vma = container_of(node, typeof(*vma), node);
> +		if (check_snoop) {
> +			if (vma->node.start + vma->node.size == target->node.start) {
> +				if (vma->node.color == target->node.color)
> +					continue;
> +			}
> +			if (vma->node.start == target->node.start + target->node.size) {
> +				if (vma->node.color == target->node.color)
> +					continue;
> +			}
> +		}

Not feeling too happy with open-coding our color_adjust here.

>  
> -		if (i915_vma_is_pinned(vma)) {
> -			if (!vma->exec_entry || i915_vma_pin_count(vma) > 1)
> -				/* Object is pinned for some other use */
> -				return -EBUSY;
> +		if (vma->node.color == -1) {

The magic -1!

> +			ret = -ENOSPC;
> +			break;
> +		}
>  
> -			/* We need to evict a buffer in the same batch */
> -			if (vma->exec_entry->flags & EXEC_OBJECT_PINNED)
> -				/* Overlapping fixed objects in the same batch */
> -				return -EINVAL;
> +		if (flags & PIN_NONBLOCK &&
> +		    (i915_vma_is_pinned(vma) || i915_vma_is_active(vma))) {
> +			ret = -ENOSPC;
> +			break;
> +		}
>  
> -			return -ENOSPC;
> +		/* Overlap of objects in the same batch? */
> +		if (i915_vma_is_pinned(vma)) {
> +			ret = -ENOSPC;
> +			if (vma->exec_entry &&
> +			    vma->exec_entry->flags & EXEC_OBJECT_PINNED)
> +				ret = -EINVAL;
> +			break;
>  		}
>  
> -		ret = i915_vma_unbind(vma);
> -		if (ret)
> -			return ret;
> +		__i915_vma_pin(vma);
> +		list_add(&vma->exec_list, &eviction_list);
>  	}
>  
> -	return 0;
> +	list_for_each_entry_safe(vma, next, &eviction_list, exec_list) {
> +		list_del_init(&vma->exec_list);
> +		__i915_vma_unpin(vma);
> +		if (ret == 0)
> +			ret = i915_vma_unbind(vma);
> +	}
> +
> +	return ret;
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index e804cb2fa57e..1905e4419b1f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -274,6 +274,7 @@ static void eb_destroy(struct eb_vmas *eb)
>  				       exec_list);
>  		list_del_init(&vma->exec_list);
>  		i915_gem_execbuffer_unreserve_vma(vma);
> +		vma->exec_entry = NULL;
>  		i915_vma_put(vma);
>  	}
>  	kfree(eb);
> @@ -437,7 +438,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
>  			memset(&cache->node, 0, sizeof(cache->node));
>  			ret = drm_mm_insert_node_in_range_generic
>  				(&ggtt->base.mm, &cache->node,
> -				 4096, 0, 0,
> +				 4096, 0, -1,

More magic color, please paint in some nice bikeshed ;-)

>  				 0, ggtt->mappable_end,
>  				 DRM_MM_SEARCH_DEFAULT,
>  				 DRM_MM_CREATE_DEFAULT);
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 01f238adfb67..bae4e9d8f682 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2072,16 +2072,14 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
>  		return ret;
>  
>  alloc:
> -	ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm,
> -						  &ppgtt->node, GEN6_PD_SIZE,
> -						  GEN6_PD_ALIGN, 0,
> -						  0, ggtt->base.total,
> +	ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm, &ppgtt->node,
> +						  GEN6_PD_SIZE, GEN6_PD_ALIGN,
> +						  -1, 0, ggtt->base.total,

Yet another magic -1 color. Definitely want a define for this ...

>  						  DRM_MM_TOPDOWN);
>  	if (ret == -ENOSPC && !retried) {
>  		ret = i915_gem_evict_something(&ggtt->base,
>  					       GEN6_PD_SIZE, GEN6_PD_ALIGN,
> -					       I915_CACHE_NONE,
> -					       0, ggtt->base.total,
> +					       -1, 0, ggtt->base.total,
>  					       0);
>  		if (ret)
>  			goto err_out;
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index c5d210ebaa9a..2ed60ed70fe1 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -450,6 +450,29 @@ TRACE_EVENT(i915_gem_evict_vm,
>  	    TP_printk("dev=%d, vm=%p", __entry->dev, __entry->vm)
>  );
>  
> +TRACE_EVENT(i915_gem_evict_vma,
> +	    TP_PROTO(struct i915_vma *vma, unsigned int flags),
> +	    TP_ARGS(vma, flags),
> +
> +	    TP_STRUCT__entry(
> +			     __field(u32, dev)
> +			     __field(struct i915_address_space *, vm)
> +			     __field(u64, start)
> +			     __field(u64, size)
> +			     __field(unsigned int, flags)
> +			    ),
> +
> +	    TP_fast_assign(
> +			   __entry->dev = vma->vm->dev->primary->index;
> +			   __entry->vm = vma->vm;
> +			   __entry->start = vma->node.start;
> +			   __entry->size = vma->node.size;
> +			   __entry->flags = flags;
> +			  ),
> +
> +	    TP_printk("dev=%d, vm=%p, start=%llx size=%llx, flags=%x", __entry->dev, __entry->vm, (long long)__entry->start, (long long)__entry->size, __entry->flags)
> +);
> +
>  TRACE_EVENT(i915_gem_ring_sync_to,
>  	    TP_PROTO(struct drm_i915_gem_request *to,
>  		     struct drm_i915_gem_request *from),
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 738ff3a5cd6e..827eaeb75524 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -325,12 +325,6 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma,
>  	if (vma->vm->mm.color_adjust == NULL)
>  		return true;
>  
> -	if (!drm_mm_node_allocated(gtt_space))
> -		return true;
> -
> -	if (list_empty(&gtt_space->node_list))
> -		return true;

Not clear to me why we need this: set_cache_level should never see
pre-filled vmas, and vma_insert only has this check in the success case
where the same should hold. Why remove these 2 cases? This seems to change
set_cache_level behaviour ...

Otherwise looks all good, but needs some splitting&polish imo.
-Daniel

> -
>  	other = list_entry(gtt_space->node_list.prev, struct drm_mm_node, node_list);
>  	if (other->allocated && !other->hole_follows && other->color != cache_level)
>  		return false;
> @@ -413,7 +407,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
>  		vma->node.color = obj->cache_level;
>  		ret = drm_mm_reserve_node(&vma->vm->mm, &vma->node);
>  		if (ret) {
> -			ret = i915_gem_evict_for_vma(vma);
> +			ret = i915_gem_evict_for_vma(vma, flags);
>  			if (ret == 0)
>  				ret = drm_mm_reserve_node(&vma->vm->mm, &vma->node);
>  			if (ret)
> -- 
> 2.10.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx