On Wed, Nov 16, 2016 at 02:55:59PM +0000, Chris Wilson wrote: > On Wed, Nov 16, 2016 at 03:41:53PM +0100, Daniel Vetter wrote: > > On Wed, Nov 16, 2016 at 08:58:30AM +0000, Chris Wilson wrote: > > > Soft-pinning depends upon being able to check for availabilty of an > > > interval and evict overlapping object from a drm_mm range manager very > > > quickly. Currently it uses a linear list, and so performance is dire and > > > not suitable as a general replacement. Worse, the current code will oops > > > if it tries to evict an active buffer. > > > > > > It also helps if the routine reports the correct error codes as expected > > > by its callers and emits a tracepoint upon use. > > > > > > For posterity since the wrong patch was pushed (i.e. that missed these > > > key points and had known bugs), this is the changelog that should have > > > been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for > > > execbuffer"): > > > > > > Userspace can pass in an offset that it presumes the object is located > > > at. The kernel will then do its utmost to fit the object into that > > > location. The assumption is that userspace is handling its own object > > > locations (for example along with full-ppgtt) and that the kernel will > > > rarely have to make space for the user's requests. > > > > > > This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following: > > > * if the user supplies a virtual address via the execobject->offset > > > *and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then > > > that object is placed at that offset in the address space selected > > > by the context specifier in execbuffer. > > > * the location must be aligned to the GTT page size, 4096 bytes > > > * as the object is placed exactly as specified, it may be used by this > > > execbuffer call without relocations pointing to it > > > > > > It may fail to do so if: > > > * EINVAL is returned if the object does not have a 4096 byte aligned > > > address > > > * the object conflicts with another pinned object (either pinned by > > > hardware in that address space, e.g. scanouts in the aliasing ppgtt) > > > or within the same batch. > > > EBUSY is returned if the location is pinned by hardware > > > EINVAL is returned if the location is already in use by the batch > > > * EINVAL is returned if the object conflicts with its own alignment (as meets > > > the hardware requirements) or if the placement of the object does not fit > > > within the address space > > > > > > All other execbuffer errors apply. > > > > > > Presence of this execbuf extension may be queried by passing > > > I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for > > > a reported value of 1 (or greater). > > > > > > Fixes: 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer") > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > --- > > > drivers/gpu/drm/drm_mm.c | 9 ++++ > > > drivers/gpu/drm/i915/gvt/aperture_gm.c | 4 +- > > > drivers/gpu/drm/i915/i915_drv.h | 3 +- > > > drivers/gpu/drm/i915/i915_gem_evict.c | 87 +++++++++++++++++++++--------- > > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +- > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ++-- > > > drivers/gpu/drm/i915/i915_trace.h | 23 ++++++++ > > > drivers/gpu/drm/i915/i915_vma.c | 8 +-- > > > 8 files changed, 105 insertions(+), 42 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c > > > index 632473beb40c..06f319e54dc1 100644 > > > --- a/drivers/gpu/drm/drm_mm.c > > > +++ b/drivers/gpu/drm/drm_mm.c > > > @@ -339,6 +339,15 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node) > > > if (hole_start > node->start || hole_end < end) > > > return -ENOSPC; > > > > > > + if (mm->color_adjust) { > > > + u64 adj_start = hole_start, adj_end = hole_end; > > > + > > > + mm->color_adjust(hole, node->color, &adj_start, &adj_end); > > > + if (adj_start > node->start || > > > + adj_end < node->start + node->size) > > > + return -ENOSPC; > > > + } > > > > Can't we move that up and combine with the earlier check like in other > > places, i.e. > > Unlike the other places we don't use adj_start, adj_end. I meant the duplication of the if (does_it_fit) return -ENOSPC; snippet, and imo unifying that makes the code more readable irrespective of whether we need adj_start/end later on. And would be more consistent with other places in drm_mm.c. > > > --- a/drivers/gpu/drm/i915/gvt/aperture_gm.c > > > +++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c > > > @@ -73,12 +73,12 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm) > > > mutex_lock(&dev_priv->drm.struct_mutex); > > > search_again: > > > ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm, > > > - node, size, 4096, 0, > > > + node, size, 4096, -1, > > > start, end, search_flag, > > > alloc_flag); > > > if (ret) { > > > ret = i915_gem_evict_something(&dev_priv->ggtt.base, > > > - size, 4096, 0, start, end, 0); > > > + size, 4096, -1, start, end, 0); > > > > Bugfix for gvt coloring? Should we have a GVT_COLOR? Feels like separate > > patch once more. > > No, copy paste of broken code. The pattern was broken by evict_for_vma. > > > > -int > > > -i915_gem_evict_for_vma(struct i915_vma *target) > > > +int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags) > > > > Kerneldoc for this one here is missing. > > There should be exactly one caller. It is not an interface that should > be used. Same holds for others, I just think it's a good place to put some documentation for how this is supposed to work. At least personally I do believe a bit more prose does make the code more approachable, but if the i915 gem gang deems them useless we can forgo them too. > > > { > > > - struct drm_mm_node *node, *next; > > > + struct list_head eviction_list; > > > + struct drm_mm_node *node; > > > + u64 start = target->node.start; > > > + u64 end = start + target->node.size - 1; > > > + struct i915_vma *vma, *next; > > > + bool check_snoop; > > > + int ret = 0; > > > > > > lockdep_assert_held(&target->vm->dev->struct_mutex); > > > + trace_i915_gem_evict_vma(target, flags); > > > + > > > + check_snoop = target->vm->mm.color_adjust; > > > + if (check_snoop) { > > > + if (start > target->vm->start) > > > + start -= 4096; > > > + if (end < target->vm->start + target->vm->total) > > > + end += 4096; > > > + } > > > > > > - list_for_each_entry_safe(node, next, > > > - &target->vm->mm.head_node.node_list, > > > - node_list) { > > > - struct i915_vma *vma; > > > - int ret; > > > + node = drm_mm_interval_first(&target->vm->mm, start, end); > > > > drm_mm_interval_first is lacking some pretty kernel-doc, should be fixed! > > > > > + if (!node) > > > + return 0; > > > > > > - if (node->start + node->size <= target->node.start) > > > - continue; > > > - if (node->start >= target->node.start + target->node.size) > > > + INIT_LIST_HEAD(&eviction_list); > > > + vma = container_of(node, typeof(*vma), node); > > > + list_for_each_entry_from(vma, > > > + &target->vm->mm.head_node.node_list, > > > + node.node_list) { > > > + if (vma->node.start > end) > > > break; > > > > We have this nice drm_mm_interval_next helper, imo should use it here > > instead of open-coding. > > It's not the same. This is a linear walk which is fewer pointer dances. Shouldn't we fix up drm_mm_interval_next then to be linear? drm_mm never has overlapping intervals, so the full r-b search is overkill. > > > - vma = container_of(node, typeof(*vma), node); > > > + if (check_snoop) { > > > + if (vma->node.start + vma->node.size == target->node.start) { > > > + if (vma->node.color == target->node.color) > > > + continue; > > > + } > > > + if (vma->node.start == target->node.start + target->node.size) { > > > + if (vma->node.color == target->node.color) > > > + continue; > > > + } > > > + } > > > > Not feeling too happy with open-coding our color_adjust here. > > Good, because it doesn't. Ok, what does it do then? It looked like it makes sure the guard page is there if needed, which is what our color_adjust does too. > > > - if (i915_vma_is_pinned(vma)) { > > > - if (!vma->exec_entry || i915_vma_pin_count(vma) > 1) > > > - /* Object is pinned for some other use */ > > > - return -EBUSY; > > > + if (vma->node.color == -1) { > > > > The magic -1! > > And now you see why it is absolutely required. And not any more magic > than any other color value. #define MAGIC -1 For a suitable name of MAGIC is what I meant of course ;-) > > > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c > > > index 738ff3a5cd6e..827eaeb75524 100644 > > > --- a/drivers/gpu/drm/i915/i915_vma.c > > > +++ b/drivers/gpu/drm/i915/i915_vma.c > > > @@ -325,12 +325,6 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, > > > if (vma->vm->mm.color_adjust == NULL) > > > return true; > > > > > > - if (!drm_mm_node_allocated(gtt_space)) > > > - return true; > > > - > > > - if (list_empty(>t_space->node_list)) > > > - return true; > > > > Not clear to me why we need this: set_cache_level should never see > > pre-filled vmas, and vma_insert only has this check in the success case > > where the same should hold. Why remove these 2 cases? This seems to change > > set_cache_level behaviour ... > > Because they're irrelevant. We only get called when bound and the second > is not our bug. > > There's no change in caller behaviour. Yeah, but then the change seems unrelated to the bugfix here and imo should be split out. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx