On Fri, Aug 05, 2016 at 09:08:34PM +0200, Daniel Vetter wrote:
> On Fri, Aug 05, 2016 at 10:14:18AM +0100, Chris Wilson wrote:
> > By applying the same logic as for wait-ioctl, we can query whether a
> > request has completed without holding struct_mutex. The biggest impact
> > system-wide is removing the flush_active and the contention that causes.
> > 
> > Testcase: igt/gem_busy
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Akash Goel <akash.goel@xxxxxxxxx>
> > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 131 +++++++++++++++++++++++++++++++---------
> >  1 file changed, 101 insertions(+), 30 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index ceb00970b2da..b99d64bfb7eb 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3736,49 +3736,120 @@ i915_gem_object_ggtt_unpin_view(struct drm_i915_gem_object *obj,
> >  	i915_vma_unpin(i915_gem_obj_to_ggtt_view(obj, view));
> >  }
> >  
> > +static __always_inline unsigned __busy_read_flag(unsigned int id)
> > +{
> > +	/* Note that we could alias engines in the execbuf API, but
> > +	 * that would be very unwise as it prevents userspace from
> > +	 * fine control over engine selection. Ahem.
> > +	 *
> > +	 * This should be something like EXEC_MAX_ENGINE instead of
> > +	 * I915_NUM_ENGINES.
> > +	 */
> > +	BUILD_BUG_ON(I915_NUM_ENGINES > 16);
> > +	return 0x10000 << id;
> > +}
> > +
> > +static __always_inline unsigned int __busy_write_id(unsigned int id)
> > +{
> > +	return id;
> > +}
> > +
> > +static __always_inline unsigned
> > +__busy_set_if_active(const struct i915_gem_active *active,
> > +		     unsigned int (*flag)(unsigned int id))
> > +{
> > +	/* For more discussion about the barriers and locking concerns,
> > +	 * see __i915_gem_active_get_rcu().
> > +	 */
> > +	do {
> > +		struct drm_i915_gem_request *request;
> > +		unsigned int id;
> > +
> > +		request = rcu_dereference(active->request);
> > +		if (!request || i915_gem_request_completed(request))
> > +			return 0;
> > +
> > +		id = request->engine->exec_id;
> > +
> > +		/* Check that the pointer wasn't reassigned and overwritten. */
> 
> cf. our discussion in active_get_rcu - there's no fence_get_rcu in sight
> anywhere here, hence this needs an smp_rmb().

I toyed with smp_rmb(). The rcu_dereference() followed by
rcu_access_pointer() is ordered. So I was back to dancing around "are the
dependent reads ordered by the first rcu_dereference() ordered in front of
the second access, which was itself ordered after the first?" I probably
should have stuck in the smp_rmb() and stopped worrying - it is still
going to be cheaper than the refcount traffic.

> Also nitpick: The two rcu_dereference(active->request) feel a bit silly.
> If we move the first in front of the loop, and update the local request
> pointer (using a tmp) it would look tidier, and we could even move the
> loop termination condition into the while () check (and move the return
> flag(id) at the end of the function).

I was quite content with only having to think of one phase through the
loop and not worry about state being carried forward.

__busy_set_if_active(const struct i915_gem_active *active,
		     unsigned int (*flag)(unsigned int id))
{
+	struct drm_i915_gem_request *request;
+	unsigned int id;
+
	/* For more discussion about the barriers and locking concerns,
	 * see __i915_gem_active_get_rcu().
	 */
+	request = rcu_dereference(active->request);
	do {
-		struct drm_i915_gem_request *request;
-		unsigned int id;
+		struct drm_i915_gem_request *tmp;

-		request = rcu_dereference(active->request);
		if (!request || i915_gem_request_completed(request))
			return 0;

		id = request->engine->exec_id;

		/* Check that the pointer wasn't reassigned and overwritten.
		 */
-		if (request == rcu_access_pointer(active->request))
-			return flag(id);
+		tmp = rcu_dereference(active->request);
+		if (tmp == request)
+			break;
+
+		request = tmp;
	} while (1);
+
+	return flag(id);
}

is also not as well optimised by gcc, apparently.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx